Remote Access Transactions of the American Mathematical Society
Green Open Access

Transactions of the American Mathematical Society

ISSN 1088-6850(online) ISSN 0002-9947(print)



Phase transitions in phylogeny

Author: Elchanan Mossel
Journal: Trans. Amer. Math. Soc. 356 (2004), 2379-2404
MSC (2000): Primary 60K35, 92D15; Secondary 60J85, 82B26
Published electronically: October 28, 2003
MathSciNet review: 2048522
Full-text PDF

Abstract | References | Similar Articles | Additional Information

Abstract: We apply the theory of Markov random fields on trees to derive a phase transition in the number of samples needed in order to reconstruct phylogenies.

We consider the Cavender-Farris-Neyman model of evolution on trees, where all the inner nodes have degree at least $3$, and the net transition on each edge is bounded by $\epsilon$. Motivated by a conjecture by M. Steel, we show that if $2 (1 - 2 \epsilon)^2 > 1$, then for balanced trees, the topology of the underlying tree, having $n$ leaves, can be reconstructed from $O(\log n)$ samples (characters) at the leaves. On the other hand, we show that if $2 (1 - 2 \epsilon)^2 < 1$, then there exist topologies which require at least $n^{\Omega(1)}$ samples for reconstruction.

Our results are the first rigorous results to establish the role of phase transitions for Markov random fields on trees, as studied in probability, statistical physics and information theory, for the study of phylogenies in mathematical biology.

References [Enhancements On Off] (What's this?)

  • 1. N. Alon and J. H. Spencer (2000) The probabilistic method, second edition. With an appendix by P. Erdös, John Wiley and Sons. MR 2003f:60003
  • 2. A. Ambainis, R. Depser, M. Farach-Colton and S. Kannan (1999) Tight bounds on learnability of evolution, preprint.
  • 3. S. Bezrukov (1994) Isoperimetric problems in discrete spaces, in Extremal Problems for Finite Sets., Bolyai Soc. Math. Stud. 3, 59-91, P. Frankl, Z. Fredi, G. Katona, D. Miklos eds. MR 96c:05181
  • 4. P. M. Bleher, J. Ruiz and V. A. Zagrebnov (1995) On the purity of limiting Gibbs state for the Ising model on the Bethe lattice, J. Stat. Phys 79, 473-482. MR 96d:82009
  • 5. J. A. Cavender (1978) Taxonomy with confidence, Math. Biosci., 40, 271-280. MR 58:20548
  • 6. T. M. Cover and J. A. Thomas (1991) Elements of Information Theory, John Wiley and Sons. MR 92g:94001
  • 7. M. Cryan., L. A. Goldberg and P. W. Goldberg (1998) ``Evolutionary Trees can be Learned in Polynomial Time in the Two-State General Markov Model"; SIAM Journal on Computing, 31 (2001), 375-397. MR 2002h:68090
  • 8. P. L. Erdös, M. A. Steel, L.A. Székely and T. Warnow (1999) A few logs suffice to build (almost) all trees (Part 1). RSA, 14(2), 153-184. MR 2000b:92003
  • 9. P. L. Erdös, M. A. Steel, L.A. Székely and T. Warnow (1999) A few logs suffice to build (almost) all trees (Part 2), Theor. Comput. Sci., 221, 77-118. MR 2000k:92015
  • 10. W. Evans, C. Kenyon, Y. Peres and L. J. Schulman (2000) Broadcasting on trees and the Ising Model, Ann. Appl. Prob., 10, 410-433. MR 2001g:60243
  • 11. J. S. Farris (1973) A probability model for inferring evolutionary trees, Syst. Zool., 22, 250-256.
  • 12. P. Frankl and Z. Fiiredi (1981) A short proof for a theorem of Harper about Hamming spheres, Discrete Math. 34, 311-313. MR 83a:05004
  • 13. H. O. Georgii, (1988) Gibbs measures and phase transitions, de Gruyter Studies in Mathematics, 9. Walter de Gruyter and Co., Berlin. MR 89k:82010
  • 14. G. Grimmett (1999) Percolation, Second edition. Springer-Verlag, Berlin. MR 2001a:60114
  • 15. L. H. Harper (1966) Optimal numberings and isoperimetric problems on graphs. J. Combinatorial Theory 1, 385-393. MR 34:91
  • 16. Y. Higuchi (1977). Remarks on the limiting Gibbs state on a $(d+1)$-tree. Publ. RIMS Kyoto Univ. 13, 335-348. MR 58:32697
  • 17. D. Ioffe (1996). A note on the extremality of the disordered state for the Ising model on the Bethe lattice. Lett. Math. Phys. 37, 137-143. MR 97e:82004
  • 18. S. Janson and E. Mossel (2003). Robust reconstruction on trees is determined by the second eigenvalue, submitted.
  • 19. C. Kenyon, E. Mossel and Y. Peres (2001). Glauber dynamics on trees and hyperbolic graphs, 42nd IEEE Sympos. Foundations of Computer Science (Las Vegas, 2001), pp. 568-578. MR 2003h:68005.
  • 20. H. Kesten and B. P. Stigum (1966) Additional limit theorem for indecomposable multidimensional Galton-Watson processes, Ann. Math. Statist. 37, 1463-1481. MR 34:864
  • 21. T. M. Liggett (1985) Interacting particle systems. Fundamental Principles of Mathematical Sciences, 276. Springer-Verlag, New York. MR 86e:60089
  • 22. R. Lyons (1989) The Ising model and percolation on trees and tree-like graphs. Commun. Math. Phys. 125,337-353. MR 90h:82046
  • 23. E. Mossel (1998) Recursive reconstruction on periodic trees, Random Structures and Algorithms 13, 81-97. MR 99h:05106
  • 24. E. Mossel (2001) Reconstruction on trees: Beating the second eigenvalue, Ann. Appl. Probab., 11, 285-300. MR 2003d:90010
  • 25. E. Mossel (2003) On the impossibility of reconstructing ancestral data and phylogenetic trees, Jour. Comput. Biol., to appear.
  • 26. E. Mossel and Y. Peres (2003) Information flow on trees, Ann. Appl. Probab., to appear.
  • 27. E. Mossel and M. A. Steel (2003) A phase transition for a random cluster model on phylogenetic trees, submitted.
  • 28. J. Neyman (1971) Molecular studies of evolution: a source of novel statistical problems. In Statistical decision theory and related topics, S.S Gupta and J. Yackel (eds), Academic Press, New York, 1-27. MR 48:5663
  • 29. Y. Peres (1999) Probability on trees: an introductory climb, in Lectures on probability theory and statistics (Saint-Flour, 1997), 193-280, Lecture Notes in Math., 1717, Springer, Berlin. MR 2001c:60139
  • 30. F. Spitzer (1975) Markov random fields on an infinite tree, Ann. Probab. 3, 387-394. MR 51:14321
  • 31. M. A. Steel (1994) Recovering a tree from the leaf colourations it generates under a Markov model, App. Math. Lett., 7(2), 19-24.
  • 32. M.A. Steel, L.A. Székely and M. D. Hendy (1994). Reconstructing trees when sequence sites evolve at variable rates, Jour. Comput. Biol., 1, 153-163.
  • 33. M. A. Steel (2001), My favorite conjecture,$\sim$mathmas/ conjecture.pdf.

Similar Articles

Retrieve articles in Transactions of the American Mathematical Society with MSC (2000): 60K35, 92D15, 60J85, 82B26

Retrieve articles in all journals with MSC (2000): 60K35, 92D15, 60J85, 82B26

Additional Information

Elchanan Mossel
Affiliation: Department of Statistics, Evans Hall, University of California, Berkeley, California 94720-3860

Keywords: Phylogeny, phase transition, Ising model
Received by editor(s): May 21, 2002
Received by editor(s) in revised form: April 10, 2003
Published electronically: October 28, 2003
Additional Notes: The author was supported by a Miller Fellowship. Most of the research reported here was conducted while the author was a PostDoc in theory group, Microsoft Research.
Article copyright: © Copyright 2003 American Mathematical Society

American Mathematical Society