Phase transitions in phylogeny

Author:
Elchanan Mossel

Journal:
Trans. Amer. Math. Soc. **356** (2004), 2379-2404

MSC (2000):
Primary 60K35, 92D15; Secondary 60J85, 82B26

DOI:
https://doi.org/10.1090/S0002-9947-03-03382-8

Published electronically:
October 28, 2003

MathSciNet review:
2048522

Full-text PDF

Abstract | References | Similar Articles | Additional Information

Abstract: We apply the theory of Markov random fields on trees to derive a phase transition in the number of samples needed in order to reconstruct phylogenies.

We consider the Cavender-Farris-Neyman model of evolution on trees, where all the inner nodes have degree at least , and the net transition on each edge is bounded by . Motivated by a conjecture by M. Steel, we show that if , then for balanced trees, the topology of the underlying tree, having leaves, can be reconstructed from samples (characters) at the leaves. On the other hand, we show that if , then there exist topologies which require at least samples for reconstruction.

Our results are the first rigorous results to establish the role of phase transitions for Markov random fields on trees, as studied in probability, statistical physics and information theory, for the study of phylogenies in mathematical biology.

**1.**N. Alon and J. H. Spencer (2000)*The probabilistic method*, second edition. With an appendix by P. Erdös, John Wiley and Sons. MR**2003f:60003****2.**A. Ambainis, R. Depser, M. Farach-Colton and S. Kannan (1999) Tight bounds on learnability of evolution, preprint.**3.**S. Bezrukov (1994) Isoperimetric problems in discrete spaces, in*Extremal Problems for Finite Sets.*, Bolyai Soc. Math. Stud.**3**, 59-91, P. Frankl, Z. Fredi, G. Katona, D. Miklos eds. MR**96c:05181****4.**P. M. Bleher, J. Ruiz and V. A. Zagrebnov (1995) On the purity of limiting Gibbs state for the Ising model on the Bethe lattice,*J. Stat. Phys***79**, 473-482. MR**96d:82009****5.**J. A. Cavender (1978) Taxonomy with confidence,*Math. Biosci.*,**40**, 271-280. MR**58:20548****6.**T. M. Cover and J. A. Thomas (1991)*Elements of Information Theory*, John Wiley and Sons. MR**92g:94001****7.**M. Cryan., L. A. Goldberg and P. W. Goldberg (1998) ``Evolutionary Trees can be Learned in Polynomial Time in the Two-State General Markov Model";*SIAM Journal on Computing*,**31**(2001), 375-397. MR**2002h:68090****8.**P. L. Erdös, M. A. Steel, L.A. Székely and T. Warnow (1999) A few logs suffice to build (almost) all trees (Part 1).*RSA*,**14(2)**, 153-184. MR**2000b:92003****9.**P. L. Erdös, M. A. Steel, L.A. Székely and T. Warnow (1999) A few logs suffice to build (almost) all trees (Part 2),*Theor. Comput. Sci.*,**221**, 77-118. MR**2000k:92015****10.**W. Evans, C. Kenyon, Y. Peres and L. J. Schulman (2000) Broadcasting on trees and the Ising Model,*Ann. Appl. Prob.*,**10**, 410-433. MR**2001g:60243****11.**J. S. Farris (1973) A probability model for inferring evolutionary trees,*Syst. Zool.*,**22**, 250-256.**12.**P. Frankl and Z. Fiiredi (1981) A short proof for a theorem of Harper about Hamming spheres,*Discrete Math.***34**, 311-313. MR**83a:05004****13.**H. O. Georgii, (1988)*Gibbs measures and phase transitions*, de Gruyter Studies in Mathematics, 9. Walter de Gruyter and Co., Berlin. MR**89k:82010****14.**G. Grimmett (1999)*Percolation*, Second edition. Springer-Verlag, Berlin. MR**2001a:60114****15.**L. H. Harper (1966) Optimal numberings and isoperimetric problems on graphs.*J. Combinatorial Theory***1**, 385-393. MR**34:91****16.**Y. Higuchi (1977). Remarks on the limiting Gibbs state on a -tree.*Publ. RIMS Kyoto Univ.***13**, 335-348. MR**58:32697****17.**D. Ioffe (1996). A note on the extremality of the disordered state for the Ising model on the Bethe lattice.*Lett. Math. Phys.***37**, 137-143. MR**97e:82004****18.**S. Janson and E. Mossel (2003). Robust reconstruction on trees is determined by the second eigenvalue, submitted.**19.**C. Kenyon, E. Mossel and Y. Peres (2001). Glauber dynamics on trees and hyperbolic graphs, 42nd IEEE Sympos. Foundations of Computer Science (Las Vegas, 2001), pp. 568-578. MR**2003h:68005**.**20.**H. Kesten and B. P. Stigum (1966) Additional limit theorem for indecomposable multidimensional Galton-Watson processes,*Ann. Math. Statist.***37**, 1463-1481. MR**34:864****21.**T. M. Liggett (1985)*Interacting particle systems.**Fundamental Principles of Mathematical Sciences*, 276. Springer-Verlag, New York. MR**86e:60089****22.**R. Lyons (1989) The Ising model and percolation on trees and tree-like graphs.*Commun. Math. Phys.***125**,337-353. MR**90h:82046****23.**E. Mossel (1998) Recursive reconstruction on periodic trees,*Random Structures and Algorithms***13**, 81-97. MR**99h:05106****24.**E. Mossel (2001) Reconstruction on trees: Beating the second eigenvalue,*Ann. Appl. Probab.*,**11**, 285-300. MR**2003d:90010****25.**E. Mossel (2003) On the impossibility of reconstructing ancestral data and phylogenetic trees,*Jour. Comput. Biol.*, to appear.**26.**E. Mossel and Y. Peres (2003) Information flow on trees,*Ann. Appl. Probab.*, to appear.**27.**E. Mossel and M. A. Steel (2003) A phase transition for a random cluster model on phylogenetic trees, submitted.**28.**J. Neyman (1971) Molecular studies of evolution: a source of novel statistical problems. In*Statistical decision theory and related topics*, S.S Gupta and J. Yackel (eds), Academic Press, New York, 1-27. MR**48:5663****29.**Y. Peres (1999) Probability on trees: an introductory climb, in Lectures on probability theory and statistics (Saint-Flour, 1997), 193-280,*Lecture Notes in Math.*,**1717**, Springer, Berlin. MR**2001c:60139****30.**F. Spitzer (1975) Markov random fields on an infinite tree,*Ann. Probab.***3**, 387-394. MR**51:14321****31.**M. A. Steel (1994) Recovering a tree from the leaf colourations it generates under a Markov model,*App. Math. Lett.*,**7(2)**, 19-24.**32.**M.A. Steel, L.A. Székely and M. D. Hendy (1994). Reconstructing trees when sequence sites evolve at variable rates,*Jour. Comput. Biol.*,**1**, 153-163.**33.**M. A. Steel (2001), My favorite conjecture,*http://www.math.canterbury.ac.nz/**mathmas/ conjecture.pdf*.

Retrieve articles in *Transactions of the American Mathematical Society*
with MSC (2000):
60K35,
92D15,
60J85,
82B26

Retrieve articles in all journals with MSC (2000): 60K35, 92D15, 60J85, 82B26

Additional Information

**Elchanan Mossel**

Affiliation:
Department of Statistics, Evans Hall, University of California, Berkeley, California 94720-3860

Email:
mossel@stat.berkeley.edu

DOI:
https://doi.org/10.1090/S0002-9947-03-03382-8

Keywords:
Phylogeny,
phase transition,
Ising model

Received by editor(s):
May 21, 2002

Received by editor(s) in revised form:
April 10, 2003

Published electronically:
October 28, 2003

Additional Notes:
The author was supported by a Miller Fellowship. Most of the research reported here was conducted while the author was a PostDoc in theory group, Microsoft Research.

Article copyright:
© Copyright 2003
American Mathematical Society