Abstract
One of the central problems in mathematical genetics is the inference of evolutionary parameters of a population (such as the mutation rate) based on the observed genetic types in a finite DNA sample. If the population model under consideration is in the domain of attraction of the classical Fleming–Viot process, such as the Wright–Fisher or the Moran model, then the standard means to describe its genealogy is Kingman’s coalescent. For this coalescent process, powerful inference methods are well established. An important feature of the above class of models is, roughly speaking, that the number of offspring of each individual is small when compared to the total population size, and hence all ancestral collisions are binary only. Recently, more general population models have been studied, in particular in the domain of attraction of so-called generalised Λ-Fleming–Viot processes, as well as their (dual) genealogies, given by the so-called Λ-coalescents, which allow multiple collisions. Moreover, Eldon and Wakeley (Genetics 172:2621–2633, 2006) provide evidence that such more general coalescents might actually be more adequate to describe real populations with extreme reproductive behaviour, in particular many marine species. In this paper, we extend methods of Ethier and Griffiths (Ann Probab 15(2):515–545, 1987) and Griffiths and Tavaré (Theor Pop Biol 46:131–159, 1994a, Stat Sci 9:307–319, 1994b, Philos Trans Roy Soc Lond Ser B 344:403–410, 1994c, Math Biosci 127:77–98, 1995) to obtain a likelihood-based inference method for general Λ-coalescents. In particular, we obtain a method to compute (approximate) likelihood surfaces for the observed type probabilities of a given sample. We argue that within the (vast) family of Λ-coalescents, the parametrisable sub-family of Beta(2 − α, α)-coalescents, where α ∈ (1, 2], is of particular relevance.
We illustrate our method using simulated datasets, thus obtaining maximum-likelihood estimators of mutation and demographic parameters.
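As an illustration of the Beta(2 − α, α) sub-family mentioned above, the following minimal Python sketch computes the rate λ_{b,k} at which a given group of k out of b ancestral lineages merges. It assumes the standard representation of Λ-coalescent rates, λ_{b,k} = ∫ x^{k−2}(1 − x)^{b−k} Λ(dx), which for Λ = Beta(2 − α, α) with α ∈ (1, 2) evaluates to B(k − α, b − k + α) / B(2 − α, α), and it treats α = 2 as Kingman’s coalescent (binary mergers only). The function names are hypothetical and are not taken from the paper.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b), for a, b > 0."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_coalescent_rate(b, k, alpha):
    """Rate at which a fixed group of k out of b lineages merges.

    Assumes Lambda = Beta(2 - alpha, alpha) with alpha in (1, 2];
    alpha = 2 is treated as Kingman's coalescent (pairs only, rate 1).
    """
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    if alpha == 2.0:
        return 1.0 if k == 2 else 0.0
    if not 1.0 < alpha < 2.0:
        raise ValueError("alpha must lie in (1, 2]")
    # Work on the log scale to avoid overflow for large b.
    return exp(log_beta(k - alpha, b - k + alpha) - log_beta(2.0 - alpha, alpha))


def total_merger_rate(b, alpha):
    """Total rate at which some merger occurs among b lineages."""
    return sum(comb(b, k) * beta_coalescent_rate(b, k, alpha)
               for k in range(2, b + 1))
```

For example, `beta_coalescent_rate(10, 3, 1.5)` gives the rate of one particular triple merger among ten lineages, and `total_merger_rate(10, 1.5)` the overall rate of leaving the state with ten lineages.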
References
Árnason E. (2004). Mitochondrial cytochrome b DNA variation in the high-fecundity Atlantic cod: trans-Atlantic clines and shallow gene genealogy. Genetics 166: 1871–1885
Berestycki N., Berestycki J. and Schweinsberg J. (2007). Beta-coalescents and continuous stable random trees. Ann. Probab. 35(5): 1835–1887
Bertoin J. and Le Gall J.-F. (2003). Stochastic flows associated to coalescent processes. Probab. Theory Related Fields 126(2): 261–288
Birkner M., Blath J., Capaldo M., Etheridge A., Möhle M., Schweinsberg J. and Wakolbinger A. (2005). Alpha-stable branching and Beta-coalescents. Electron. J. Probab. 10: 303–325
Birkner M. and Blath J. (2007). Measure-valued diffusions, general coalescents and population genetic inference. In: Trends in Stochastic Analysis: a Festschrift for Heinrich von Weizsäcker (to appear)
Boom J.D.G., Boulding E.G. and Beckenbach A.T. (1994). Mitochondrial DNA variation in introduced populations of Pacific oyster, Crassostrea gigas, in British Columbia. Can. J. Fish. Aquat. Sci. 51: 1608–1614
Bovier A. (2006). Statistical Mechanics of Disordered Systems. Cambridge University Press, Cambridge
Cannings C. (1974). The latent roots of certain Markov chains arising in genetics: a new approach, I. Haploid models. Adv. Appl. Probab. 6: 260–290
Cannings C. (1975). The latent roots of certain Markov chains arising in genetics: a new approach, II. Further haploid models. Adv. Appl. Probab. 7: 264–282
Dawson D. (1993). Lecture Notes, Ecole d’Eté de Probabilités de Saint-Flour XXI. Springer, Berlin
De Iorio M. and Griffiths R.C. (2004). Importance sampling on coalescent histories I. Adv. Appl. Probab. 36: 417–433
Donnelly P. and Kurtz T. (1999). Particle representations for measure-valued population models. Ann. Probab. 27(1): 166–20
Durrett R. and Schweinsberg J. (2005). A coalescent model for the effect of advantageous mutations on the genealogy of a population. Stoch. Proc. Appl. 115: 1628–1657
Eldon B. and Wakeley J. (2006). Coalescent processes when the distribution of offspring number among individuals is highly skewed. Genetics 172: 2621–2633
Ewens W.J. (1979). Mathematical Population Genetics. Springer, Berlin
Ethier S. and Griffiths R.C. (1987). The infinitely-many-sites model as a measure-valued diffusion. Ann. Probab. 15(2): 515–545
Ethier S. and Kurtz T. (1986). Markov Processes: Characterization and Convergence. Wiley, New York
Ethier S. and Kurtz T. (1993). Fleming–Viot processes in population genetics. SIAM J. Control Optim. 31(2): 345–386
Felsenstein J., Kuhner M.K., Yamato J. and Beerli P. (1999). Likelihoods on coalescents: a Monte Carlo sampling approach to inferring parameters from population samples of molecular data. IMS Lecture Notes Monogr Ser 33: 163–185
Griffiths R.C. (1989). Genealogical-tree probabilities in the infinitely-many-site model. J. Math. Biol. 27(6): 667–680
Griffiths R.C. and Tavaré S. (1994a). Simulating probability distributions in the coalescent. Theor. Pop. Biol. 46: 131–159
Griffiths R.C. and Tavaré S. (1994b). Ancestral inference in population genetics. Stat. Sci. 9: 307–319
Griffiths R.C. and Tavaré S. (1994c). Sampling theory for neutral alleles in a varying environment. Philos. Trans. Roy. Soc. Lond. Ser. B 344: 403–410
Griffiths R.C. and Tavaré S. (1995). Unrooted genealogical tree probabilities in the infinitely-many-sites model. Math. Biosci. 127: 77–98
Griffiths R.C. and Tavaré S. (1996). Monte Carlo inference methods in population genetics. Monte Carlo and quasi-Monte Carlo methods. Math. Comput. Model. 23(8–9): 141–158
Griffiths R.C. and Tavaré S. (1996). Markov chain inference methods in population genetics. Math. Comput. Model. 23(8/9): 141–158
Griffiths R.C. and Tavaré S. (1997). Computational methods for the coalescent. In: Progress in Population Genetics and Human Evolution, pp. 165–182. Springer, Heidelberg
Gusfield D. (1991). Efficient algorithms for inferring evolutionary trees. Networks 21(1): 19–28
Hoppe F.M. (1984). Pólya-like urns and the Ewens’ sampling formula. J. Math. Biol. 20(1): 91–94
Hudson R.R. (1990). Gene genealogies and the coalescent process. Oxford Surv. Evolut. Biol. 7: 1–44
Hein J., Schierup M.H. and Wiuf C. (2005). Gene Genealogies, Variation and Evolution – A Primer in Coalescent Theory. Oxford University Press, Oxford
Kimura M. (1969). The number of heterozygous nucleotide sites maintained in a finite population due to a steady flux of mutations. Genetics 61: 893–903
Kingman J.F.C. (1982). The coalescent. Stoch. Proc. Appl. 13: 235–248
Möhle M. (2006). On sampling distributions for coalescent processes with simultaneous multiple collisions. Bernoulli 12: 35–53
Möhle M. and Sagitov S. (2001). A classification of coalescent processes for haploid exchangeable population models. Ann. Probab. 29: 1547–1562
Nordborg M. (2001). Coalescent theory. In: Balding D., Bishop M. and Cannings C. (eds) Handbook of Statistical Genetics, pp. 179–208. Wiley, New York
Pitman J. (1999). Coalescents with multiple collisions. Ann. Probab. 27(4): 1870–1902
Rogers L.C.G. and Williams D. (1994). Diffusions, Markov Processes and Martingales, vol. 1, 2nd edn. Wiley, New York
Sagitov S. (1999). The general coalescent with asynchronous mergers of ancestral lines. J. Appl. Probab. 36(4): 1116–1125
Schweinsberg J. (2000). A necessary and sufficient condition for the Λ-coalescent to come down from infinity. Electron. Commun. Probab. 5: 1–11
Schweinsberg J. (2003). Coalescent processes obtained from supercritical Galton-Watson processes. Stoch. Proc. Appl. 106: 107–139
Stephens M. and Donnelly P. (2000). Inference in molecular population genetics. J. Roy. Stat. Soc. B. 62: 605–655
Studier J. and Keppler K. (1988). A note on the neighbor-joining algorithm of Saitou and Nei. Mol. Biol. Evol. 5: 729–731
Tavaré S. (2001). Ancestral Inference in Population Genetics. Springer Lecture Notes, vol. 1837
Wakeley J. (2007). Coalescent Theory (to appear)
Waterman M.S., Smith T.F., Singh M. and Beyer W.A. (1977). Additive evolutionary trees. J. Theor. Biol. 64: 199–213
Watterson G.A. (1975). On the number of segregating sites in genetical models without recombination. Theor. Pop. Biol. 7: 256–276
Additional information
This work has been partially supported by EPSRC GR/R985603.
Cite this article
Birkner, M., Blath, J. Computing likelihoods for coalescents with multiple collisions in the infinitely many sites model. J. Math. Biol. 57, 435–465 (2008). https://doi.org/10.1007/s00285-008-0170-6
DOI: https://doi.org/10.1007/s00285-008-0170-6
Keywords
- Λ-coalescent
- Likelihood-based inference
- Infinitely-many-sites model
- Population genetics
- Fleming–Viot process
- Multiple collisions
- Monte-Carlo method