Computing likelihoods for coalescents with multiple collisions in the infinitely many sites model

Journal of Mathematical Biology

Abstract

One of the central problems in mathematical genetics is the inference of evolutionary parameters of a population (such as the mutation rate) based on the observed genetic types in a finite DNA sample. If the population model under consideration is in the domain of attraction of the classical Fleming–Viot process, such as the Wright–Fisher or the Moran model, then the standard means to describe its genealogy is Kingman’s coalescent. For this coalescent process, powerful inference methods are well established. An important feature of the above class of models is, roughly speaking, that the number of offspring of each individual is small compared to the total population size, and hence all ancestral collisions are binary. Recently, more general population models have been studied, in particular those in the domain of attraction of so-called generalised Λ-Fleming–Viot processes, as well as their (dual) genealogies, given by the so-called Λ-coalescents, which allow multiple collisions. Moreover, Eldon and Wakeley (Genetics 172:2621–2633, 2006) provide evidence that such more general coalescents might actually be more adequate to describe real populations with extreme reproductive behaviour, in particular many marine species. In this paper, we extend methods of Ethier and Griffiths (Ann Probab 15(2):515–545, 1987) and Griffiths and Tavaré (Theor Pop Biol 46:131–159, 1994a, Stat Sci 9:307–319, 1994b, Philos Trans Roy Soc Lond Ser B 344:403–410, 1994c, Math Biosci 127:77–98, 1995) to obtain a likelihood-based inference method for general Λ-coalescents. In particular, we obtain a method to compute (approximate) likelihood surfaces for the observed type probabilities of a given sample. We argue that within the (vast) family of Λ-coalescents, the parametrisable sub-family of Beta(2 − α, α)-coalescents, where α ∈ (1, 2], is of particular relevance. We illustrate our method using simulated datasets, thus obtaining maximum-likelihood estimators of mutation and demographic parameters.
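
To make the Beta(2 − α, α) sub-family concrete, the following is a minimal sketch of its merger rates. It assumes the standard Λ-coalescent rate formula λ_{b,k} = ∫ x^{k−2}(1 − x)^{b−k} Λ(dx), which for Λ = Beta(2 − α, α) evaluates to B(k − α, b − k + α)/B(2 − α, α), and it treats α = 2 as Kingman’s coalescent (binary mergers only). The function names are illustrative assumptions and are not taken from the paper or its accompanying software.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b), for a, b > 0."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_coalescent_rate(b, k, alpha):
    """Rate at which one fixed group of k out of b lineages merges.

    Assumes Lambda = Beta(2 - alpha, alpha) with 1 < alpha <= 2.
    alpha == 2 is treated as the Kingman case (pairwise mergers at rate 1).
    """
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    if not 1 < alpha <= 2:
        raise ValueError("need 1 < alpha <= 2")
    if alpha == 2:
        return 1.0 if k == 2 else 0.0
    # Work on the log scale to avoid overflow for large b.
    return exp(log_beta(k - alpha, b - k + alpha) - log_beta(2 - alpha, alpha))


def total_merger_rate(b, alpha):
    """Total rate of leaving a state with b lineages (any merger size)."""
    return sum(comb(b, k) * beta_coalescent_rate(b, k, alpha)
               for k in range(2, b + 1))
```

For example, `beta_coalescent_rate(10, 3, 1.5)` gives the rate of a specific triple merger among ten lineages, a quantity that is zero under Kingman’s coalescent; smaller values of α put more weight on such multiple collisions.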

References

  1. Árnason E. (2004). Mitochondrial cytochrome b DNA variation in the high-fecundity Atlantic cod: trans-Atlantic clines and shallow gene genealogy. Genetics 166: 1871–1885

  2. Berestycki N., Berestycki J. and Schweinsberg J. (2007). Beta-coalescents and continuous stable random trees. Ann. Probab. 35(5): 1835–1887

  3. Bertoin J. and Le Gall J.-F. (2003). Stochastic flows associated to coalescent processes. Probab. Theory Related Fields 126(2): 261–288

  4. Birkner M., Blath J., Capaldo M., Etheridge A., Möhle M., Schweinsberg J. and Wakolbinger A. (2005). Alpha-stable branching and Beta-coalescents. Electron. J. Probab. 10: 303–325

  5. http://www.wias-berlin.de/people/birkner/bgt

  6. Birkner M. and Blath J. (2007). Measure-valued diffusions, general coalescents and population genetic inference. In: Trends in Stochastic Analysis: a Festschrift for Heinrich von Weizsäcker (to appear)

  7. Boom J.D.G., Boulding E.G. and Beckenbach A.T. (1994). Mitochondrial DNA variation in introduced populations of Pacific oyster, Crassostrea gigas, in British Columbia. Can. J. Fish. Aquat. Sci. 51: 1608–1614

  8. Bovier A. (2006). Statistical Mechanics of Disordered Systems. Cambridge University Press, Cambridge

  9. Cannings C. (1974). The latent roots of certain Markov chains arising in genetics: a new approach, I. Haploid models. Adv. Appl. Prob. 6: 260–290

  10. Cannings C. (1975). The latent roots of certain Markov chains arising in genetics: a new approach, II Further haploid models. Adv. Appl. Prob. 7: 264–282

  11. Dawson D. (1993). Lecture Notes, Ecole d’Eté de Probabilités de Saint-Flour XXI. Springer, Berlin

  12. De Iorio M. and Griffiths R.C. (2004). Importance sampling on coalescent histories I. Adv. Appl. Probab. 36: 417–433

  13. Donnelly P. and Kurtz T. (1999). Particle representations for measure-valued population models. Ann. Probab. 27(1): 166–205

  14. Durrett R. and Schweinsberg J. (2005). A coalescent model for the effect of advantageous mutations on the genealogy of a population. Stoch. Proc. Appl. 115: 1628–1657

  15. Eldon B. and Wakeley J. (2006). Coalescent processes when the distribution of offspring number among individuals is highly skewed. Genetics 172: 2621–2633

  16. Ewens W.J. (1979). Mathematical Population Genetics. Springer, Berlin

  17. Ethier S. and Griffiths R.C. (1987). The infinitely-many-sites model as a measure-valued diffusion. Ann. Probab. 15(2): 515–545

  18. Ethier S. and Kurtz T. (1986). Markov Processes: Characterization and Convergence. Wiley, New York

  19. Ethier S. and Kurtz T. (1993). Fleming–Viot processes in population genetics. SIAM J. Control Optim. 31(2): 345–386

  20. Felsenstein J., Kuhner M.K., Yamato J. and Beerli P. (1999). Likelihoods on coalescents: a Monte Carlo sampling approach to inferring parameters from population samples of molecular data. IMS Lecture Notes Monogr Ser 33: 163–185

  21. Griffiths R.C. (1989). Genealogical-tree probabilities in the infinitely-many-site model. J. Math. Biol. 27(6): 667–680

  22. Griffiths R.C. and Tavaré S. (1994). Simulating probability distributions in the coalescent. Theor. Pop. Biol. 46: 131–159

  23. Griffiths R.C. and Tavaré S. (1994). Ancestral inference in population genetics. Stat. Sci. 9: 307–319

  24. Griffiths R.C. and Tavaré S. (1994). Sampling theory for neutral alleles in a varying environment. Philos. Trans. Roy. Soc. Lond. Ser B 344: 403–410

  25. Griffiths R.C. and Tavaré S. (1995). Unrooted genealogical tree probabilities in the infinitely-many-sites model. Math. Biosci. 127: 77–98

  26. Griffiths R.C. and Tavaré S. (1996). Monte Carlo inference methods in population genetics. Monte Carlo and quasi-Monte Carlo methods. Math. Comput. Model. 23(8–9): 141–158

  27. Griffiths R.C. and Tavaré S. (1996). Markov chain inference methods in population genetics. Math. Comput. Model. 23(8/9): 141–158

  28. Griffiths R.C. and Tavaré S. (1997). Computational methods for the coalescent. In: Progress in Population Genetics and Human Evolution, pp. 165–182. Springer, Heidelberg

  29. Gusfield D. (1991). Efficient algorithms for inferring evolutionary trees. Networks 21(1): 19–28

  30. Hoppe F.M. (1984). Pólya-like urns and the Ewens’ sampling formula. J. Math. Biol. 20(1): 91–94

  31. Hudson R.R. (1990). Gene genealogies and the coalescent process. Oxford Surv. Evolut. Biol. 7: 1–44

  32. Hein J., Schierup M.H. and Wiuf C. (2005). Gene Genealogies, Variation and Evolution – A Primer in Coalescent Theory. Oxford University Press, Oxford

  33. Kimura M. (1969). The number of heterozygous nucleotide sites maintained in a finite population due to a steady flux of mutations. Genetics 61: 893–903

  34. Kingman J.F.C. (1982). The coalescent. Stoch. Proc. Appl. 13: 235–248

  35. Möhle M. (2006). On sampling distributions for coalescent processes with simultaneous multiple collisions. Bernoulli 12: 35–53

  36. Möhle M. and Sagitov S. (2001). A classification of coalescent processes for haploid exchangeable population models. Ann. Probab. 29: 1547–1562

  37. Nordborg M. (2001). Coalescent theory. In: Balding, D., Bishop, M. and Cannings, C. (eds) Handbook of Statistical Genetics, pp 179–208. Wiley, New York

  38. Pitman J. (1999). Coalescents with multiple collisions. Ann. Probab. 27(4): 1870–1902

  39. Rogers L.C.G. and Williams D. (1994). Diffusions, Markov Processes and Martingales, vol. 1, 2nd edn. Wiley, New York

  40. Sagitov S. (1999). The general coalescent with asynchronous mergers of ancestral lines. J. Appl. Probab. 36(4): 1116–1125

  41. Schweinsberg J. (2000). A necessary and sufficient condition for the Λ-coalescent to come down from infinity. Electron. Commun. Probab. 5: 1–11

  42. Schweinsberg J. (2003). Coalescent processes obtained from supercritical Galton-Watson processes. Stoch. Proc. Appl. 106: 107–139

  43. Stephens M. and Donnelly P. (2000). Inference in molecular population genetics. J. Roy. Stat. Soc. B. 62: 605–655

  44. Studier J. and Keppler K. (1988). A note on the neighbor-joining algorithm of Saitou and Nei. Mol. Biol. Evol. 5: 729–731

  45. Tavaré S. (2001). Ancestral Inference in Population Genetics. Springer Lecture Notes, vol. 1837

  46. Wakeley J. (2007). Coalescent theory (to appear)

  47. Waterman M.S., Smith T.F., Singh M. and Beyer W.A. (1977). Additive evolutionary trees. J. Theor. Biol. 64: 199–213

  48. Watterson G.A. (1975). On the number of segregating sites in genetical models without recombination. Theor. Popul. Biol. 7: 256–276

Author information

Corresponding author

Correspondence to Jochen Blath.

Additional information

This work has been partially supported by EPSRC GR/R985603.

About this article

Cite this article

Birkner, M., Blath, J. Computing likelihoods for coalescents with multiple collisions in the infinitely many sites model. J. Math. Biol. 57, 435–465 (2008). https://doi.org/10.1007/s00285-008-0170-6

  • DOI: https://doi.org/10.1007/s00285-008-0170-6
