Abstract
In this paper we study how to invert random functions under different criteria. The motivation for this study is phylogeny reconstruction, since the evolution of biomolecular sequences may be considered as a random function from the set of possible phylogenetic trees to the set of collections of biomolecular sequences of observed species. Our results may affect how we think about maximum likelihood estimation (MLE) in phylogeny. For inverting random functions, MLE is optimal under a first criterion, although it is not optimal under a second criterion which is at least equally natural but more conservative. Furthemore, MLE has to be used differently from the way it has been used in the phylogeny literature, if we have a prior distribution on trees and mutation mechanisms and want to keep MLE optimal under the same first criterion. Some of the results of this paper have been known in the setting of statistical decision theory, but have never been discussed in the context of phylogeny.
We’re sorry, something doesn't seem to be working properly.
Please try refreshing the page. If that doesn't work, please contact support so we can address the problem.
References
D.J. Aldous, Probability distributions on cladograms, In: Discrete Random Structures, D.J. Aldous and R. Permantle, Eds., IMA Vol. in Mathematics and its Applications, Vol. 76, Springer-Verlag, 1995, pp. 1–18.
A. Ambainis, R. Desper, M. Farach, and S. Kannan, Nearly tight bounds on the learnability of evolution, Proc. of the 38th IEEE Conference on the Foundations of Computer Science (FOCS'97), 1997, pp. 524–533.
J.O. Berger, Statistical Decision Theory and Bayesian Analysis, 2nd Ed., Springer Series in Statistics, Springer-Verlag, 1985.
J.K.M. Brown, Probabilities of evolutionary trees, Syst. Biol.43 (1994) 78–91.
J.A. Cavender, Taxonomy with confidence, Math. Biosci.40 (1978) 270–280.
M. Cryan, L.A. Goldberg, and P.W. Goldberg, Evolutionary trees can be learned in polynomial time in the two-state general Markov model, Proc. 39th Foundations of Computer Science (FOCS'98), 1998, pp. 436–445.
R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, Biological sequence analysis: Probabilistic models of proteins and nucleic acids, Cambridge University Press, Cambridge, 1998.
P.L. Erdős, M.A. Steel, L.A. Székely, and T. Warnow, A few logs suffice to build (almost) all trees I, Random Structures and Algorithms14 (1999) 153–184.
P.L. Erdős, M.A. Steel, L.A. Székely, and T. Warnow, A few logs suffice to build (almost) all trees II, to appear in Theoretical Computer Science.
M. Farach and S. Kannan, Efficient algorithms for inverting evolution, Proceedings of the ACM Symposium on the Foundations of Computer Science, 1996, pp. 230–236.
J. Felsenstein, Cases in which parsimony or compatibility methods will be positively misleading, Syst. Zool.27 (1978) 401–410.
J. Felsenstein, Evolutionary trees from DNA sequences: A maximum likelihood approach, J. Mol. Evol.17 (1981) 368–376.
E.F. Harding, The probabilities of rooted tree shapes generated by random bifurcation, Adv. Appl. Probab.3 (1971) 44–77.
M. Kimura, Estimation of evolutionary distances between homologous nucleotide sequences, Proc. Nat. Acad. Sci. USA78 (1981) 454–458.
M. Marcus and H. Minc, A Survey of Matrix Theory and Matrix Inequalities, Dover Publications Inc., New York, 1992.
J. Neyman, Molecular studies of evolution: A source of novel statistical problems, In: Statistical Decision Theory and Related Topics, S.S. Gupta and J. Yackel, Eds., Academic Press, New York, 1971, pp. 1–27.
A. Schrijver, Theory of Linear and Integer Programming, Wiley-Interscience Series in Discrete Mathematics, John Wiley & Sons Ltd., Chichester, 1986.
M.E. Siddall, Success of parsimony in the four-taxon case: Long-branch repulsion by likelihood in the Farris Zone, Cladistics14 (1998) 209–220.
M. Steel, M.D. Hendy, and D. Penny, Reconstructing trees from nucleotide pattern probabilities: A survey and some new results, Discrete Appl. Math.88 (1998) 367–396.
D.L. Swofford, G.J. Olsen, P.J. Waddell, and D.M. Hillis, Chapter 11: Phylogenetic inference, In: Molecular Systematics, D.M. Hillis, C. Moritz, and B.K. Mable, Eds., 2nd Ed., Sinauer Associates, Inc., Sunderland, 1996, pp. 407–514.
Z. Yang and B. Rannala, A Markov Chain Monte Carlo Method, Mol. Biol. Evol.14 (1997) 714–724.
Author information
Authors and Affiliations
Additional information
Supported by the New Zealand Marsden Fund.
Supported by the National Science Foundation grant DMS 9701211, and the Hungarian National Science Fund contract T 016 358.
Rights and permissions
About this article
Cite this article
Steel, M.A., Székely, L.A. Inverting random functions. Annals of Combinatorics 3, 103–113 (1999). https://doi.org/10.1007/BF01609880
Received:
Issue Date:
DOI: https://doi.org/10.1007/BF01609880