Bayesian shape-constrained density estimation
Authors:
Sutanoy Dasgupta, Debdeep Pati and Anuj Srivastava
Journal:
Quart. Appl. Math. 77 (2019), 399-422
MSC (2010):
Primary 65C60, 62G05; Secondary 57N25, 49Q10
DOI:
https://doi.org/10.1090/qam/1529
Published electronically:
January 8, 2019
MathSciNet review:
3932964
Abstract: Estimating the probability density underlying given i.i.d. samples is a fundamental problem in statistics. Taking a Bayesian nonparametric approach, we put forth a geometric solution that uses different actions of the diffeomorphism (domain warping) group on the set of positive pdfs to explore this space more efficiently. This representation shifts the focus from pdfs to the diffeomorphism group and allows efficient solutions for density estimation under shape (or modality) constraints, i.e., estimation of a pdf given a fixed or a maximum number of modes. Focusing on univariate density estimation, we use the geometry of a (one-dimensional) diffeomorphism group to reach an (approximate) finite-dimensional Euclidean representation of warping functions, and impose a shrinkage prior on this space to form a posterior distribution. We sample this posterior using a Markov chain Monte Carlo (MCMC) algorithm and form Bayesian estimates of the unknown pdf. This framework results in a novel pdf estimator, with and without shape constraints, and we demonstrate it in a number of simulated and real data experiments.
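As a rough illustration of how a domain warping can act on a pdf, the sketch below applies the standard change-of-variables action, (f, γ) ↦ (f ∘ γ) γ′, to a density on [0, 1]. This is only one possible action and is not taken from the paper, whose abstract mentions several actions without specifying them here. The template density, the one-parameter warp family, the parameter `A`, and all function names are hypothetical choices made for the example. The point it shows is that any increasing γ with γ(0) = 0 and γ(1) = 1 turns a pdf into another pdf, so a search over densities can be recast as a search over warps.

```python
import numpy as np


def warp_density(f, gamma, gamma_dot):
    """Return x -> f(gamma(x)) * gamma_dot(x), the warped density on [0, 1]."""
    def warped(x):
        return f(gamma(x)) * gamma_dot(x)
    return warped


def template(x):
    # Hypothetical template: the Beta(2, 2) density on [0, 1].
    return 6.0 * x * (1.0 - x)


# Hypothetical warp family: gamma(x) = x + A * sin(2*pi*x) / (2*pi).
# For |A| < 1 the derivative stays positive, so gamma is a diffeomorphism
# of [0, 1] that fixes both endpoints.
A = 0.5


def gamma(x):
    return x + A * np.sin(2.0 * np.pi * x) / (2.0 * np.pi)


def gamma_dot(x):
    return 1.0 + A * np.cos(2.0 * np.pi * x)


x = np.linspace(0.0, 1.0, 2001)
values = warp_density(template, gamma, gamma_dot)(x)

# Trapezoidal rule written out by hand to avoid depending on a specific
# NumPy version for the integration helper.
total_mass = float(np.sum((values[1:] + values[:-1]) * np.diff(x)) / 2.0)

print(f"integral of warped density: {total_mass:.6f}")
```

The warped function stays nonnegative and integrates to one (up to quadrature error) because the substitution u = γ(x) maps the integral back to that of the template. A Bayesian treatment along the lines of the abstract would place a prior on a finite-dimensional parameterization of γ; the single parameter `A` here merely stands in for that idea.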
References
- Yali Amit, Ulf Grenander, and Mauro Piccioni, Structural image restoration through deformable templates, Journal of the American Statistical Association 86 (1991), no. 414, 376–387.
- Richard E Barlow, Statistical inference under order restrictions; the theory and application of isotonic regression, 1972.
- Pierre C. Bellec and Alexandre B. Tsybakov, Sharp oracle bounds for monotone and convex regression through aggregation, J. Mach. Learn. Res. 16 (2015), 1879–1892. MR 3417801
- Anirban Bhattacharya, Debdeep Pati, and David B Dunson, Latent factor density regression models (2012).
- Peter J. Bickel and Jianqing Fan, Some problems on the estimation of unimodal densities, Statist. Sinica 6 (1996), no. 1, 23–45. MR 1379047
- Lucien Birgé, Estimation of unimodal densities without smoothness assumptions, Ann. Statist. 25 (1997), no. 3, 970–981. MR 1447736, DOI https://doi.org/10.1214/aos/1069362733
- Hugh D. Brunk, Estimation of isotonic regression, University of Missouri-Columbia, 1969.
- Lawrence J. Brunner and Albert Y. Lo, Bayes methods for a symmetric unimodal density and its mode, Ann. Statist. 17 (1989), no. 4, 1550–1566. MR 1026299, DOI https://doi.org/10.1214/aos/1176347381
- Sutanoy Dasgupta, Debdeep Pati, Ian H. Jermyn, and Anuj Srivastava, Shape-Constrained Univariate Density Estimation, ArXiv e-prints (2018-04), available at 1804.01458.
- Sutanoy Dasgupta, Debdeep Pati, and Anuj Srivastava, A geometric framework for density modeling, arXiv preprint arXiv:1701.05656 (2017).
- Hassan Doosti and Peter Hall, Making a non-parametric density estimator more attractive, and more accurate, by data perturbation, J. R. Stat. Soc. Ser. B. Stat. Methodol. 78 (2016), no. 2, 445–462. MR 3454204, DOI https://doi.org/10.1111/rssb.12120
- Michael D. Escobar and Mike West, Bayesian density estimation and inference using mixtures, J. Amer. Statist. Assoc. 90 (1995), no. 430, 577–588. MR 1340510
- Fuchang Gao and Jon A. Wellner, Global rates of convergence of the MLE for multivariate interval censoring, Electron. J. Stat. 7 (2013), 364–380. MR 3020425, DOI https://doi.org/10.1214/13-EJS777
- Ulf Grenander, On the theory of mortality measurement. II, Skand. Aktuarietidskr. 39 (1956), 125–153 (1957). MR 93415, DOI https://doi.org/10.1080/03461238.1956.10414944
- U. Grenander, Y. Chow, and D. M. Keenan, Hands, Research Notes in Neural Computing, vol. 2, Springer-Verlag, New York, 1991. A pattern-theoretic study of biological shapes. MR 1084371
- Peter Hall and Li-Shan Huang, Unimodal density estimation using kernel methods, Statist. Sinica 12 (2002), no. 4, 965–990. MR 1947056
- Peter Hall, Simon J. Sheather, M. C. Jones, and J. S. Marron, On optimal data-based bandwidth selection in kernel density estimation, Biometrika 78 (1991), no. 2, 263–269. MR 1131158, DOI https://doi.org/10.1093/biomet/78.2.263
- Clifford Hildreth, Point estimates of ordinates of concave functions, J. Amer. Statist. Assoc. 49 (1954), 598–619. MR 65093
- Nils Lid Hjort and Ingrid K. Glad, Nonparametric density estimation with a parametric start, Ann. Statist. 23 (1995), no. 3, 882–904. MR 1345205, DOI https://doi.org/10.1214/aos/1176324627
- Alan Julian Izenman, Recent developments in nonparametric density estimation, J. Amer. Statist. Assoc. 86 (1991), no. 413, 205–224. MR 1137112
- Sonia Jain and Radford M. Neal, A split-merge Markov chain Monte Carlo procedure for the Dirichlet process mixture model, J. Comput. Graph. Statist. 13 (2004), no. 1, 158–182. MR 2044876, DOI https://doi.org/10.1198/1061860043001
- Maria Kalli, Jim E. Griffin, and Stephen G. Walker, Slice sampling mixture models, Stat. Comput. 21 (2011), no. 1, 93–105. MR 2746606, DOI https://doi.org/10.1007/s11222-009-9150-y
- Roger Koenker and Olga Geling, Reappraising medfly longevity: a quantile regression survival analysis, J. Amer. Statist. Assoc. 96 (2001), no. 454, 458–468. MR 1939348, DOI https://doi.org/10.1198/016214501753168172
- S. Kundu and D. B. Dunson, Latent factor models for density estimation, Biometrika 101 (2014), no. 3, 641–654. MR 3254906, DOI https://doi.org/10.1093/biomet/asu019
- Peter J. Lenk, The logistic normal distribution for Bayesian, nonparametric, predictive densities, J. Amer. Statist. Assoc. 83 (1988), no. 402, 509–516. MR 971380
- Peter J. Lenk, Towards a practicable Bayesian nonparametric density estimator, Biometrika 78 (1991), no. 3, 531–543. MR 1130921, DOI https://doi.org/10.1093/biomet/78.3.531
- Tom Leonard, Density estimation, stochastic processes and prior information, J. Roy. Statist. Soc. Ser. B 40 (1978), no. 2, 113–146. With discussion. MR 517434
- Qi Li and Jeffrey Scott Racine, Nonparametric econometrics, Princeton University Press, Princeton, NJ, 2007. Theory and practice. MR 2283034
- Steven N MacEachern and Peter Müller, Estimating mixture of dirichlet process models, Journal of Computational and Graphical Statistics 7 (1998), no. 2, 223–238.
- Mary C. Meyer, An alternative unimodal density estimator with a consistent estimate of the mode, Statist. Sinica 11 (2001), no. 4, 1159–1174. MR 1867337
- Washington Mio, Anuj Srivastava, and Shantanu Joshi, On shape of plane elastic curves, International Journal of Computer Vision 73 (2007), no. 3, 307–324.
- Peter Müller, Alaattin Erkanli, and Mike West, Bayesian curve fitting using multivariate normal mixtures, Biometrika 83 (1996), no. 1, 67–79. MR 1399156, DOI https://doi.org/10.1093/biomet/83.1.67
- B. L. S. Prakasa Rao, Estimation of a unimodal density, Sankhyā Ser. A 31 (1969), 23–36. MR 267677
- Murray Rosenblatt, Remarks on some nonparametric estimates of a density function, Ann. Math. Statist. 27 (1956), 832–837. MR 79873, DOI https://doi.org/10.1214/aoms/1177728190
- S. J. Sheather and M. C. Jones, A reliable data-based bandwidth selection method for kernel density estimation, J. Roy. Statist. Soc. Ser. B 53 (1991), no. 3, 683–690. MR 1125725
- Allyson Souris, Anirban Bhattacharya, and Debdeep Pati, The soft multivariate truncated normal distribution, arXiv preprint arXiv:1807.09155 (2018).
- Anuj Srivastava, Eric Klassen, Shantanu H. Joshi, and Ian H. Jermyn, Shape analysis of elastic curves in Euclidean spaces, IEEE Trans. PAMI 33 (2011), 1415–1428.
- Anuj Srivastava and Eric P. Klassen, Functional and shape data analysis, Springer Series in Statistics, Springer-Verlag, New York, 2016. MR 3821566
- Surya T. Tokdar, Towards a faster implementation of density estimation with logistic Gaussian process priors, J. Comput. Graph. Statist. 16 (2007), no. 3, 633–655. MR 2351083, DOI https://doi.org/10.1198/106186007X210206
- Surya T. Tokdar, Yu M. Zhu, and Jayanta K. Ghosh, Bayesian density regression with logistic Gaussian process and subspace projection, Bayesian Anal. 5 (2010), no. 2, 319–344. MR 2719655, DOI https://doi.org/10.1214/10-BA605
- Bradley C. Turnbull and Sujit K. Ghosh, Unimodal density estimation using Bernstein polynomials, Comput. Statist. Data Anal. 72 (2014), 13–29. MR 3139345, DOI https://doi.org/10.1016/j.csda.2013.10.021
- J. Wang and S. K. Ghosh, Shape restricted nonparametric regression with Bernstein polynomials, Comput. Statist. Data Anal. 56 (2012), no. 9, 2729–2741. MR 2915158, DOI https://doi.org/10.1016/j.csda.2012.02.018
- Edward J. Wegman, Maximum likelihood estimation of a unimodal density. II, Ann. Math. Statist. 41 (1970), 2169–2174. MR 267681, DOI https://doi.org/10.1214/aoms/1177696724
- Matthew W. Wheeler, David B. Dunson, Sudha P. Pandalai, Brent A. Baker, and Amy H. Herring, Mechanistic hierarchical Gaussian processes, J. Amer. Statist. Assoc. 109 (2014), no. 507, 894–904. MR 3265664, DOI https://doi.org/10.1080/01621459.2014.899234
- Laurent Younes, Computable elastic distances between shapes, SIAM J. Appl. Math. 58 (1998), no. 2, 565–586. MR 1617630, DOI https://doi.org/10.1137/S0036139995287685
- Laurent Younes, Peter W. Michor, Jayant Shah, and David Mumford, A metric on shape space with explicit geodesics, Atti Accad. Naz. Lincei Rend. Lincei Mat. Appl. 19 (2008), no. 1, 25–57. MR 2383560, DOI https://doi.org/10.4171/RLM/506
Additional Information
Sutanoy Dasgupta
Affiliation:
Department of Statistics, Florida State University, Tallahassee, Florida 32306
Email:
sdasgupta@stat.fsu.edu
Debdeep Pati
Affiliation:
Department of Statistics, Texas A&M University, College Station, Texas 77843
MR Author ID:
948469
Email:
debdeep@stat.tamu.edu
Anuj Srivastava
Affiliation:
Department of Statistics, Florida State University, Tallahassee, Florida 32306
MR Author ID:
614904
Email:
anuj@stat.fsu.edu
Keywords:
Shape analysis,
density estimation,
warping groups,
deformable template,
shape constraints.
Received by editor(s):
April 16, 2018
Received by editor(s) in revised form:
October 8, 2018
Additional Notes:
The second author’s research was supported by NSF DMS 1613156.
The third author’s research was supported in part by NSF grants DMS CDS&E 1621787 and CCF 1617397.
Dedicated:
This paper is dedicated to Professor Ulf Grenander
Article copyright:
© Copyright 2019
Brown University