Summary
For bandwidth selection of a kernel density estimator, a generalization of the widely studied least squares cross-validation method is considered. The essential idea is to do a particular type of “presmoothing” of the data. This is seen to be essentially the same as using the smoothed bootstrap estimate of the mean integrated squared error. Analysis reveals that a rather large amount of presmoothing yields excellent asymptotic performance. The rate of convergence to the optimum is known to be best possible under a wide range of smoothness conditions. The method is more appealing than other selectors with this property, because its motivation is not heavily dependent on precise asymptotic analysis, and because its form is simple and intuitive. Theory is also given for choice of the amount of presmoothing, and this is used to derive a data-based method for this choice.
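To make the presmoothing idea concrete, here is a minimal sketch of a smoothed cross-validation criterion. It is not taken from the paper itself. It assumes one commonly quoted form of the criterion, SCV(h) = R(K)/(nh) + n^{-2} Σ_i Σ_j (K_h*K_h*L_g*L_g − 2 K_h*L_g*L_g + L_g*L_g)(X_i − X_j), with diagonal terms included, a Gaussian kernel K and a Gaussian presmoothing kernel L. The function names `gauss`, `scv` and `select_bandwidth`, the grid search, and the fixed pilot bandwidth `g` are all illustrative assumptions; the data-based choice of the amount of presmoothing mentioned above is not implemented here.

```python
import numpy as np


def gauss(u, s):
    """Normal density with mean 0 and standard deviation s, evaluated at u."""
    return np.exp(-0.5 * (u / s) ** 2) / (s * np.sqrt(2.0 * np.pi))


def scv(h, x, g):
    """Smoothed cross-validation score for bandwidth h (assumed form).

    x : 1-D sample
    g : presmoothing (pilot) bandwidth, must be > 0 in this sketch
    Both K and L are taken to be Gaussian, so every convolution is again
    a Gaussian whose variance is the sum of the component variances.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x[:, None] - x[None, :]  # all pairwise differences X_i - X_j

    # Variance part: R(K) / (n h), with R(K) = 1 / (2 sqrt(pi)) for the Gaussian.
    variance_term = 1.0 / (2.0 * np.sqrt(np.pi) * n * h)

    # Presmoothed estimate of the integrated squared bias.
    bias_term = (
        gauss(d, np.sqrt(2.0 * h**2 + 2.0 * g**2))    # K_h*K_h*L_g*L_g
        - 2.0 * gauss(d, np.sqrt(h**2 + 2.0 * g**2))  # K_h*L_g*L_g
        + gauss(d, np.sqrt(2.0) * g)                  # L_g*L_g
    ).sum() / n**2

    return variance_term + bias_term


def select_bandwidth(x, g, grid):
    """Return the grid point minimizing the SCV score, plus all scores."""
    grid = np.asarray(grid, dtype=float)
    values = np.array([scv(h, x, g) for h in grid])
    return grid[int(np.argmin(values))], values
```

With the diagonal terms included, the bias part equals the integrated squared difference between the pilot estimate (kernel L, bandwidth g) and its further smoothing by K_h, so it cannot be negative. Letting g shrink toward zero and dropping the diagonal terms gives, up to minor normalization differences, the ordinary least squares cross-validation score, which is the sense in which the method generalizes it.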
Additional information
Research of the second author was done while on leave from the University of North Carolina. That of both the second and third authors was partially supported by National Science Foundation Grants DMS-8701201 and DMS-8902973.
About this article
Cite this article
Hall, P., Marron, J.S. & Park, B.U. Smoothed cross-validation. Probab. Th. Rel. Fields 92, 1–20 (1992). https://doi.org/10.1007/BF01205233
DOI: https://doi.org/10.1007/BF01205233