From information scaling of natural images to regimes of statistical models
Author(s): Ying Nian Wu; Cheng-En Guo; Song-Chun Zhu
Journal: Quart. Appl. Math. 66 (2008), 81-122.
MSC (2000): Primary 62M40
Posted: December 5, 2007
Abstract:
Vision can be considered a highly specialized data collection and analysis problem. We need to understand the special properties of natural image data in order to construct statistical models and develop statistical methods for representing and recognizing the wide variety of natural image patterns. One fundamental property of natural image data that distinguishes vision from other sensory tasks such as speech recognition is that scale plays a profound role in image formation and interpretation. Specifically, visual objects can appear at a wide range of scales in the images due to the change of viewing distance as well as camera resolution. The same objects appearing at different scales produce different image data with different statistical properties. In particular, we show that the entropy rate of the image data changes over scale. Moreover, the inferential uncertainty changes over scale too. We call these changes information scaling. We then examine both empirically and theoretically two prominent and yet largely isolated classes of image models, namely, wavelet sparse coding models and Markov random field models. Our results indicate that the two classes of models are appropriate for two different entropy regimes: sparse coding targets low entropy regimes, whereas Markov random fields are appropriate for high entropy regimes. Because information scaling connects different entropy regimes, both sparse coding and Markov random fields are necessary for representing natural image data, and information scaling triggers transitions between these two regimes. This motivates us to propose a modeling scheme that embraces both regimes of models in a common framework. The contribution of our work is two-fold. First, the study of information scaling provides a unifying perspective for the rich variety of natural image patterns. Second, the modeling scheme that we develop provides a natural integration of different regimes of image models.
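The claim that image statistics change with scale can be probed with a small numerical experiment. The following is a minimal sketch, not the authors' method: it builds a hypothetical synthetic pattern (short bright bars on a dark background), repeatedly halves the resolution by 2x2 block averaging, and reports the empirical entropy of quantized horizontal pixel differences at each scale. That marginal entropy is only a crude stand-in for the entropy rate discussed in the abstract, and the function names, the test pattern, and the number of quantization levels are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def make_image(size=128, n_bars=40, bar_len=16, seed=0):
    """Hypothetical test pattern: short horizontal bars (value 1.0) on a 0.0 background."""
    rng = random.Random(seed)
    img = [[0.0] * size for _ in range(size)]
    for _ in range(n_bars):
        r = rng.randrange(size)
        c = rng.randrange(size - bar_len)
        for j in range(c, c + bar_len):
            img[r][j] = 1.0
    return img


def downscale(img):
    """Halve the resolution by averaging non-overlapping 2x2 blocks."""
    n = len(img) // 2
    return [
        [
            (img[2 * i][2 * j] + img[2 * i][2 * j + 1]
             + img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def diff_entropy_bits(img, levels=16):
    """Empirical entropy, in bits, of quantized horizontal pixel differences.

    Differences lie in [-1, 1] and are mapped to integer bins, so there are
    at most 2 * (levels - 1) + 1 distinct bins.
    """
    counts = Counter()
    for row in img:
        for a, b in zip(row, row[1:]):
            counts[round((b - a) * (levels - 1))] += 1
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def entropy_over_scales(img, n_scales=4):
    """Return the difference entropy at the original scale and after each halving."""
    values = []
    for _ in range(n_scales):
        values.append(diff_entropy_bits(img))
        img = downscale(img)
    return values


if __name__ == "__main__":
    for k, h in enumerate(entropy_over_scales(make_image())):
        print(f"scale 1/{2 ** k}: {h:.3f} bits per pixel difference")
```

Running the script prints one entropy value per scale, so the trend can be inspected directly. A real study would use natural images and a model-based estimate of the entropy rate rather than this marginal histogram.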
Additional Information:
Ying Nian Wu
Affiliation: Department of Statistics, University of California, Los Angeles, California
Cheng-En Guo
Affiliation: Acuity Technologies, Menlo Park, California
Song-Chun Zhu
Affiliation: Departments of Statistics and Computer Science, University of California, Los Angeles, California
PII: S0033-569X-07-01063-2
Received by editor(s): January 20, 2007
Posted: December 5, 2007
Copyright of article: Copyright 2007, Brown University
The copyright for this article reverts to public domain after 28 years from publication.