Quarterly of Applied Mathematics

Online ISSN 1552-4485; Print ISSN 0033-569X

From information scaling of natural images to regimes of statistical models


Authors: Ying Nian Wu, Cheng-En Guo and Song-Chun Zhu
Journal: Quart. Appl. Math. 66 (2008), 81-122
MSC (2000): Primary 62M40
DOI: https://doi.org/10.1090/S0033-569X-07-01063-2
Published electronically: December 5, 2007
MathSciNet review: 2396653

Abstract: Vision can be considered a highly specialized data collection and analysis problem. We need to understand the special properties of natural image data in order to construct statistical models and develop statistical methods for representing and recognizing the wide variety of natural image patterns. One fundamental property of natural image data that distinguishes vision from other sensory tasks such as speech recognition is that scale plays a profound role in image formation and interpretation. Specifically, visual objects can appear at a wide range of scales in the images due to the change of viewing distance as well as camera resolution. The same objects appearing at different scales produce different image data with different statistical properties. In particular, we show that the entropy rate of the image data changes over scale. Moreover, the inferential uncertainty changes over scale too. We call these changes information scaling. We then examine both empirically and theoretically two prominent and yet largely isolated classes of image models, namely, wavelet sparse coding models and Markov random field models. Our results indicate that the two classes of models are appropriate for two different entropy regimes: sparse coding targets low entropy regimes, whereas Markov random fields are appropriate for high entropy regimes. Because information scaling connects different entropy regimes, both sparse coding and Markov random fields are necessary for representing natural image data, and information scaling triggers transitions between these two regimes. This motivates us to propose a modeling scheme that embraces both regimes of models in a common framework. The contribution of our work is two-fold. First, the study of information scaling provides a unifying perspective for the rich variety of natural image patterns. Second, the modeling scheme that we develop provides a natural integration of different regimes of image models.
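
As a rough illustration of the statement that entropy changes over scale, the following minimal sketch measures per-pixel entropy of a synthetic binary image at successively coarser scales. It is a toy setup and not the authors' experiment: the independent sparse "on" pixels, the 2x2 OR-pooling used as a stand-in for increased viewing distance, and all function names (`make_sparse_image`, `downscale`, `marginal_entropy`, `entropy_across_scales`) are assumptions introduced here. Because the pixels are independent and the pooling blocks are disjoint, the marginal entropy per pixel coincides with the entropy rate in this special case only.

```python
import math
import random


def make_sparse_image(size, density, seed=0):
    """Return a size x size binary image whose pixels are independently
    'on' with probability `density` (a hypothetical toy scene)."""
    rng = random.Random(seed)
    return [[1 if rng.random() < density else 0 for _ in range(size)]
            for _ in range(size)]


def downscale(img, factor=2):
    """Shrink the image by `factor`: a coarse pixel is 'on' if any pixel in
    its factor x factor block is 'on' (an assumed, simplified camera model)."""
    n = len(img) // factor
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            block_on = any(img[i * factor + a][j * factor + b]
                           for a in range(factor) for b in range(factor))
            row.append(1 if block_on else 0)
        out.append(row)
    return out


def marginal_entropy(img):
    """Empirical entropy, in bits per pixel, of the pixel-value histogram."""
    counts = {}
    total = 0
    for row in img:
        for value in row:
            counts[value] = counts.get(value, 0) + 1
            total += 1
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_across_scales(size=256, density=0.02, levels=3, seed=0):
    """Entropy per pixel at the original scale and after each 2x downscaling."""
    img = make_sparse_image(size, density, seed)
    entropies = []
    for _ in range(levels):
        entropies.append(marginal_entropy(img))
        img = downscale(img)
    return entropies


if __name__ == "__main__":
    for level, h in enumerate(entropy_across_scales()):
        print(f"scale level {level}: {h:.3f} bits per pixel")
```

With the default parameters, the per-pixel entropy rises from one level to the next, because pooling pushes the fraction of "on" pixels from about 0.02 toward 0.5. Continued pooling would eventually saturate the image and the entropy would fall again, so the sketch shows only that entropy depends on scale, not a general law.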



Additional Information

Ying Nian Wu
Affiliation: Department of Statistics, University of California, Los Angeles, California
MR Author ID: 360276

Cheng-En Guo
Affiliation: Acuity Technologies, Menlo Park, California

Song-Chun Zhu
Affiliation: Departments of Statistics and Computer Science, University of California, Los Angeles, California
MR Author ID: 712282

Received by editor(s): January 20, 2007
Article copyright: © Copyright 2007 Brown University
The copyright for this article reverts to public domain 28 years after publication.