Quarterly of Applied Mathematics

Online ISSN 1552-4485; Print ISSN 0033-569X



From information scaling of natural images to regimes of statistical models

Authors: Ying Nian Wu, Cheng-En Guo and Song-Chun Zhu
Journal: Quart. Appl. Math. 66 (2008), 81-122
MSC (2000): Primary 62M40
DOI: https://doi.org/10.1090/S0033-569X-07-01063-2
Published electronically: December 5, 2007
MathSciNet review: 2396653

Abstract: Vision can be considered a highly specialized data collection and analysis problem. We need to understand the special properties of natural image data in order to construct statistical models and develop statistical methods for representing and recognizing the wide variety of natural image patterns. One fundamental property of natural image data that distinguishes vision from other sensory tasks such as speech recognition is that scale plays a profound role in image formation and interpretation. Specifically, visual objects can appear at a wide range of scales in the images due to the change of viewing distance as well as camera resolution. The same objects appearing at different scales produce different image data with different statistical properties. In particular, we show that the entropy rate of the image data changes over scale. Moreover, the inferential uncertainty changes over scale too. We call these changes information scaling. We then examine both empirically and theoretically two prominent and yet largely isolated classes of image models, namely, wavelet sparse coding models and Markov random field models. Our results indicate that the two classes of models are appropriate for two different entropy regimes: sparse coding targets low entropy regimes, whereas Markov random fields are appropriate for high entropy regimes. Because information scaling connects different entropy regimes, both sparse coding and Markov random fields are necessary for representing natural image data, and information scaling triggers transitions between these two regimes. This motivates us to propose a modeling scheme that embraces both regimes of models in a common framework. The contribution of our work is two-fold. First, the study of information scaling provides a unifying perspective for the rich variety of natural image patterns. Second, the modeling scheme that we develop provides a natural integration of different regimes of image models.
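
As a rough illustration of the statement that image statistics change over scale, the following Python sketch downscales a synthetic image and reports an empirical per-pixel entropy at each scale. It is not the authors' experiment. The toy scene (sparse bright dots), the block-sum downscaling, the chosen scale factors, and the use of the marginal pixel-value entropy as a crude stand-in for the entropy rate are all assumptions made here for illustration, and the function names are hypothetical.

```python
import numpy as np

def block_sum(img, s):
    """Downscale by summing non-overlapping s x s blocks.

    This equals the block mean up to the constant factor s*s, so the
    entropy of the resulting values is the same as for block averaging.
    """
    h, w = img.shape
    h, w = h - h % s, w - w % s  # crop so both sides are divisible by s
    return img[:h, :w].reshape(h // s, s, w // s, s).sum(axis=(1, 3))

def marginal_entropy_bits(img):
    """Empirical entropy (bits per pixel) of the pixel-value histogram.

    Assumption: this ignores spatial dependence, so it is only a proxy
    for the entropy rate discussed in the abstract.
    """
    _, counts = np.unique(img, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(0)
# Hypothetical toy "scene": sparse bright dots on a dark background.
scene = (rng.random((512, 512)) < 0.01).astype(np.int64)

# Entropy per pixel after viewing the same scene at coarser resolutions.
entropies = {s: marginal_entropy_bits(block_sum(scene, s)) for s in (1, 2, 4, 8)}
for s, h in entropies.items():
    print(f"downscale factor {s}: {h:.3f} bits/pixel")
```

In this toy setting the per-pixel entropy grows as the scene is viewed at coarser resolution, because each coarse pixel pools more of the sparse elements. Real images and a proper entropy-rate estimate would need a more careful treatment than this sketch provides.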

Additional Information

Ying Nian Wu
Affiliation: Department of Statistics, University of California, Los Angeles, California

Cheng-En Guo
Affiliation: Acuity Technologies, Menlo Park, California

Song-Chun Zhu
Affiliation: Departments of Statistics and Computer Science, University of California, Los Angeles, California

Received by editor(s): January 20, 2007
Article copyright: © Copyright 2007 Brown University
The copyright for this article reverts to public domain 28 years after publication.

American Mathematical Society