Quarterly of Applied Mathematics

Online ISSN 1552-4485; Print ISSN 0033-569X

From information scaling of natural images to regimes of statistical models


Authors: Ying Nian Wu, Cheng-En Guo and Song-Chun Zhu
Journal: Quart. Appl. Math. 66 (2008), 81-122
MSC (2000): Primary 62M40
DOI: https://doi.org/10.1090/S0033-569X-07-01063-2
Published electronically: December 5, 2007
MathSciNet review: 2396653

Abstract: Vision can be considered a highly specialized data collection and analysis problem. We need to understand the special properties of natural image data in order to construct statistical models and develop statistical methods for representing and recognizing the wide variety of natural image patterns. One fundamental property of natural image data that distinguishes vision from other sensory tasks such as speech recognition is that scale plays a profound role in image formation and interpretation. Specifically, visual objects can appear at a wide range of scales in the images due to the change of viewing distance as well as camera resolution. The same objects appearing at different scales produce different image data with different statistical properties. In particular, we show that the entropy rate of the image data changes over scale. Moreover, the inferential uncertainty changes over scale too. We call these changes information scaling. We then examine both empirically and theoretically two prominent and yet largely isolated classes of image models, namely, wavelet sparse coding models and Markov random field models. Our results indicate that the two classes of models are appropriate for two different entropy regimes: sparse coding targets low entropy regimes, whereas Markov random fields are appropriate for high entropy regimes. Because information scaling connects different entropy regimes, both sparse coding and Markov random fields are necessary for representing natural image data, and information scaling triggers transitions between these two regimes. This motivates us to propose a modeling scheme that embraces both regimes of models in a common framework. The contribution of our work is two-fold. First, the study of information scaling provides a unifying perspective for the rich variety of natural image patterns. Second, the modeling scheme that we develop provides a natural integration of different regimes of image models.
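The claim that the entropy rate changes over scale can be illustrated with a toy computation. The sketch below is not the authors' experiment: it assumes a synthetic "image" of independent sparse dots (density 0.01, an arbitrary choice) and models a larger viewing distance by averaging over non-overlapping s-by-s blocks. The helper names `block_average` and `per_pixel_entropy` are hypothetical. Because the pixels are independent in this toy setting, the entropy of the pixel histogram coincides with the entropy rate per pixel.

```python
import numpy as np

def block_average(img, s):
    """Downscale by averaging non-overlapping s-by-s blocks (toy model of zooming out)."""
    h, w = img.shape
    h, w = h - h % s, w - w % s  # crop so both sides are divisible by s
    return img[:h, :w].reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def per_pixel_entropy(img):
    """Empirical entropy (in bits) of the histogram of pixel values."""
    _, counts = np.unique(img, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Assumption: i.i.d. sparse dots stand in for a low-entropy, "sparse" fine-scale image.
rng = np.random.default_rng(0)
fine = (rng.random((512, 512)) < 0.01).astype(float)

# Per-pixel entropy at the original scale and after zooming out by factors 4 and 16.
entropies = {s: per_pixel_entropy(block_average(fine, s)) for s in (1, 4, 16)}
for s, h in entropies.items():
    print(f"scale factor {s:2d}: {h:.3f} bits per pixel")
```

In this toy setting the per-pixel entropy grows as the image is viewed at coarser scales: each coarse pixel summarizes many fine-scale dots, so the image drifts from a sparse, low entropy regime toward a dense, texture-like high entropy regime.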


Additional Information

Ying Nian Wu
Affiliation: Department of Statistics, University of California, Los Angeles, California

Cheng-En Guo
Affiliation: Acuity Technologies, Menlo Park, California

Song-Chun Zhu
Affiliation: Departments of Statistics and Computer Science, University of California, Los Angeles, California

Received by editor(s): January 20, 2007
Article copyright: © Copyright 2007 Brown University
The copyright for this article reverts to public domain 28 years after publication.

American Mathematical Society