Joint Optimization for Object Class Segmentation and Dense Stereo Reconstruction

International Journal of Computer Vision

Abstract

The problems of dense stereo reconstruction and object class segmentation can both be formulated as random field labeling problems, in which every pixel in the image is assigned a label corresponding to either its disparity or an object class such as road or building. While these two problems are mutually informative, no attempt has been made to jointly optimize their labelings. In this work we provide a flexible framework, configured via cross-validation, that unifies the two problems. We demonstrate that joint optimization substantially improves performance by resolving ambiguities that would be present in real-world data if the two problems were considered separately. To evaluate our method, we augment the Leuven data set (http://cms.brookes.ac.uk/research/visiongroup/files/Leuven.zip), a stereo video shot from a car driving around the streets of Leuven, with 70 hand-labeled object class and disparity maps. We hope that the release of these annotations will stimulate further work in the challenging domain of street-view analysis. Complete source code is publicly available (http://cms.brookes.ac.uk/staff/Philip-Torr/ale.htm).
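To make the idea of a joint labeling concrete, the following is a minimal toy sketch and not the energy or the inference method used in the paper. It assumes a three-pixel scanline, a hypothetical label set of two classes and three disparities, and invented cost tables. Each pixel receives a (class, disparity) pair, and a coupling table `JOINT_COST` penalizes implausible pairs such as sky with non-zero disparity. Exhaustive search stands in for an efficient solver, which is only feasible because the problem is tiny.

```python
from itertools import product

# Hypothetical label sets; the disparity label index equals its value here.
CLASSES = ("sky", "building")
DISPARITIES = (0, 1, 2)

# Invented per-pixel costs for a 3-pixel scanline (lower is better).
UNARY_CLASS = [[0.0, 3.0], [0.0, 3.0], [3.0, 0.0]]
# Pixel 0 is textureless (ambiguous); pixel 1 weakly prefers disparity 1.
UNARY_DISP = [[1.0, 1.0, 1.0], [1.0, 0.5, 1.0], [2.0, 2.0, 0.0]]
# Coupling: sky should sit at disparity 0, building should not.
JOINT_COST = [[0.0, 2.0, 2.0], [1.0, 0.0, 0.0]]
NO_COUPLING = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]


def joint_energy(classes, disps, unary_class, unary_disp, joint_cost,
                 lam_class=0.5, lam_disp=0.25, trunc=2):
    """Energy of one joint labeling of a 1-D chain of pixels."""
    energy = 0.0
    for i in range(len(classes)):
        energy += unary_class[i][classes[i]]
        energy += unary_disp[i][disps[i]]
        energy += joint_cost[classes[i]][disps[i]]
    for i in range(len(classes) - 1):
        # Potts smoothness on classes, truncated linear on disparities.
        energy += lam_class * (classes[i] != classes[i + 1])
        energy += lam_disp * min(abs(disps[i] - disps[i + 1]), trunc)
    return energy


def brute_force_minimize(unary_class, unary_disp, joint_cost):
    """Exhaustively search all joint labelings (toy sizes only)."""
    n = len(unary_class)
    best = None
    for classes in product(range(len(CLASSES)), repeat=n):
        for disps in product(range(len(DISPARITIES)), repeat=n):
            e = joint_energy(classes, disps, unary_class, unary_disp,
                             joint_cost)
            if best is None or e < best[0]:
                best = (e, classes, disps)
    return best


if __name__ == "__main__":
    print("joint:      ", brute_force_minimize(UNARY_CLASS, UNARY_DISP, JOINT_COST))
    print("independent:", brute_force_minimize(UNARY_CLASS, UNARY_DISP, NO_COUPLING))
```

With the coupling switched off (`NO_COUPLING`), the two labelings decouple and the weak stereo evidence places the sky pixels at disparity 1. With the coupling on, the class evidence pulls them to disparity 0. This is the kind of ambiguity the abstract refers to, shown here only on made-up numbers.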

Author information

Corresponding author

Correspondence to Lubor Ladický.

About this article

Cite this article

Ladický, L., Sturgess, P., Russell, C. et al. Joint Optimization for Object Class Segmentation and Dense Stereo Reconstruction. Int J Comput Vis 100, 122–133 (2012). https://doi.org/10.1007/s11263-011-0489-0

  • DOI: https://doi.org/10.1007/s11263-011-0489-0
