Joint Optimization for Object Class Segmentation and Dense Stereo Reconstruction

Ladický, Lubor; Sturgess, Paul; Russell, Chris; Sengupta, Sunando; Bastanlar, Yalin; Clocksin, William; Torr, Philip H. S.

doi:10.1007/s11263-011-0489-0

Published: 07 September 2011

International Journal of Computer Vision, Volume 100, pages 122–133 (2012)

Abstract

The problems of dense stereo reconstruction and object class segmentation can both be formulated as random field labeling problems, in which every pixel in the image is assigned a label corresponding either to its disparity or to an object class such as road or building. Although these two problems are mutually informative, no attempt has been made to optimize their labelings jointly. In this work we provide a flexible framework, configured via cross-validation, that unifies the two problems. We demonstrate that joint optimization substantially improves performance, because it resolves ambiguities that arise in real-world data when the two problems are considered separately. To evaluate our method, we augment the Leuven data set (http://cms.brookes.ac.uk/research/visiongroup/files/Leuven.zip), a stereo video shot from a car driving around the streets of Leuven, with 70 hand-labeled object class and disparity maps. We hope that the release of these annotations will stimulate further work in the challenging domain of street-view analysis. Complete source code is publicly available (http://cms.brookes.ac.uk/staff/Philip-Torr/ale.htm).
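
To make the idea of a joint labeling concrete, the following is a minimal sketch and not the authors' formulation or code. It assumes a 1-D scanline in which every pixel takes a pair (class, disparity), a hypothetical `coupling` table that penalizes unlikely class/disparity combinations, a Potts smoothness term on classes, and a truncated linear term on disparities. Iterated conditional modes (ICM) is swapped in as the optimizer purely for brevity; the abstract does not say which optimizer the paper uses. All function names, weights, and cost tables are illustrative assumptions.

```python
from itertools import product


def independent_labels(unary_cls, unary_disp):
    """Label each pixel by minimizing its two unary costs separately."""
    labels = []
    for cls_costs, disp_costs in zip(unary_cls, unary_disp):
        c = min(range(len(cls_costs)), key=cls_costs.__getitem__)
        d = min(range(len(disp_costs)), key=disp_costs.__getitem__)
        labels.append((c, d))
    return labels


def joint_energy(labels, unary_cls, unary_disp, coupling,
                 w_cls=1.0, w_disp=1.0, trunc=2):
    """Energy of a joint labeling; labels[i] = (class, disparity)."""
    energy = 0.0
    for i, (c, d) in enumerate(labels):
        # Per-pixel costs plus the (hypothetical) class/disparity coupling.
        energy += unary_cls[i][c] + unary_disp[i][d] + coupling[c][d]
    for (c0, d0), (c1, d1) in zip(labels, labels[1:]):
        energy += w_cls * (c0 != c1)                  # Potts term on classes
        energy += w_disp * min(abs(d0 - d1), trunc)   # truncated linear term
    return energy


def icm_joint(unary_cls, unary_disp, coupling,
              w_cls=1.0, w_disp=1.0, trunc=2, sweeps=10):
    """Greedy coordinate descent over joint (class, disparity) labels."""
    n_cls, n_disp = len(coupling), len(coupling[0])
    labels = independent_labels(unary_cls, unary_disp)
    best = joint_energy(labels, unary_cls, unary_disp, coupling,
                        w_cls, w_disp, trunc)
    for _ in range(sweeps):
        changed = False
        for i in range(len(labels)):
            for cand in product(range(n_cls), range(n_disp)):
                trial = labels[:i] + [cand] + labels[i + 1:]
                e = joint_energy(trial, unary_cls, unary_disp, coupling,
                                 w_cls, w_disp, trunc)
                if e < best:  # accept only strict improvements
                    labels, best, changed = trial, e, True
        if not changed:
            break  # local minimum reached
    return labels, best
```

Because ICM accepts only strict improvements, the returned energy is never higher than that of the independent per-pixel labeling it starts from, but it is only a local minimum of this toy energy.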

Author information

Authors and Affiliations

University of Oxford, Oxford, UK
Lubor Ladický
Oxford Brookes University, Oxford, UK
Paul Sturgess, Sunando Sengupta & Philip H. S. Torr
Queen Mary College, University of London, London, UK
Chris Russell
Izmir Institute of Technology, Izmir, Turkey
Yalin Bastanlar
University of Hertfordshire, Hatfield, UK
William Clocksin

Corresponding author

Correspondence to Lubor Ladický.

About this article

Cite this article

Ladický, L., Sturgess, P., Russell, C. et al. Joint Optimization for Object Class Segmentation and Dense Stereo Reconstruction. Int J Comput Vis 100, 122–133 (2012). https://doi.org/10.1007/s11263-011-0489-0

Received: 22 December 2010
Accepted: 01 August 2011
Published: 07 September 2011
Issue Date: November 2012
DOI: https://doi.org/10.1007/s11263-011-0489-0
