A Technique to Generate Depth Maps from Real Scenes without Manual Calibration

Authors

Keywords:

stereo vision, disparity map, calibration, visual impairment, blindness

Abstract

This paper proposes a technique for generating a disparity map from a real scene captured by a stereo vision system. The underlying motivation is to develop a system that does not require a calibration pattern, whose use usually involves manual intervention. This is a highly desirable feature for the design of assistive devices for people with severe visual impairment or blindness. Experimental results showed that the developed technique achieves a level of effectiveness similar to that of two other well-established techniques from the literature, making it a promising alternative in situations where the calibration step becomes a burden to the user.

Author Biographies

Ricardo S. Casado, UFSCar

Ricardo S. Casado holds a degree in Computer Engineering from Centro Universitário de Votuporanga (2004) and a Master's degree in Electrical Engineering, with a focus on Signal Processing and Instrumentation and a specialization in Computer Vision, from the University of São Paulo, São Carlos campus (EESC/USP). He is a PhD student in the Department of Computer Science at the Federal University of São Carlos and a professor at the Federal Institute of São Paulo (Votuporanga-SP). He has experience in Camera Calibration with Genetic Programming and in Depth Estimation using Deep Learning and Computer Vision. (e-mail: rscasado@ifsp.edu.br)

Carlos W. Carvalho, UFSCar

Carlos W. Carvalho holds a Bachelor's degree in Computer Science from Universidade Estadual Paulista - UNESP (2013), and a Master's degree in Computer Science from Federal University of São Carlos - UFSCar (2017), with emphasis on Digital Image Processing, Stereo Vision, and Computer Graphics. (e-mail: carloswilldecarvalho@outlook.com)

Marcio M. Fernandes, UFSCar

Marcio M. Fernandes holds degrees in Computer Science from the University of São Paulo (undergraduate, 1989), the Federal University of São Carlos (master’s, 1993), and The University of Edinburgh, UK (PhD, 1999). He has been a professor in undergraduate and postgraduate courses at UFSCar since 2008, participating in research projects that have resulted in the publication of several scientific articles. (e-mail: marcio@dc.ufscar.br)

Emerson C. Pedrino, UFSCar

Emerson C. Pedrino is an Associate Professor (Dr.) at the Department of Computer Science, Federal University of São Carlos. He holds a degree in Electrical Engineering (EESC, 2016) and a Bachelor’s degree in Computational Physics (IFSC, 2000), both from the University of São Paulo, as well as a Master’s degree (EESC, 2003) and a Ph.D. (EESC, 2008) in Electrical Engineering from the same university. He also completed a post-doctorate in Electronic Engineering (as a Visiting Professor) at the Department of Electronic Engineering, University of York, England, with research funding provided by FAPESP (2018-2019), and continues to collaborate in the development of hardware applications involving intelligent systems. (e-mail: emerson@dc.ufscar.br)

Published

2024-04-13

How to Cite

Casado, R. S., Carvalho, C. W., Fernandes, M. M., & Pedrino, E. C. (2024). A Technique to Generate Depth Maps from Real Scenes without Manual Calibration. IEEE Latin America Transactions, 22(5), 387–393. Retrieved from https://latamt.ieeer9.org/index.php/transactions/article/view/8653