Jaccard distance as similarity measure for disparity map estimation
Keywords:
Stereoscopic vision, disparity mapping, Jaccard, image processing
Abstract
High confidence in disparity map estimation is critical in several application fields. This work presents a novel framework that combines customized local binary patterns and the Jaccard distance for stereo matching with stereo consistency checks. The method yields greater confidence in its estimates, does not depend on supervised learning, and can generate a dense map with low-cost filtering. The framework has been implemented on both CPU and GPU to exploit parallel processing. First, local binary patterns are computed in the initial stage; then, the Jaccard distance is used as the similarity measure in the stereo matching stage; finally, a matching consistency check is performed and singular disparities are removed. The proposed framework is compared with novel and state-of-the-art algorithms for sparse disparity map estimation on the Middlebury and KITTI stereo datasets. The quality criteria are the percentage of bad pixels (B), the number of invalid pixels, and the processing time, with the running environment reported to put each framework into context. The framework achieves as low as 2.07% bad matching pixels and performs better than state-of-the-art cost functions.
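The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the plain 3x3 local binary pattern stands in for the customized patterns (which the abstract does not specify), matching is done per pixel without cost aggregation, and the function names `lbp_codes`, `jaccard_distance`, `best_disparity`, and `disparity_row`, as well as the `max_disp` and `tol` parameters, are hypothetical.

```python
def lbp_codes(img):
    """Plain 3x3 LBP (an assumption; the paper uses customized patterns).

    Bit k of a pixel's code is set when neighbour k is >= the centre value.
    Border pixels keep code 0.
    """
    h, w = len(img), len(img[0])
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            centre = img[y][x]
            code = 0
            for k, (dy, dx) in enumerate(offsets):
                if img[y + dy][x + dx] >= centre:
                    code |= 1 << k
            codes[y][x] = code
    return codes


def jaccard_distance(a, b):
    """Jaccard distance between two bit patterns stored as integers.

    Set bits are treated as set members: 1 - |A & B| / |A | B|.
    Two empty patterns are defined here to have distance 0.
    """
    union = bin(a | b).count("1")
    if union == 0:
        return 0.0
    intersection = bin(a & b).count("1")
    return 1.0 - intersection / union


def best_disparity(ref_row, tgt_row, x, max_disp, sign):
    """Winner-takes-all search: the d in [0, max_disp] with the lowest
    Jaccard distance between ref_row[x] and tgt_row[x + sign * d]."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        xt = x + sign * d
        if 0 <= xt < len(tgt_row):
            cost = jaccard_distance(ref_row[x], tgt_row[xt])
            if cost < best_cost:
                best_d, best_cost = d, cost
    return best_d


def disparity_row(left_row, right_row, max_disp, tol=0):
    """Match one scanline of codes in both directions and keep only the
    disparities that pass the left-right consistency check.

    Rejected pixels are marked None, which leaves a sparse result.
    """
    n = len(left_row)
    # Left pixel x corresponds to right pixel x - d.
    d_left = [best_disparity(left_row, right_row, x, max_disp, -1)
              for x in range(n)]
    # Right pixel x corresponds to left pixel x + d.
    d_right = [best_disparity(right_row, left_row, x, max_disp, +1)
               for x in range(n)]
    result = []
    for x in range(n):
        d = d_left[x]
        consistent = abs(d - d_right[x - d]) <= tol
        result.append(d if consistent else None)
    return result
```

In this sketch the consistency check is what turns the winner-takes-all result into a sparse map: a pixel whose left-to-right and right-to-left disparities disagree is marked invalid rather than guessed. A practical version would likely compare patterns aggregated over a window instead of single codes.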