SBIN: A stereo disparity estimation network using binary convolutions
Keywords: stereo vision, computer vision, embedded devices, binary networks
Although recent advances in convolutional networks are outstanding, they depend heavily on extensive computational power, which limits their areas of application. This is the case for stereo disparity estimation, where current solutions can barely run on embedded devices. This work shows that it is possible to binarize an end-to-end stereo disparity network, which can be considered a step towards lightweight and potentially faster disparity estimation networks. The validity of the proposed approach is demonstrated through experiments on two well-known datasets, SceneFlow and KITTI 2012. The results show that a binary disparity model is feasible, but at a cost in accuracy: an end-point error (EPE) of 5.14 and 2.09 is achieved on SceneFlow and KITTI 2012, respectively.
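For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an end-point error is commonly computed for disparity maps: the mean absolute difference between predicted and ground-truth disparity over the pixels that have valid ground truth. The function name `end_point_error`, the flattened-list input format, and the optional `valid` mask are assumptions made for illustration; they are not taken from the paper's implementation.

```python
from typing import Optional, Sequence


def end_point_error(
    pred: Sequence[float],
    target: Sequence[float],
    valid: Optional[Sequence[bool]] = None,
) -> float:
    """Mean absolute disparity error over valid pixels.

    `pred` and `target` are flattened disparity maps of equal length.
    `valid` (hypothetical helper argument) marks pixels that have ground
    truth; sparse datasets only annotate a subset of pixels.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    if valid is None:
        valid = [True] * len(target)
    if len(valid) != len(target):
        raise ValueError("valid mask must match the disparity maps")

    # Keep the absolute error only where ground truth is available.
    errors = [abs(p - t) for p, t, v in zip(pred, target, valid) if v]
    if not errors:
        raise ValueError("no valid pixels to evaluate")
    return sum(errors) / len(errors)
```

As a usage example, `end_point_error([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])` averages the absolute errors 0, 1, and 2, while passing `valid=[True, True, False]` excludes the last pixel from the average.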