MSFYOLO: Feature fusion-based detection for small objects
Keywords:
Object detection, Feature extraction network, Feature pyramid, Multi-scale feature fusion
Abstract
Object detection algorithms currently perform poorly on small objects, mainly because low-level network layers lack semantic information and small objects carry very little feature information. To address these difficulties, this paper proposes a small object detection algorithm based on multi-scale feature fusion. By learning shallow features in the shallow layers and deep features in the deep layers, the proposed multi-scale feature learning scheme focuses on fusing concrete features with abstract features. An object detector (MSFYOLO) is built on a multi-scale deep feature learning network and takes into account the relationship between a single object and its local environment. To combine global information with local information, the feature pyramid is constructed by fusing feature layers of different depths in the network. In addition, this paper proposes a new feature extraction network (CourNet); feature visualization comparisons with mainstream backbone networks indicate that CourNet expresses small object feature information better. The proposed algorithm is evaluated on MS COCO and achieves leading performance, with an 11.7% improvement in FPS, a 17.0% improvement in AP, an 81.0% improvement in ARS, and a 23.3% reduction in computational FLOPs compared to YOLOv3. This study shows that combining global and local information helps express and detect small objects under different illumination conditions. MSFYOLO uses CourNet as its backbone network, which is highly efficient and strikes a good balance between accuracy and speed.
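The idea of building a feature pyramid by fusing feature layers of different depths can be illustrated with a minimal sketch. The abstract does not specify the fusion operator used in MSFYOLO, so the code below assumes a generic top-down scheme (nearest-neighbour 2x upsampling followed by element-wise addition) and uses plain nested lists as stand-ins for single-channel feature maps. The names `upsample_nearest_2x`, `fuse`, `build_pyramid`, `c3`, `c4`, and `c5` are hypothetical and do not come from the paper.

```python
# Hypothetical sketch of top-down multi-scale feature fusion.
# Assumption: fusion = 2x nearest-neighbour upsampling + element-wise sum.
# This is NOT the MSFYOLO implementation, only an illustration of the idea.

def upsample_nearest_2x(fmap):
    """Double height and width by repeating each value."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse(shallow, deep):
    """Add the upsampled deep (abstract) map to the shallow (concrete) map."""
    up = upsample_nearest_2x(deep)
    if len(up) != len(shallow) or len(up[0]) != len(shallow[0]):
        raise ValueError("shallow map must be exactly 2x the size of the deep map")
    return [[s + u for s, u in zip(s_row, u_row)]
            for s_row, u_row in zip(shallow, up)]


def build_pyramid(levels):
    """levels: feature maps ordered from shallow (largest) to deep (smallest).

    Returns fused maps in the same order; the deepest map is passed through.
    """
    fused = [levels[-1]]
    for shallow in reversed(levels[:-1]):
        fused.insert(0, fuse(shallow, fused[0]))
    return fused


# Toy feature maps: 4x4 (shallow), 2x2 (middle), 1x1 (deep).
c3 = [[1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1]]
c4 = [[10, 20],
      [30, 40]]
c5 = [[100]]

p3, p4, p5 = build_pyramid([c3, c4, c5])
print(p4)  # [[110, 120], [130, 140]]
```

In this toy setting the fused shallow map `p3` keeps its fine spatial detail (the diagonal pattern of `c3`) while every position also receives the deeper, more global signal, which is the intuition behind combining local and global information for small objects.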