Semantic segmentation of track image based on deep neural network

doi:10.19682/j.cnki.1005-8885.2020.0023

The Journal of China Universities of Posts and Telecommunications ›› 2020, Vol. 27 ›› Issue (5): 23-33. doi: 10.19682/j.cnki.1005-8885.2020.0023

• Artificial Intelligence •

Wang Zhaoying, Zhou Junhua, Liao Zhonghua, Zhai Xiang, Zhang Lianping

Beijing University of Posts and Telecommunications; Beijing Simulation Center; Beijing Institute of Electronic System Engineering; Alibaba Cloud Computing

Received: 2019-11-29 Revised: 2020-04-10 Online: 2020-10-22 Published: 2020-10-23
Contact: Wang Zhaoying E-mail: wangzhaoying@bupt.edu.cn
Supported by:
the National Natural Science Foundation of China; the Key Special Project in Intergovernmental International Scientific and Technological Innovation Cooperation of the National Key Research and Development Program of China

Abstract

Abstract: In this paper, deep learning is used to solve the railway track recognition problem in intrusion detection. Railway track recognition can be viewed as a semantic segmentation task, which extends image processing to pixel-level prediction. DeepLabv3+, an encoder-decoder architecture, is adopted because of its good performance on semantic segmentation tasks. Because the experimental dataset consists of railway track images collected from video surveillance in the train cab, the following improvements are made to the model. First, to address the over-fitting caused by the limited amount of training data, data augmentation and transfer learning are applied to enrich the diversity of the data and enhance model robustness during training. Second, different gradient descent methods are compared to select the best optimizer for training the model parameters. Third, to address the serious imbalance between positive and negative samples, cross entropy (CE) loss is replaced by focal loss (FL). The effectiveness of the improved DeepLabv3+ model with these solutions is demonstrated by experimental results under different system parameters.

Key words: railway track recognition, convolutional neural networks, semantic segmentation, DeepLabv3+
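
To illustrate the loss substitution described in the abstract, the following minimal sketch contrasts per-pixel cross entropy with the binary focal loss as commonly defined (Lin et al., 2017), FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). It is not the authors' implementation. The function names, the toy probabilities, the treatment of track pixels as the positive class, and the values gamma = 2 and alpha = 0.25 are assumptions chosen for illustration; the abstract does not state the settings used in the paper.

```python
import math


def binary_cross_entropy(p, y, eps=1e-7):
    """Cross entropy for one pixel; p is the predicted positive-class probability, y is 0 or 1."""
    p = min(max(p, eps), 1.0 - eps)        # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p         # probability assigned to the true class
    return -math.log(p_t)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one pixel; gamma and alpha are assumed values, not taken from the paper."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of pixels that are already classified well
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_loss(loss_fn, probs, labels, **kwargs):
    """Average a per-pixel loss over a flattened image."""
    return sum(loss_fn(p, y, **kwargs) for p, y in zip(probs, labels)) / len(probs)


if __name__ == "__main__":
    # Hypothetical toy "image": four easy negative pixels and one hard positive pixel.
    probs = [0.05, 0.10, 0.02, 0.08, 0.30]
    labels = [0, 0, 0, 0, 1]
    print("mean CE:", mean_loss(binary_cross_entropy, probs, labels))
    print("mean FL:", mean_loss(binary_focal_loss, probs, labels))
```

Setting gamma = 0 and alpha = 0.5 reduces the focal loss to half of the cross entropy, which is a convenient sanity check. Larger gamma values shift more of the total loss onto hard, misclassified pixels, which is the property that makes the loss attractive when one class dominates the image.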

CLC Number: TP391.4

Wang Zhaoying, Zhou Junhua, Liao Zhonghua, Zhai Xiang, Zhang Lianping. Semantic segmentation of track image based on deep neural network[J]. The Journal of China Universities of Posts and Telecommunications, 2020, 27(5): 23-33.
