Vehicle-following system based on deep reinforcement learning in marine scene

The Journal of China Universities of Posts and Telecommunications, 2022, Vol. 29, Issue 5: 10-20. doi: 10.19682/j.cnki.1005-8885.2022.0025

Special Topic: Artificial Intelligence of Things

Zhang Xin, Lou Haoran, Jiang Li, Xiao Qianhao, Cai Zhuwen

Xi'an University of Posts and Telecommunications

Received: 2022-04-19 Revised: 2022-09-06 Online: 2022-10-31 Published: 2022-10-28
Contact: Lou Haoran, E-mail: 871025829@qq.com
Supported by: Remote analysis and monitoring system technology development of vibration parameters and energy efficiency index of power shafting; Research on torque dynamic measurement and accuracy verification technology

Abstract

Data collected for the vehicle-following task in marine scenes contain too few types of features, which leads to long model convergence times and high training difficulty. To address this, a two-stage vehicle-following system is proposed. First, a semantic segmentation model predicts the number of pixels belonging to the followed target, and this pixel count is mapped to a position feature. Second, a deep reinforcement learning algorithm enables the control equipment to choose actions that keep the two moving objects within a safe distance. Experimental results show that the two-stage system converges 40% faster than the model without the position feature, and that adding the position feature significantly improves following stability.

Key words: vehicle-following, semantic segmentation, reinforcement learning, double deep Q-network (DDQN)
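The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the target class label `1`, the discount factor, and in particular the mapping from pixel count to position feature (here simply the fraction of the image covered by the target) are assumptions made for illustration, since the abstract does not specify them. The second part shows the standard double deep Q-network (DDQN) target, in which the online network selects the next action and the target network evaluates it.

```python
import numpy as np


def target_pixel_count(mask, target_class=1):
    """Stage 1a: count pixels labelled as the followed target.

    `mask` is a 2-D array of per-pixel class labels, as a semantic
    segmentation model would output. The label value 1 is an assumption.
    """
    return int(np.sum(np.asarray(mask) == target_class))


def position_feature(pixel_count, image_size):
    """Stage 1b (hypothetical mapping): fraction of the image covered
    by the target. A larger fraction suggests a closer target."""
    return pixel_count / float(image_size)


def ddqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Stage 2: Double DQN learning targets for a batch of transitions.

    The online network picks the greedy next action; the target network
    supplies the value of that action. Terminal transitions (done = 1)
    receive no bootstrap term.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)
    q_online_next = np.asarray(q_online_next, dtype=np.float64)
    q_target_next = np.asarray(q_target_next, dtype=np.float64)

    best_actions = np.argmax(q_online_next, axis=1)
    batch_index = np.arange(len(best_actions))
    next_values = q_target_next[batch_index, best_actions]
    return rewards + gamma * (1.0 - dones) * next_values


# Toy usage: a 4x4 mask whose central 2x2 block is the followed target.
mask = np.zeros((4, 4), dtype=np.int64)
mask[1:3, 1:3] = 1
count = target_pixel_count(mask)
feature = position_feature(count, mask.size)
```

In such a setup the position feature would be appended to the agent's state so that the Q-network does not have to infer distance from raw pixels alone, which is consistent with the faster convergence the abstract reports.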

Zhang Xin, Lou Haoran, Jiang Li, Xiao Qianhao, Cai Zhuwen. Vehicle-following system based on deep reinforcement learning in marine scene[J]. The Journal of China Universities of Posts and Telecommunications, 2022, 29(5): 10-20.
