Journal of Automotive Safety and Energy ›› 2023, Vol. 14 ›› Issue (4): 472-479. DOI: 10.3969/j.issn.1674-8484.2023.04.009
Received: 2023-01-19
Revised: 2023-04-11
Online: 2023-08-31
Published: 2023-08-31
About the author: ZHANG Xinfeng (born 1976), male (Han), from Shaanxi, associate professor. E-mail: zhxf@chd.edu.cn.
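Abstract:
This paper proposes a behavior decision-making model for autonomous vehicles based on ensemble deep reinforcement learning. Building on Markov decision process (MDP) theory, the model uses standard voting to combine three base networks: the deep Q-network (DQN), the double DQN (DDQN), and the dueling double DQN (Dueling DDQN). The model was tested, and its generalization verified, in a highway simulation environment with 3, 4, and 5 one-way lanes, over five driving behaviors: changing lanes to the left, keeping the lane, changing lanes to the right, and accelerating or decelerating within the current lane. The results show that, compared with the three base network models, the proposed model raises the decision success rate by 6%, 3%, and 6%, respectively, and also increases the average vehicle speed. In a 100-episode test, the decision time remained below 1 ms, which meets the real-time requirement for decision-making. The model therefore improves both driving safety and decision efficiency.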
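The voting step described in the abstract can be illustrated with a minimal sketch. It assumes each base network outputs one Q-value per action, that each network votes for its greedy action, and that the most-voted action wins. The abstract does not specify the action ordering or how ties are resolved, so the `ACTIONS` ordering, the function names, and the tie-break (largest summed Q-value among the tied actions) are assumptions made here for illustration only.

```python
from collections import Counter
from typing import Sequence

# The five behaviors named in the abstract; this ordering is an assumption.
ACTIONS = ("lane_left", "keep_lane", "lane_right", "accelerate", "decelerate")


def greedy_action(q_values: Sequence[float]) -> int:
    """Return the index of the largest Q-value (first index on exact ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def majority_vote(q_per_model: Sequence[Sequence[float]]) -> int:
    """Pick the action chosen by the most base models.

    q_per_model holds one Q-value vector per base model
    (e.g. DQN, DDQN, Dueling DDQN), each of length len(ACTIONS).
    """
    votes = [greedy_action(q) for q in q_per_model]
    counts = Counter(votes)
    top = max(counts.values())
    tied = [action for action, n in counts.items() if n == top]
    if len(tied) == 1:
        return tied[0]
    # Tie-break (assumption, not from the paper): among the tied actions,
    # prefer the one with the largest Q-value summed over all models.
    return max(tied, key=lambda a: sum(q[a] for q in q_per_model))


# Made-up Q-values standing in for the three network outputs.
q = [
    [0.1, 0.9, 0.2, 0.3, 0.0],  # first model prefers keep_lane
    [0.2, 0.8, 0.1, 0.4, 0.1],  # second model prefers keep_lane
    [0.1, 0.2, 0.1, 0.7, 0.3],  # third model prefers accelerate
]
print(ACTIONS[majority_vote(q)])  # keep_lane
```

With three voters and five actions, a three-way split is possible, which is why some tie-break rule is needed; the one used here is only one reasonable choice.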
张新锋, 吴琳. 基于集成深度强化学习的自动驾驶车辆行为决策模型[J]. 汽车安全与节能学报, 2023, 14(4): 472-479.
ZHANG Xinfeng, WU Lin. Behavior decision-making model for autonomous vehicles based on an ensemble deep reinforcement learning[J]. Journal of Automotive Safety and Energy, 2023, 14(4): 472-479.
| Model | M (3 lanes) | Succ / % (3 lanes) | v / (m·s⁻¹) (3 lanes) | t / μs (3 lanes) | M (5 lanes) | Succ / % (5 lanes) | v / (m·s⁻¹) (5 lanes) | t / μs (5 lanes) |
|---|---|---|---|---|---|---|---|---|
| DQN | 25 | 75 | 27.72 | 411 | 14 | 86 | 29.95 | 413 |
| DDQN | 31 | 69 | 27.86 | 412 | 20 | 80 | 29.76 | 438 |
| Dueling DDQN | 26 | 74 | 26.11 | 471 | 15 | 85 | 29.95 | 482 |
| Proposed model | 6 | 94 | 28.05 | 748 | 10 | 90 | 29.96 | 755 |
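The tabulated figures can be cross-checked with a few lines of arithmetic. The sketch below copies the values from the table and computes how many percentage points the proposed model gains in success rate over each base model in the 3-lane and 5-lane scenarios. The dictionary layout and function name are illustrative. The table does not define M; the observation that M and Succ sum to 100 in every row suggests M counts unsuccessful episodes out of 100, but that reading is an assumption.

```python
# Each entry: (M, success rate in %, mean speed in m/s, decision time in μs),
# copied from the results table.
RESULTS = {
    "3 lanes": {
        "DQN": (25, 75, 27.72, 411),
        "DDQN": (31, 69, 27.86, 412),
        "Dueling DDQN": (26, 74, 26.11, 471),
        "Proposed model": (6, 94, 28.05, 748),
    },
    "5 lanes": {
        "DQN": (14, 86, 29.95, 413),
        "DDQN": (20, 80, 29.76, 438),
        "Dueling DDQN": (15, 85, 29.95, 482),
        "Proposed model": (10, 90, 29.96, 755),
    },
}
PROPOSED = "Proposed model"


def success_gains(scenario: str) -> dict:
    """Success-rate gain (percentage points) of the proposed model per baseline."""
    rows = RESULTS[scenario]
    proposed_succ = rows[PROPOSED][1]
    return {
        name: proposed_succ - row[1]
        for name, row in rows.items()
        if name != PROPOSED
    }


for scenario in RESULTS:
    print(scenario, success_gains(scenario))

# Largest decision time in the table, to compare against the 1 ms (1000 μs) bound.
max_time_us = max(row[3] for rows in RESULTS.values() for row in rows.values())
print(max_time_us < 1000)  # True
```

The ensemble's decision time is close to twice that of any single network, yet every tabulated value stays under 1000 μs.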