Journal of Automotive Safety and Energy ›› 2023, Vol. 14 ›› Issue (4): 472-479. DOI: 10.3969/j.issn.1674-8484.2023.04.009

• Intelligent Driving and Intelligent Transportation •

Behavior decision-making model for autonomous vehicles based on ensemble deep reinforcement learning

ZHANG Xinfeng, WU Lin

  1. School of Automobile, Chang’an University, Xi’an 710064, China
  • Received: 2023-01-19 Revised: 2023-04-11 Online: 2023-08-31 Published: 2023-08-31
  • About the author: ZHANG Xinfeng (b. 1976), male (Han), Shaanxi, associate professor. E-mail: zhxf@chd.edu.cn
  • Funding:
    Key Research and Development Program of Shaanxi Province (2022GY-303); Xi’an Science and Technology Plan Project (2022GXFW0152)

Abstract:

A behavior decision-making model for autonomous vehicles based on ensemble deep reinforcement learning is proposed. The decision model is built on Markov decision process (MDP) theory and uses the standard voting method to integrate three base network models: the deep Q-learning network (DQN), the double DQN (DDQN), and the dueling double DQN (Dueling DDQN). The model was tested, and its generalization verified, in a highway simulation environment with 3, 4, and 5 lanes in one direction, over five driving behaviors: changing lanes to the left, lane keeping, changing lanes to the right, accelerating in the same lane, and decelerating in the same lane. The results show that, compared with the other three network models, the decision success rate of the proposed model increased by 6%, 3%, and 6%, respectively, and the average vehicle speed also improved. The 100-round tests took less than 1 ms, which meets the requirement for real-time decision-making. The decision-making model therefore improves driving safety and decision-making efficiency.

Key words: autonomous driving, deep reinforcement learning, ensemble learning, deep Q-network (DQN), standard voting method
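
As an illustration of the voting step named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes each of the three base networks outputs one Q-value per driving behavior, that each network votes for its greedy action, and that the most-voted action is chosen. The action encoding, the function names `greedy_action` and `vote`, and the tie-break rule (highest summed Q-value among tied actions) are all hypothetical, since the abstract does not specify them.

```python
from collections import Counter
from typing import Sequence

# The five behaviors in the order the abstract lists them.
# The numeric encoding (index = action id) is an assumption.
ACTIONS = ("lane_left", "keep_lane", "lane_right", "accelerate", "decelerate")


def greedy_action(q_values: Sequence[float]) -> int:
    """Return the index of the largest Q-value (first index wins on exact ties)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def vote(q_per_model: Sequence[Sequence[float]]) -> int:
    """Plurality vote over the greedy action of each base model."""
    votes = Counter(greedy_action(q) for q in q_per_model)
    top = max(votes.values())
    tied = sorted(a for a, n in votes.items() if n == top)
    if len(tied) == 1:
        return tied[0]
    # Assumed tie-break (not from the paper): among the tied actions,
    # pick the one with the highest Q-value summed over all models.
    return max(tied, key=lambda a: sum(q[a] for q in q_per_model))


if __name__ == "__main__":
    # Made-up Q-values standing in for DQN, DDQN, and Dueling DDQN outputs.
    q_dqn = [0.1, 0.9, 0.2, 0.3, 0.0]
    q_ddqn = [0.2, 0.4, 0.1, 0.8, 0.0]
    q_dueling = [0.0, 0.7, 0.1, 0.5, 0.2]
    print(ACTIONS[vote([q_dqn, q_ddqn, q_dueling])])
```

With three voters and five actions, a three-way split is possible, so some tie-break is needed. The summed-Q rule here is only one option, and it assumes the Q-values of the three networks are on comparable scales.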

CLC number: