
Journal of Automotive Safety and Energy ›› 2022, Vol. 13 ›› Issue (4): 750-759. DOI: 10.3969/j.issn.1674-8484.2022.04.016

• Intelligent Driving and Smart Transportation •

Highway lane-change tracking control model based on deep reinforcement learning

李文礼1, 邱凡珂1, 廖达明2, 任勇鹏1, 易帆1

  1. 重庆理工大学 汽车零部件先进制造技术教育部重点实验室,重庆 400054,中国
  2. 重庆理工清研凌创测控科技有限公司,重庆 400054,中国
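  • Received: 2022-05-12 Revised: 2022-06-24 Online: 2022-12-31 Published: 2023-01-01
  • About the author: LI Wenli (born 1983), male (Han), from Henan, associate professor. E-mail: liwenli@cqut.edu.cn
  • Supported by:
    Chongqing Graduate Research and Innovation Project (gzlcx20222128); Chongqing Banan District Special Project for the Transformation and Industrialization of Scientific and Technological Achievements (2020TJZ022); General Program of the Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX0183); Chongqing University Innovation Research Group Project (CXQT21027)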

Highway lane change decision control model based on deep reinforcement learning

LI Wenli1, QIU Fanke1, LIAO Daming2, REN Yongpeng1, YI Fan1

  1. Ministry of Education Key Laboratory of Advanced Manufacture Technology for Automobile Parts, Chongqing University of Technology, Chongqing 400054, China
  2. Chongqing University of Technology Qingyan Linktron Measurement and Control Technology Co., Ltd, Chongqing 400054, China
  • Received: 2022-05-12 Revised: 2022-06-24 Online: 2022-12-31 Published: 2023-01-01

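Abstract:

To address the problem of safe lane changing for autonomous vehicles on highways, a lane-change tracking control model based on a deep reinforcement learning algorithm is proposed and evaluated in simulation. A quintic polynomial is used to build the vehicle lane-change path model, and a tracking error function is given. A three-degree-of-freedom vehicle dynamics model is integrated with the deep reinforcement learning framework to build the lane-change path tracking control model, and the model is updated with the deep deterministic policy gradient (DDPG) algorithm. The optimal steering angle for tracking the lane-change path is learned and used to control the vehicle through the lane change. The results show that, at a vehicle speed of 100 km/h, the maximum absolute lateral position error under the proposed control is close to 0 and the maximum absolute angular deviation is 10 mrad; compared with the conventional model predictive control method, the proposed method gives smaller lateral position and angular errors in trajectory tracking. The model can therefore carry out autonomous lane changes in a high-speed environment, which is meaningful for ensuring traffic safety and alleviating traffic congestion.

Keywords: autonomous vehicles, lane-change model, path tracking, deep reinforcement learning, quintic polynomial, deep deterministic policy gradient (DDPG) algorithm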

Abstract:

A lane-change tracking control model based on a deep reinforcement learning algorithm was proposed, and simulation experiments were carried out, to address safe lane changing for autonomous vehicles on highways. A vehicle lane-change path model was built using a quintic polynomial, and tracking error functions were defined. A three-degree-of-freedom vehicle dynamics model was combined with the deep reinforcement learning framework to build the lane-change path tracking control model, which was updated by the deep deterministic policy gradient (DDPG) algorithm. The optimal steering angle for tracking the lane-change path was learned and used to control the vehicle through the lane change. The results show that, at a speed of 100 km/h, the maximum absolute lateral position error under the proposed method is close to 0 and the maximum absolute angular deviation is 10 mrad. The lateral position error and angular error of trajectory tracking are smaller than those of the traditional model predictive control method. Therefore, the model can carry out autonomous lane changes in a high-speed environment, which is meaningful for ensuring traffic safety and alleviating traffic congestion.
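As an illustration of the quintic-polynomial path and the tracking errors mentioned in the abstract, the sketch below uses the common textbook form of a quintic lane-change curve, which has zero slope and zero curvature at both ends. It is not taken from the paper: the function names, the 100 m lane-change length, the 3.75 m lane width, and the definition of the errors as plain differences at the same longitudinal position are all assumptions made for this example.

```python
import math


def quintic_lane_change(x, length, width):
    """Reference lateral offset and heading at longitudinal position x.

    Assumed textbook form: y = width * (10 s^3 - 15 s^4 + 6 s^5), s = x / length,
    which starts and ends with zero slope and zero curvature.
    """
    s = x / length
    y_ref = width * (10 * s**3 - 15 * s**4 + 6 * s**5)
    dy_dx = width / length * (30 * s**2 - 60 * s**3 + 30 * s**4)
    yaw_ref = math.atan(dy_dx)  # heading of the reference path, in rad
    return y_ref, yaw_ref


def tracking_errors(x, y, yaw, length, width):
    """Lateral and heading errors of the vehicle pose relative to the path.

    Hypothetical error definition (simple differences); the paper's own
    tracking error function is not given in the abstract.
    """
    y_ref, yaw_ref = quintic_lane_change(x, length, width)
    return y - y_ref, yaw - yaw_ref


if __name__ == "__main__":
    LENGTH, WIDTH = 100.0, 3.75  # assumed lane-change length and lane width, in m
    for x in (0.0, 25.0, 50.0, 75.0, 100.0):
        y_ref, yaw_ref = quintic_lane_change(x, LENGTH, WIDTH)
        print(f"x={x:6.1f} m  y_ref={y_ref:.3f} m  yaw_ref={yaw_ref * 1000:.1f} mrad")
```

In a reinforcement learning setup of the kind the abstract describes, errors like these would typically enter the agent's state and reward, with the steering angle as the action; how the paper actually defines them is not stated here.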

Key words: autonomous vehicles, lane change model, path tracking, deep reinforcement learning, quintic polynomials, deep deterministic policy gradient (DDPG) algorithm

CLC number: