
Journal of Automotive Safety and Energy, 2025, Vol. 16, Issue (3): 470-477. DOI: 10.3969/j.issn.1674-8484.2025.03.013

• Intelligent Driving and Smart Transportation •

Blind spot traffic strategy for intelligent connected vehicles based on deep reinforcement learning

LI Ziyuan1, LIU Qiang1,*, LI Dingli2, LI Zilong1

  1. School of Intelligent Systems Engineering, Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, China
  2. CICT Connected and Intelligent Technologies Co., Ltd, Chongqing 400041, China
  • Received: 2024-09-09 Revised: 2025-01-16 Online: 2025-06-30 Published: 2025-07-01
  • Corresponding author: LIU Qiang, professor. E-mail: liuq32@mail.sysu.edu.cn
  • About the first author: LI Ziyuan (born 1998), male (Han), from Guangdong, master's student. E-mail: lizy66@mail2.sysu.edu.cn
  • Funding:
    Chongqing Major Science and Technology Innovation R&D Project (CSTB2023TIAD-STX0030); Guangdong Key-Area Research and Development Program (2022B0701180001); Shenzhen Science and Technology Program (KJZD20240903103806009); Shenzhen Science and Technology Program (KCXFZ20240903093911016); 2025 Guangdong Special Fund for Advanced Manufacturing Development (Industrial Foundation Reengineering)

Abstract:

To prevent collisions between a vehicle passing through a visual blind spot and pedestrians emerging from it, a blind-spot passing strategy for intelligent connected vehicles (ICV) based on deep reinforcement learning was proposed. A mathematical description model was established for typical blind-spot scenarios. Taking three indicators into account (safety, efficiency, and comfort), a deep reinforcement learning model was designed based on the double deep Q-network (DDQN). The model uses the time to collision (TTC) to build a physically interpretable reward function, and its outputs are the vehicle's accelerator and brake pedal depths. Simulation experiments were conducted in three typical scenarios to verify the effectiveness of the algorithm. The results show that, compared with the traditional DQN method, the proposed method improves decision-making accuracy and increases comfort by more than 50% on average. The method therefore achieves safe, efficient, and comfortable longitudinal decision-making.

Key words: intelligent connected vehicle (ICV), deep reinforcement learning, pedestrian collision avoidance, time to collision (TTC)
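
The abstract names two building blocks, the TTC indicator and the double deep Q-network, without giving their formulas on this page. The following is a minimal sketch under the standard textbook definitions, not the authors' implementation: the function names, the discount factor `gamma`, and the treatment of a non-closing vehicle as infinite TTC are all assumptions made for illustration, and the paper's actual reward function is not reproduced here.

```python
import math


def time_to_collision(gap_m: float, closing_speed_mps: float) -> float:
    """Textbook TTC: remaining gap divided by closing speed.

    Assumption: when the vehicle is not closing in on the conflict
    point (closing speed <= 0), TTC is treated as infinite.
    """
    if closing_speed_mps <= 0.0:
        return math.inf
    return gap_m / closing_speed_mps


def double_dqn_target(reward, next_q_online, next_q_target,
                      gamma=0.99, done=False):
    """Standard Double DQN target for one transition.

    The online network's Q-values select the next action, and the
    target network's Q-values evaluate it. Both arguments are plain
    lists of Q-values for the next state (hypothetical inputs).
    """
    if done:
        return reward
    # Action selection uses the online network only.
    best_action = max(range(len(next_q_online)),
                      key=lambda a: next_q_online[a])
    # Action evaluation uses the target network only.
    return reward + gamma * next_q_target[best_action]
```

Separating action selection from action evaluation is what distinguishes Double DQN from plain DQN, which takes the maximum over the target network's own Q-values and tends to overestimate them.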
