Journal of Automotive Safety and Energy ›› 2025, Vol. 16 ›› Issue (4): 610-619. DOI: 10.3969/j.issn.1674-8484.2025.04.011

• Intelligent Driving and Smart Transportation •

End-to-end decision-making model for multi-task autonomous driving

OUYANG Delin1, QIU Yifan2, WANG Yingchen1, YANG Liang2, MIN Haigen3, WANG Wenjun4, LI Guofa1,*

  1 College of Mechanical and Vehicle Engineering, Chongqing University, Chongqing 400044, China
  2 Institute of Human Factors and Ergonomics, College of Mechatronics and Control Engineering, Shenzhen University, Shenzhen 518060, China
  3 School of Information Engineering, Chang’an University, Xi’an 710021, China
  4 School of Vehicle and Mobility, Tsinghua University, Beijing 100084, China
  • Received: 2024-12-18 Revised: 2025-02-04 Online: 2025-08-30 Published: 2025-08-27
  • Corresponding author: *LI Guofa, Professor. E-mail: liguofa@cqu.edu.cn
  • About the first author: OUYANG Delin (born 1998), male (Han), Guangdong, PhD candidate. E-mail: delin.ouyang@stu.cqu.edu.cn
  • Funding:
    Open Fund of the State Key Laboratory of Intelligent Green Vehicle and Mobility (KFZ2409); National Natural Science Foundation of China (52272421)

Abstract:

To address spatiotemporal feature processing and inter-task dependencies in autonomous driving decision-making, this paper proposes an end-to-end driving decision model based on a 3D window self-attention mechanism. The model applies window self-attention to compute the spatiotemporal features of the input sequence and combines multi-task learning with loss weight allocation to extract features from driving videos and predict vehicle speed and steering angle. The results show that the model achieves prediction accuracies of 86.32% for steering angle and 85.36% for vehicle speed, outperforming models such as FMNet, Swin-Transformer, and MobileT-DSM, at a computational cost of only 57.48 GFLOPs. This indicates stronger spatiotemporal feature extraction and a better trade-off between performance and computational cost.

Key words: autonomous driving, decision-making and control, deep learning, multi-task, attention mechanism
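
The abstract names two ingredients, 3D window self-attention and a weighted multi-task loss, without giving their details on this page. The following is only a minimal NumPy sketch of what those terms generally mean, not the authors' implementation: the window size `(2, 7, 7)`, the single attention head, the random projection matrices, and the loss weights are all illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def window_partition_3d(x, window):
    """Split a (T, H, W, C) video feature map into non-overlapping 3D windows.

    Returns an array of shape (num_windows, wt * wh * ww, C).
    """
    T, H, W, C = x.shape
    wt, wh, ww = window
    assert T % wt == 0 and H % wh == 0 and W % ww == 0, "dims must divide evenly"
    x = x.reshape(T // wt, wt, H // wh, wh, W // ww, ww, C)
    # Move the three window-index axes to the front, in-window axes after them.
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)
    return x.reshape(-1, wt * wh * ww, C)

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def window_self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product attention, computed inside each window."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def multitask_loss(loss_steer, loss_speed, w_steer=0.5, w_speed=0.5):
    """Weighted sum of the two task losses (weights are assumed, not from the paper)."""
    return w_steer * loss_steer + w_speed * loss_speed

# Illustrative input: 8 frames of 14x14 features with 16 channels (assumed sizes).
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 14, 14, 16))
windows = window_partition_3d(feat, (2, 7, 7))       # shape (16, 98, 16)
w_q, w_k, w_v = (rng.standard_normal((16, 16)) for _ in range(3))
attended = window_self_attention(windows, w_q, w_k, w_v)  # shape (16, 98, 16)
```

Restricting attention to windows keeps the score matrix at 98 × 98 per window here, rather than 1568 × 1568 over all tokens, which is the usual motivation for window-based attention when cost matters.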

CLC number: