Path planning method for leader-follower multi-vehicle formation with integrating GoT-SAC

Journal of Automotive Safety and Energy, 2026, Vol. 17, Issue (1): 122-129. DOI: 10.3969/j.issn.1674-8484.2026.01.013

WANG Yue¹,², DUAN Hongwei¹,³, ZHONG Wei², YANG Lu³,*, HE Lei², CHAI Fulai¹, SHI Xiaoyang¹

1. School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China
2. State Key Laboratory of Intelligent Green Vehicle and Mobility, Tsinghua University, Beijing 100084, China
3. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081, China

Received: 2025-10-04 Revised: 2025-12-14 Online: 2026-02-28 Published: 2026-03-19
Corresponding author: *YANG Lu, associate research fellow, E-mail: yanglu@bit.edu.cn.
About the author: WANG Yue, male (Han), Shandong, lecturer. E-mail: wangyue@ustb.edu.cn.
Funding:
National Natural Science Foundation of China (52202497); Open Fund of the State Key Laboratory of Intelligent Green Vehicle and Mobility (KFY2413)

Abstract:

To improve formation stability and operating efficiency in unknown environments during cooperative multi-vehicle transport tasks, a leader-follower formation path planning method, GoT-SAC, is proposed on a Mecanum-wheel intelligent chassis platform by integrating the Goal-oriented Transformer (GoT) with Soft Actor-Critic (SAC). The method was validated both in the Gazebo simulation environment and on a miniature physical platform. The results show that the GoT-SAC model reaches a stable, converged regime after about 95 to 100 training episodes. Compared with a manually remote-controlled open-loop synchronization strategy, the method reduces the average relative pose error of the formation from 18 cm to 6 cm, with a relative difference in path length below 5%. The proposed method therefore achieves stable formation keeping and efficient obstacle avoidance without relying on a prior map.

Key words: multi-vehicle formation control, leader-follower formation, path planning, deep reinforcement learning
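The abstract reports two formation metrics, an average relative pose error (in cm) and a relative difference in path length (in %). The exact definitions are not given on this page, so the following is only a minimal Python sketch under stated assumptions: the pose error is taken as the position deviation of a follower from its desired slot (heading is ignored), the slot is a fixed world-frame offset from the leader, and all function names are hypothetical.

```python
import math


def path_length(points):
    """Total length of a polyline given as a list of (x, y) points."""
    return sum(
        math.hypot(q[0] - p[0], q[1] - p[1])
        for p, q in zip(points, points[1:])
    )


def path_length_relative_difference(path, reference):
    """Return |L_path - L_ref| / L_ref as a fraction (0.05 means 5%)."""
    ref_len = path_length(reference)
    if ref_len == 0.0:
        raise ValueError("reference path has zero length")
    return abs(path_length(path) - ref_len) / ref_len


def mean_relative_position_error(leader, follower, desired_offset):
    """Mean distance between the follower and its desired formation slot.

    leader, follower: lists of (x, y) positions sampled at the same instants.
    desired_offset:   (dx, dy) of the slot relative to the leader.
    Assumption: the offset is expressed in the world frame, heading is ignored.
    """
    dx, dy = desired_offset
    errors = [
        math.hypot(fx - (lx + dx), fy - (ly + dy))
        for (lx, ly), (fx, fy) in zip(leader, follower)
    ]
    return sum(errors) / len(errors)
```

With trajectories logged in metres, multiplying the mean error by 100 gives a value in cm that can be set against figures such as the 18 cm and 6 cm quoted above.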

CLC number: U467.1+4

Cite this article: WANG Yue, DUAN Hongwei, ZHONG Wei, YANG Lu, HE Lei, CHAI Fulai, SHI Xiaoyang. Path planning method for leader-follower multi-vehicle formation with integrating GoT-SAC[J]. Journal of Automotive Safety and Energy, 2026, 17(1): 122-129.



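Input: scene representation obtained by encoding K depth frames and the goal features (relative distance, heading error) with GoT
Output: chassis velocity command
1	Initialize the GoT network with the pretrained parameters φ^*
2	Initialize the critic and actor networks of SAC: θ, φ
3	Initialize the entropy coefficient: α
4	Initialize the batch size N and the replay buffer: D←∅
5	Assign the target parameters: θ_targ←θ
6	for episode = 1 to E do:
7	Initialize the environment state: S_t~Env
8	Initialize the goal state: S_{goal, t}~Env
9	for step = 1 to S do:
10	Map the goal token: G_t = MLP(S_{goal, t})
11	Scene representation: H_t←GoT(S_t, G_t, φ^*)
12	Sample an action: A_t~π_φ(A_t|H_t)
13	Interact with the environment: R_t, S_{t+1}, S_{goal, t+1}~Env
14	Store the transition: D←D∪(S_t, S_{goal, t}, A_t, R_t, S_{t+1}, S_{goal, t+1})
15	Update the critic, actor, and target networks
16	end for (episode)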
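The data flow of the listing above can be sketched in a few lines of Python. This is only a structural skeleton, not the authors' implementation: the one-dimensional `ToyEnv`, the stand-in functions for the goal MLP, the GoT encoder and the policy, and the choice to run one update per step once the buffer holds a full batch are all assumptions made for illustration.

```python
import random
from collections import deque


class ToyEnv:
    """Hypothetical 1-D stand-in for the simulator: move a point to a goal."""

    def reset(self):
        self.pos, self.goal = 0.0, 1.0
        return self.pos, self.goal

    def step(self, action):
        self.pos += action
        reward = -abs(self.goal - self.pos)  # closer to the goal is better
        return reward, self.pos, self.goal


def goal_token(goal_state):
    """Stands in for G_t = MLP(S_goal,t); here it is the identity."""
    return goal_state


def encode_scene(state, token):
    """Stands in for H_t = GoT(S_t, G_t); here the signed distance to goal."""
    return token - state


def sample_action(scene, rng):
    """Stands in for A_t ~ pi(. | H_t): a noisy step toward the goal."""
    return 0.1 * scene + rng.gauss(0.0, 0.01)


def update_networks(batch):
    """Placeholder for the critic, actor, and target-network updates."""
    return len(batch)


def train(episodes, steps, batch_size, seed=0):
    rng = random.Random(seed)
    env = ToyEnv()
    buffer = deque(maxlen=10_000)  # replay buffer D
    updates = 0
    for _ in range(episodes):
        state, goal = env.reset()
        for _ in range(steps):
            scene = encode_scene(state, goal_token(goal))
            action = sample_action(scene, rng)
            reward, next_state, next_goal = env.step(action)
            buffer.append((state, goal, action, reward, next_state, next_goal))
            if len(buffer) >= batch_size:  # assumption: update every step
                update_networks(rng.sample(list(buffer), batch_size))
                updates += 1
            state, goal = next_state, next_goal
    return buffer, updates
```

Each stored tuple has the same six fields as the transition in step 14, so replacing the stand-ins with real networks would leave the loop structure unchanged.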


Hyperparameter	Value
MAX_EPISODES	1500
LR_A	5×10^-4
LR_C	5×10^-4
TAU	0.001
GAMMA	0.999
AUTO_TUNE	True
ALPHA	1.0
ActorType	GaussianTransformer
CriticType	CNN
TransformerBlocks	3
AttentionHeads	2
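How some of these settings typically enter an SAC update can be illustrated with a short Python sketch. The interpretation is an assumption, since the page does not define the parameters: TAU is read as the Polyak coefficient of the target-network soft update, GAMMA as the discount factor, and ALPHA as the initial entropy temperature that is subsequently adjusted when AUTO_TUNE is True. The function names are hypothetical.

```python
# Values copied from the table; the dictionary itself is illustrative.
HPARAMS = {
    "MAX_EPISODES": 1500,
    "LR_A": 5e-4,
    "LR_C": 5e-4,
    "TAU": 0.001,
    "GAMMA": 0.999,
    "AUTO_TUNE": True,
    "ALPHA": 1.0,
}


def soft_update(target, online, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


def effective_horizon(gamma):
    """Rough number of steps over which discounted rewards still matter."""
    return 1.0 / (1.0 - gamma)


def soft_td_target(reward, next_q, next_log_prob, gamma, alpha, done=False):
    """Entropy-regularised target: r + gamma * (Q(s', a') - alpha * log pi)."""
    bootstrap = 0.0 if done else next_q - alpha * next_log_prob
    return reward + gamma * bootstrap
```

Under this reading, a TAU of 0.001 moves the target network by 0.1% of the gap to the online network per update, and a GAMMA of 0.999 corresponds to an effective horizon of roughly 1000 steps.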


