Journal of Automotive Safety and Energy ›› 2024, Vol. 15 ›› Issue (4): 591-601. DOI: 10.3969/j.issn.1674-8484.2024.04.016

• Intelligent Driving and Intelligent Transportation •

Semantic segmentation of real-time LiDAR point clouds based on multi-scale self-attention

ZHANG Chen1, LIU Chang1, ZHAO Jin1, WANG Guangwei1,2,*, XU Qing2

  1. School of Mechanical Engineering, Guizhou University, Guiyang 550025, China
  2. School of Vehicle and Mobility, Tsinghua University, Beijing 100084, China
  • Received: 2023-12-14 Revised: 2024-03-18 Online: 2024-08-31 Published: 2024-09-05
  • Corresponding author: *WANG Guangwei, associate professor. E-mail: gwwang@gzu.edu.cn
  • About the author: ZHANG Chen (born 1998), male, Han ethnicity, from Shandong, master's student. E-mail: gschenzhang21@gzu.edu.cn
  • Funding:
    National Natural Science Foundation of China, Regional Program (52265070); Guizhou Provincial Science and Technology Support Program (Qiankehe Support [2022] General 045); Guizhou Provincial Innovative Talent Team Project (CXTD2022-009)

Abstract:

A real-time on-board point cloud semantic segmentation method for mobile robot platforms is proposed and evaluated in comprehensive experiments, with the aim of improving segmentation accuracy within the limits of in-vehicle computing resources. The method follows the projection-based approach to LiDAR semantic segmentation: the 3-D point cloud is projected onto a spherical image, which is then segmented with 2-D convolutions. To keep the model lightweight, the multi-head self-attention (MHSA) mechanism of the Transformer, a deep learning architecture, is mapped onto convolution operations, forming a multi-scale self-attention (MSSA) mechanism. On the NVIDIA Jetson AGX Xavier computing platform, the proposed method maintains a high segmentation accuracy, with a mean intersection over union (mIoU) of 63.9%, and a high detection speed of 41 frames/s when compared with the current mainstream methods CENet, FIDNet, and PolarNet, which demonstrates its suitability for mobile robot platforms.
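The projection step named in the abstract can be illustrated with the spherical (range-image) projection commonly used by projection-based LiDAR segmentation methods. The following is a minimal sketch, not the authors' implementation: the function name `spherical_projection`, the image size (64 x 2048), and the vertical field of view (+3 to -25 degrees) are assumptions chosen to resemble a typical 64-beam sensor and do not come from the paper.

```python
import numpy as np


def spherical_projection(points, height=64, width=2048,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) array of x, y, z points onto a range image.

    All default parameters are illustrative assumptions, not values
    taken from the paper.
    """
    fov_up = np.radians(fov_up_deg)
    fov_down = np.radians(fov_down_deg)
    fov = fov_up - fov_down  # total vertical field of view

    depth = np.linalg.norm(points, axis=1)
    valid = depth > 0  # drop points at the sensor origin
    points, depth = points[valid], depth[valid]

    yaw = np.arctan2(points[:, 1], points[:, 0])  # horizontal angle
    pitch = np.arcsin(points[:, 2] / depth)       # vertical angle

    # Map the angles to continuous image coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height

    cols = np.clip(np.floor(u), 0, width - 1).astype(np.int32)
    rows = np.clip(np.floor(v), 0, height - 1).astype(np.int32)

    # Pixels that receive no point keep the value -1.
    range_image = np.full((height, width), -1.0, dtype=np.float32)

    # Write far points first so that the nearest point wins when
    # several points fall into the same pixel.
    order = np.argsort(depth)[::-1]
    range_image[rows[order], cols[order]] = depth[order]
    return range_image, rows, cols
```

The returned `rows` and `cols` index the points that survived the zero-depth filter, so per-pixel predictions from a 2-D network can be mapped back to those 3-D points. A real pipeline would usually stack further channels (x, y, z, intensity) alongside the range.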

Key words: mobile robot platforms, light detection and ranging (LiDAR), point cloud, multi-scale self-attention (MSSA), semantic segmentation, Transformer, convolutional neural networks
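The accuracy figure in the abstract is a mean intersection over union (mIoU). As a reminder of how this metric is usually computed, the sketch below averages the per-class IoU, TP / (TP + FP + FN), over the classes that occur in the labels or predictions. The function name `mean_iou` and the `ignore_index` parameter are hypothetical, and the paper's exact evaluation protocol (for example, which classes are ignored) is not stated in this section.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean intersection over union for integer class labels.

    Labels must lie in [0, num_classes). `ignore_index` is an optional,
    hypothetical label value to exclude from the evaluation.
    """
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    if ignore_index is not None:
        keep = target != ignore_index
        pred, target = pred[keep], target[keep]

    # confusion[i, j] counts points of true class i predicted as class j.
    confusion = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    tp = np.diag(confusion)
    # Union = predicted as the class + labelled as the class - overlap.
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - tp

    present = union > 0  # skip classes absent from both pred and target
    iou = tp[present] / union[present]
    return float(iou.mean())
```

For example, `mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)` averages the per-class IoUs 1/2 and 2/3.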

CLC number: