Journal of Automotive Safety and Energy (汽车安全与节能学报)

JASE, 2020, Vol. 11, Issue 1: 94-101. DOI: 10.3969/j.issn.1674-8484.2020.01.010

• Automotive Safety •

Pedestrian detection based on depthwise separable convolution and multi-level feature pyramid network

JIANG Yicheng, LI Fan*

  1. (College of Mechanical and Vehicle Engineering, Hunan University, Changsha 410082, China)
  • Received: 2019-08-05 Online: 2020-03-31 Published: 2020-04-01
  • Corresponding author: LI Fan (b. 1981), male, Han, Hunan, associate professor. E-mail: lifandudu@163.com.
  • About the authors: First author: JIANG Yicheng (b. 1993), male, Han, Anhui, master's student. E-mail: jiangyicheng1993@163.com.
  • Funding:
    National Natural Science Foundation of China (81673996); Hunan Province Strategic Emerging Industries Science and Technology Research and Major Scientific and Technological Achievement Transformation Project (2018GK4004).

Abstract: A pedestrian detection method based on a convolutional neural network was proposed to improve the accuracy of pedestrian detection. The method builds on the YOLOv3-tiny algorithm. In the backbone, the original convolutional structure was replaced with a depthwise separable convolution structure, which deepens the network. In the detection part, an improved multi-level feature pyramid network was proposed. It consists of eight structurally identical feature pyramids, each built from depthwise separable convolutions, connected in series. Features of the same size produced by different pyramids were fused, and the fused feature pyramid was used for detection. The method was tested on the Caltech Pedestrian dataset. The results show that its miss rate is 57.83%, which is 32.53% lower than that of the histogram of oriented gradients (HOG) method and 4.67% and 3.21% lower than those of the deep-learning methods SA Fast-RCNN and MS-CNN, respectively. The running speed is 34 ms/frame, so the method meets the real-time requirement.
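
As a rough check on the figures in the abstract, the reported 34 ms/frame can be converted to a frame rate. The second line is only a sketch under an assumption: it reads the stated miss-rate differences as absolute percentage points, which the abstract does not say explicitly, and derives the baseline values that reading would imply.

```latex
\[
\text{frame rate} = \frac{1000\ \text{ms/s}}{34\ \text{ms/frame}} \approx 29.4\ \text{frames/s}
\]
\[
% Assumption: differences are absolute percentage points, not relative reductions.
57.83 + 32.53 = 90.36\ (\text{HOG}), \quad
57.83 + 4.67 = 62.50\ (\text{SA Fast-RCNN}), \quad
57.83 + 3.21 = 61.04\ (\text{MS-CNN})
\]
```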

Key words: automotive active safety, pedestrian detection, depthwise separable convolution, multi-level feature pyramid network, feature fusion
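
The abstract relies on depthwise separable convolution in both the backbone and the feature pyramids. A minimal sketch of why this is cheaper than a standard convolution is given below; it only counts weights. The function names and the layer shape (128 input channels, 256 output channels, 3 x 3 kernel) are hypothetical and chosen for illustration, not taken from the paper, and bias terms are ignored.

```python
# Sketch: weight counts for a standard convolution versus a depthwise
# separable convolution. Bias terms are ignored (assumption), and the channel
# sizes used in the example are hypothetical, not taken from the paper.

def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in one k x k convolution mapping c_in to c_out channels."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def param_ratio(c_in: int, c_out: int, k: int) -> float:
    """Separable / standard weight ratio; algebraically 1/c_out + 1/k**2."""
    separable = depthwise_separable_params(c_in, c_out, k)
    return separable / standard_conv_params(c_in, c_out, k)


if __name__ == "__main__":
    c_in, c_out, k = 128, 256, 3  # hypothetical layer shape
    print(standard_conv_params(c_in, c_out, k))
    print(depthwise_separable_params(c_in, c_out, k))
    print(round(param_ratio(c_in, c_out, k), 4))
```

For this hypothetical layer the separable form needs roughly 11.5% of the weights of the standard convolution, which suggests how a network could be made deeper without a matching growth in parameters.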