The Journal of China Universities of Posts and Telecommunications (English edition) ›› 2022, Vol. 29 ›› Issue (5): 21-29. doi: 10.19682/j.cnki.1005-8885.2022.0007

Topic: Special Topic on Artificial Intelligence of Things

• Special Topic: Artificial Intelligence of Things •

Saliency guided self-attention network for pedestrian attribute recognition in surveillance scenarios

Li Na, Wu Yangyang1, Liu Ying1, Li Daxiang2, Gao Jiale1

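  1. Xi'an University of Posts and Telecommunications
  2. School of Communication Engineering, Xi'an Institute of Posts and Telecommunications
  • Received: 2021-05-12 Revised: 2021-09-16 Online: 2022-10-31 Published: 2022-10-28
  • Corresponding author: Wu Yangyang E-mail: 2745027040@qq.com
  • Supported by:
    National Natural Science Foundation of China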

Li Na1,2, Wu Yangyang1, Liu Ying1,2, Li Daxiang1,2, Gao Jiale1   

Abstract:

Pedestrian attribute recognition is often treated as a multi-label image classification task. To make full use of attribute-related location information, a saliency guided self-attention network (SGSA-Net) was proposed to localize attributes under weak supervision, without annotations of attribute-related regions. Saliency priors were integrated into the spatial attention module (SAM). Meanwhile, channel-wise attention and spatial attention were introduced into the network. Moreover, a weighted binary cross-entropy loss (WCEL) function was employed to handle the imbalance of training data. Extensive experiments on the richly annotated pedestrian (RAP) and pedestrian attribute (PETA) datasets demonstrated that SGSA-Net outperformed other state-of-the-art methods.

Key words: pedestrian attribute recognition, saliency detection, self-attention mechanism
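
The abstract names a weighted binary cross-entropy loss (WCEL) for handling imbalanced training data but does not give its form. The sketch below illustrates the general idea only, under stated assumptions: the function name `weighted_bce_loss`, the `pos_ratio` argument, and the exponential weighting by each attribute's positive-sample ratio are illustrative choices, not the authors' definition.

```python
import numpy as np

def weighted_bce_loss(probs, labels, pos_ratio, eps=1e-7):
    """Weighted binary cross-entropy: summed over attributes, averaged over samples.

    probs:     (N, L) predicted probabilities in (0, 1)
    labels:    (N, L) ground-truth labels, 0 or 1
    pos_ratio: (L,)   fraction of positive training samples per attribute
    """
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    # Assumed weighting (hypothetical): the rarer class of each attribute
    # receives the larger weight.
    w_pos = np.exp(1.0 - pos_ratio)  # large when positives are rare
    w_neg = np.exp(pos_ratio)        # large when negatives are rare
    per_elem = -(w_pos * labels * np.log(probs)
                 + w_neg * (1.0 - labels) * np.log(1.0 - probs))
    return per_elem.sum(axis=1).mean()

# Example: two samples, two attributes; the second attribute is rarely positive.
probs = np.array([[0.9, 0.2], [0.4, 0.7]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])
loss = weighted_bce_loss(probs, labels, pos_ratio=np.array([0.5, 0.05]))
```

With this weighting, missing a rare positive attribute costs more than missing a common negative one, which is the usual motivation for weighting the loss on imbalanced attribute labels.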
