The Journal of China Universities of Posts and Telecommunications (English Edition), 2020, Vol. 27, Issue (1): 92-99. doi: 10.19682/j.cnki.1005-8885.2020.0002

• Artificial Intelligence •

Image label transfer: Short video labelling by using frame auto-encoder

吕朝辉,黄诣洋   

  1. Communication University of China
  • Received: 2019-01-31 Revised: 2019-09-19 Online: 2020-02-28 Published: 2020-02-28
  • Corresponding author: Chao-Hui LV (吕朝辉) E-mail: llvch@hotmail.com

Abstract: The number of short videos on the Internet is huge, but most of them are unlabeled. This paper proposes a rough labelling method for short videos based on an image classification neural network. A convolutional auto-encoder is trained on unlabeled video frames in order to obtain features from a certain layer of the network. Using these features, we extract the key-frames of each video with our feature clustering method. These key-frames, which represent the video content, are fed into the image classification network to obtain labels for every video clip. We also compare different convolutional auto-encoder architectures, and optimize and select the better-performing architecture based on our experimental results. In addition, the video frame features from the convolutional auto-encoder are compared with features from other extraction methods. On the whole, this paper proposes an image label transfer method for the rough labelling of short videos, which can be applied to video classes with few labeled samples.

Key words: key-frame
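
The key-frame step described in the abstract (cluster per-frame features, keep representative frames, then label the clip from those frames) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed here; the feature vectors stand in for the convolutional auto-encoder features; and the function names `select_key_frames` and `label_video`, as well as the majority vote used to combine key-frame labels, are hypothetical.

```python
import numpy as np
from collections import Counter


def select_key_frames(features, k, n_iter=20, seed=0):
    """Cluster per-frame feature vectors with plain k-means (an assumption)
    and return, for each non-empty cluster, the index of the frame closest
    to the cluster centroid."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct frames.
    centroids = features[rng.choice(len(features), size=k, replace=False)]

    def distances(cents):
        # Euclidean distance of every frame to every centroid: (n_frames, k).
        return np.linalg.norm(features[:, None, :] - cents[None, :, :], axis=2)

    for _ in range(n_iter):
        assign = distances(centroids).argmin(axis=1)
        for j in range(k):
            members = features[assign == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)

    # Final assignment against the converged centroids.
    dists = distances(centroids)
    assign = dists.argmin(axis=1)
    key_frames = []
    for j in range(k):
        idx = np.flatnonzero(assign == j)
        if idx.size:
            key_frames.append(int(idx[dists[idx, j].argmin()]))
    return sorted(key_frames)


def label_video(key_frame_labels):
    """Combine per-key-frame labels by majority vote (hypothetical rule)."""
    return Counter(key_frame_labels).most_common(1)[0][0]
```

In use, the labels passed to `label_video` would come from running an image classification network on the selected key-frames; that network is outside the scope of this sketch.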