The Journal of China Universities of Posts and Telecommunications (English Edition), 2024, Vol. 31, Issue (3): 43-55. doi: 10.19682/j.cnki.1005-8885.2024.1007

• Artificial Intelligence •

Fine-grained emotion prediction for movie and television scene images

Su Zhibin1, Zhou Xuanye1, Liu Bing2, Ren Hui1

  1. State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing 100024, China
  2. Key Laboratory of Acoustic Visual Technology and Intelligent Control System, Communication University of China, Beijing 100024, China
  3. School of Information and Communication Engineering, Communication University of China, Beijing 100024, China
  • Received: 2022-12-27 Revised: 2023-06-29 Online: 2024-06-30 Published: 2024-06-30
  • Contact: Ren Hui E-mail: renhui@cuc.edu.cn
  • Supported by:
    the Open Project of Key Laboratory of Audio and Video Restoration and Evaluation (2021KFKT005).

Abstract: Fine-grained emotion recognition and prediction for film and television scene images is important for
content retrieval, analysis and generation in the field of intelligent editing. In this paper, traditional perceptual
features, art features and multi-channel deep learning features are fused to reflect the emotional expression of an
image at different levels. In addition, based on the factor associations among multi-dimensional emotions, an
ensemble learning model with a stacking architecture that uses linear regression coefficients and sentiment
correlations, called the LS-stacking model, is proposed. The experimental results show that the mixed features
combined with the LS-stacking model predict well on the 16 emotion categories of the self-built image dataset.
This study improves the ability of computers to recognize image emotion at a fine-grained level, which helps to
increase the intelligence and degree of automation of visual retrieval and post-production systems.

Key words: fine-grained emotion prediction, movie and television scene images, stacking model, linear regression
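
The abstract does not spell out how the LS-stacking model is built, so the following is only a generic sketch of two-level stacking with linear regression, not the authors' method. It assumes one base regressor per feature group (for example perceptual, art and deep features) and a linear meta-learner that sees the base predictions for all emotion dimensions at once, which is one plausible way for correlated emotions to inform each other. All function names (`fit_linear`, `fit_stacking`, `predict_stacking`) and parameters are hypothetical.

```python
import numpy as np


def fit_linear(X, Y):
    """Least-squares fit of Y ~ [X, 1] @ W; returns the weight matrix W."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)
    return W


def predict_linear(W, X):
    """Apply a weight matrix produced by fit_linear."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W


def fit_stacking(feature_groups, Y, n_folds=5, seed=0):
    """Generic two-level stacking (an assumption, not the paper's LS-stacking).

    feature_groups: list of (n_samples, d_g) arrays, one per feature type.
    Y: (n_samples, n_emotions) array of emotion scores.
    """
    n = Y.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)

    # Level 0: out-of-fold predictions, so the meta-learner is never trained
    # on predictions for samples the base model has already seen.
    oof = [np.zeros_like(Y, dtype=float) for _ in feature_groups]
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        for g, X in enumerate(feature_groups):
            W = fit_linear(X[train], Y[train])
            oof[g][held_out] = predict_linear(W, X[held_out])

    # Level 1: linear meta-learner over every base prediction for every
    # emotion, so one emotion's output can draw on the others.
    meta_W = fit_linear(np.hstack(oof), Y)

    # Refit the base models on all samples for use at prediction time.
    base_Ws = [fit_linear(X, Y) for X in feature_groups]
    return base_Ws, meta_W


def predict_stacking(model, feature_groups):
    """Run the base models, then the meta-learner, on new feature groups."""
    base_Ws, meta_W = model
    base_preds = [predict_linear(W, X) for W, X in zip(base_Ws, feature_groups)]
    return predict_linear(meta_W, np.hstack(base_preds))
```

Calling `fit_stacking([X_perceptual, X_art, X_deep], Y)` with arrays that share the same number of rows returns a model that `predict_stacking` can apply to new feature groups in the same order. Plain least squares is used here only to keep the sketch short; a regularized regressor would likely be a safer choice when a feature group has more dimensions than there are training images.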