The Journal of China Universities of Posts and Telecommunications ›› 2019, Vol. 26 ›› Issue (1): 95-104. doi: 10.19682/j.cnki.1005-8885.2019.0001

• Others

Multi-label text classification model based on semantic embedding

  

  • Received: 2018-02-05 Revised: 2018-12-25 Online: 2019-02-26 Published: 2019-02-27
  • Contact: YAN Dan-Feng E-mail: yandf@bupt.edu.cn
  • Supported by:
    National 863 project; State Grid science and technology project

Abstract: Text classification assigns a document to one or more classes or categories according to its content, which makes it easier for users to find the data they need. Because text data is often polysemous, multi-label classification can describe it more comprehensively, and multi-label text classification has become a key problem in data mining. To improve the performance of multi-label text classification, semantic analysis is embedded into the classification model to carry out label correlation analysis, and the structure, objective function, and optimization strategy of this model are designed. A convolutional neural network (CNN) model based on semantic embedding is then introduced. Finally, the Zhihu dataset is used for evaluation. The results show that this model outperforms related work in terms of recall and area under the curve (AUC).

Key words: multi-label, text classification, convolution neural network, semantic analysis
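
As a reading aid for the recall metric named in the abstract, the following minimal Python sketch shows how recall can be computed when each document may carry several labels. It is not the paper's model or evaluation code. The function names `predict_labels` and `micro_recall`, the 0.5 decision threshold, the toy scores, and the choice of micro-averaging are all assumptions made for illustration; the abstract does not state how recall was averaged.

```python
from typing import List


def predict_labels(scores: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Turn per-label scores into independent 0/1 decisions.

    Each row is one document and each column is one label. The 0.5
    threshold is an assumption for illustration only.
    """
    return [[1 if s >= threshold else 0 for s in row] for row in scores]


def micro_recall(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Micro-averaged recall: TP / (TP + FN), pooled over all labels."""
    tp = 0  # label is relevant and was predicted
    fn = 0  # label is relevant but was missed
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 1 and p == 0:
                fn += 1
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0


# Toy data (hypothetical): two documents, three candidate labels.
y_true = [[1, 0, 1],
          [0, 1, 1]]
scores = [[0.9, 0.2, 0.4],
          [0.1, 0.8, 0.7]]

y_pred = predict_labels(scores)
recall = micro_recall(y_true, y_pred)
print(y_pred, recall)
```

In this toy case three of the four relevant labels are recovered, so the micro-averaged recall is 0.75. A macro-averaged variant would instead compute recall per label and average the results, which weights rare labels more heavily.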