The Journal of China Universities of Posts and Telecommunications (English Edition) ›› 2024, Vol. 31 ›› Issue (4): 43-53. doi: 10.19682/j.cnki.1005-8885.2024.1015

• Artificial Intelligence •

Improving Link Prediction Models through a Performance Enhancement Scheme: A Study on Semi-Supervised Learning and Model Soup

亓东林1,陈曙东2,杜蓉2,佟达1,余泳1   

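  1. Institute of Microelectronics of the Chinese Academy of Sciences; University of Chinese Academy of Sciences
  2. Institute of Microelectronics of the Chinese Academy of Sciences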

  • Received: 2023-06-26 Revised: 2024-03-27 Online: 2024-08-31 Published: 2024-08-31
  • Contact: Shudong Chen E-mail: chenshudong@ime.ac.cn

Abstract: Most constructed knowledge graphs are far from complete, regardless of their size. This incompleteness negatively affects applications built on knowledge graphs. As an important method for knowledge graph completion, link prediction has become a hot research topic in recent years. This paper proposes a performance enhancement scheme for link prediction models based on semi-supervised learning and the model soup idea; it effectively improves the performance of several mainstream link prediction models with only small changes to their architecture. The scheme consists of two main parts: (1) predicting potential fact triples in the graph with a semi-supervised learning strategy, and (2) creatively combining semi-supervised learning with model soup to further improve the final model performance without adding significant computational overhead. Experiments validate the effectiveness of the scheme on a variety of link prediction models, especially on datasets with dense relations. CompGCN, the model with the best overall performance among those tested, improves its Hits@1 by 14.7% on the FB15K-237 dataset and by 7.8% on the WN18RR dataset after applying the enhancement scheme. We also observe that the semi-supervised learning strategy in the enhancement scheme significantly improves several types of link prediction models, whereas the gain from model soup depends on the specific model: the performance of some models improves, while that of others remains largely unchanged.

Key words: natural language processing, knowledge graph, link prediction, model soup, semi-supervised learning
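
The abstract names three ingredients without giving their details: a semi-supervised step that predicts potential fact triples, a model soup, and the Hits@1 metric. The sketch below shows one common reading of each, and it should not be taken as the authors' implementation. It assumes that the semi-supervised step keeps predicted triples whose score passes a threshold (pseudo-labeling), that the soup is a uniform element-wise average of parameters from checkpoints with the same architecture, and that ranks are 1-based. All function names, the threshold, and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def select_pseudo_triples(
    scored: Sequence[Tuple[Triple, float]], threshold: float
) -> List[Triple]:
    """Keep predicted triples whose confidence score reaches the threshold.

    Assumption: confidence thresholding stands in for the paper's
    semi-supervised strategy, which the abstract does not specify.
    """
    return [triple for triple, score in scored if score >= threshold]


def uniform_soup(
    checkpoints: Sequence[Dict[str, List[float]]]
) -> Dict[str, List[float]]:
    """Average parameters element-wise across checkpoints of one architecture.

    Assumption: a uniform soup; parameters are flat lists of floats here
    to keep the example dependency-free.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    soup: Dict[str, List[float]] = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter from every checkpoint.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        soup[name] = [sum(col) / len(checkpoints) for col in columns]
    return soup


def hits_at_k(ranks: Sequence[int], k: int = 1) -> float:
    """Fraction of test triples whose true entity is ranked in the top k."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


if __name__ == "__main__":
    # Toy data, purely illustrative.
    candidates = [(("a", "r", "b"), 0.9), (("a", "r", "c"), 0.2)]
    print(select_pseudo_triples(candidates, threshold=0.5))
    print(uniform_soup([{"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}]))
    print(hits_at_k([1, 2, 1, 5], k=1))
```

Averaging weights produces a single model, so inference cost stays that of one model; this is consistent with the abstract's claim of no significant added computational overhead, although the exact soup recipe used in the paper may differ.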