Documentation, Information & Knowledge (图书情报知识), 2021, Vol. 38, Issue (2): 25-34. doi: 10.13366/j.dik.2021.02.025

• Academic Focus •

Scientific Topic Prediction | A Scientific Topic Popularity Prediction Model Based on LSTM Neural Networks

Huo Chaoguang (霍朝光), Huo Fanfan (霍帆帆), Dong Ke (董克)

  • Publication date: 2021-03-10  Release date: 2021-05-07

The Popularity Prediction of Scientific Topics Based on LSTM

  • Online:2021-03-10 Published:2021-05-07

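Abstract: [Purpose/Significance] As an important part of prediction in the science of science, scientific topic popularity prediction aims to reveal academic frontiers and development trends, help scholars identify frontier research topics, and support research management institutions in making sound funding decisions. [Research Design/Methodology] This paper proposes a scientific topic popularity indicator based on journal impact factors (TP-JIF) and constructs a scientific topic popularity prediction model based on an LSTM neural network (TPP-LSTM). Taking data from the LIS field as an example, it extracts and computes the popularity series of scientific topics by time slicing, and examines the model's errors for time series of different lengths. [Conclusions/Findings] Compared with models such as RBF-SVM, Linear-SVM, KNN, and Naive Bayesian, the TPP-LSTM prediction model effectively represents the characteristics of scientific topic popularity time series, and prediction performance is relatively good when the time series length is 4 years. [Originality/Value] The proposed journal-impact-factor-based popularity indicator effectively captures the differences in influence that different academic journals have on a discipline and avoids the drawbacks of computing popularity from frequency alone; the constructed prediction model effectively represents how scientific topics change over time, reduces the prediction errors, and predicts well.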

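Keywords: scientific topic prediction, popularity prediction, journal impact factor, long short-term memory neural network, library and information science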

Abstract: [Purpose/Significance] As an important part of prediction in the science of science, scientific topic popularity prediction helps reveal hot topics and development trends. It helps scholars find cutting-edge topics and assists research management institutions in funding projects reasonably. [Design/Methodology] This paper proposes a topic popularity computing model based on the journal impact factor (TP-JIF) and constructs a scientific topic popularity prediction model based on LSTM. Taking LIS as an example, the study extracts topics via LDA and author keywords, computes their popularity time series, and designs experiments to test the model on time series of different lengths. [Findings/Conclusion] Compared with RBF-SVM, Linear-SVM, KNN, and Naive Bayesian models, the LSTM prediction model represents the characteristics of scientific topic popularity time series well, and the prediction results are best when the time series length is four years.
[Originality/Value] A novel computing model of scientific topic popularity based on journal impact factors is proposed, which captures the differences in journal impact in academic fields and avoids the drawbacks of treating frequency as the only influential factor. The proposed popularity prediction model represents the time-varying features of scientific topics well, reduces the prediction errors, and provides good predictions.

Keywords: Scientific topic prediction, Popularity prediction, Journal impact factor, Long short-term memory (LSTM), Library and information science (LIS)
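The abstract does not give the TP-JIF formula or the TPP-LSTM architecture, so the following is only a minimal sketch of the general idea, under loudly stated assumptions: a topic's yearly popularity is taken to be the sum of the impact factors of the journals that published papers on that topic in that year, and a yearly series is cut into fixed-length windows (length 4, the setting the abstract reports as working well) that a sequence model such as an LSTM could be trained on. The function names, the record layout, and the toy impact factors are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def topic_popularity(papers, jif):
    """Assumed weighting: add the journal's impact factor once per paper and topic."""
    series = defaultdict(lambda: defaultdict(float))
    for paper in papers:
        weight = jif.get(paper["journal"], 0.0)  # unknown journals contribute 0
        for topic in set(paper["topics"]):       # count each topic once per paper
            series[topic][paper["year"]] += weight
    return series


def yearly_values(per_year, first_year, last_year):
    """Turn a {year: popularity} mapping into a dense list, filling gaps with 0."""
    return [per_year.get(year, 0.0) for year in range(first_year, last_year + 1)]


def sliding_windows(values, length=4):
    """Split a series into (input window, next value) training pairs."""
    return [(values[i:i + length], values[i + length])
            for i in range(len(values) - length)]


# Toy records with made-up impact factors, for illustration only.
toy_jif = {"Journal A": 2.0, "Journal B": 0.5}
toy_papers = [
    {"year": 2015, "journal": "Journal A", "topics": ["topic x"]},
    {"year": 2015, "journal": "Journal B", "topics": ["topic x"]},
    {"year": 2016, "journal": "Journal B", "topics": ["topic x"]},
    {"year": 2018, "journal": "Journal A", "topics": ["topic x", "topic y"]},
    {"year": 2019, "journal": "Journal A", "topics": ["topic x"]},
    {"year": 2020, "journal": "Journal B", "topics": ["topic x"]},
]

popularity = topic_popularity(toy_papers, toy_jif)
values = yearly_values(popularity["topic x"], 2015, 2020)
samples = sliding_windows(values, length=4)
```

Under this assumed weighting, a paper in a higher-impact journal raises a topic's popularity more than a paper in a lower-impact one, which is the contrast with pure frequency counting that the abstract emphasizes. Each `(window, target)` pair in `samples` could then serve as one training example for a sequence model; the model itself is left out here because the abstract does not describe its configuration.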