Documentation, Information & Knowledge (图书情报知识) ›› 2016, Vol. 0 ›› Issue (1): 65-73. doi: 10.13366/j.dik.2016.01.065

• Intelligence, Information and Sharing •

Topic Analysis of International Big Data Research Based on Content Mining

Dong Ke, Tao Yan

Abstract:

Big data is a rapidly developing new field and one of the most widely followed research hotspots. This paper takes the big data research papers indexed in the Web of Science (WOS) database as its object of analysis and mines their content with the Stanford Topic Modeling Toolbox. The results show that current international big data research clusters into 15 topics in three groups: risk control in the big data environment, core technologies for big data, and big data in specific fields and its applications. Core technologies and big data in specific fields and its applications make up the main body of current research. All 15 topics are developing at high speed; research on big data in specific fields and its applications is growing fastest and will become the most closely watched research direction in the future.

Key words: Big data, Content mining, Topic model, Research trend