DOCUMENTATION, INFORMATION & KNOWLEDGE ›› 2016, Vol. 0 ›› Issue (3): 80-88. doi: 10.13366/j.dik.2016.03.080
Abstract:
Microblog texts have special characteristics: they are short and dynamic. Detecting topics from them therefore calls for new feature selection methods that are suited to clustering algorithms. To address this problem, this paper uses the idea of co-occurrence to build a dynamic co-word network for microblog texts along the timeline. In the dynamic co-word network, edge weights decay linearly over time. The weights of text features are then calculated from the degree centrality of the network. Experiments are carried out on datasets sampled from Sina Weibo. The results show that the dynamic co-word network feature selection method is better suited to extracting features from microblog texts and achieves better microblog topic detection than the conventional document frequency method.
Key words: Microblog, Topic detection, Dynamic co-word network, Feature selection, Text clustering
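The pipeline summarized in the abstract (co-occurrence edges, linear decay of edge weights over time, degree centrality as the feature weight) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: the decay form max(0, 1 - rate * age), the use of weighted degree as the centrality measure, integer time steps, and all function and parameter names (`linear_decay`, `build_cooccurrence_network`, `weighted_degree`, `select_features`, `rate`, `top_k`). It is not the paper's implementation.

```python
from collections import defaultdict
from itertools import combinations


def linear_decay(age, rate):
    """Assumed decay: a co-occurrence `age` time steps old keeps weight
    1 - rate * age, floored at zero."""
    return max(0.0, 1.0 - rate * age)


def build_cooccurrence_network(posts, now, rate=0.25):
    """Build a co-word network as {(word_a, word_b): weight}.

    `posts` is a list of (time_step, tokens) pairs. Each pair of distinct
    words in the same post adds one decayed unit of weight to their edge.
    """
    edges = defaultdict(float)
    for time_step, tokens in posts:
        weight = linear_decay(now - time_step, rate)
        if weight == 0.0:
            continue  # fully decayed posts no longer contribute
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += weight
    return dict(edges)


def weighted_degree(edges):
    """Weighted degree of each word: the sum of its incident edge weights."""
    scores = defaultdict(float)
    for (a, b), weight in edges.items():
        scores[a] += weight
        scores[b] += weight
    return dict(scores)


def select_features(posts, now, top_k, rate=0.25):
    """Return the `top_k` words ranked by weighted degree (ties broken
    alphabetically)."""
    scores = weighted_degree(build_cooccurrence_network(posts, now, rate))
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_k]]


# Toy, hand-made posts (not data from the paper).
posts = [
    (0, ["rain", "flood", "city"]),
    (3, ["match", "goal", "team"]),
    (4, ["match", "team", "coach"]),
]
edges = build_cooccurrence_network(posts, now=4)
features = select_features(posts, now=4, top_k=2)
print(features)
```

In this toy run the oldest post has fully decayed by time step 4, so its words drop out of the network, while words that co-occur repeatedly in recent posts rank highest. The selected words could then serve as the feature vocabulary for a text clustering step.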
URL: http://dik.whu.edu.cn/jwk3/tsqbzs/EN/10.13366/j.dik.2016.03.080
http://dik.whu.edu.cn/jwk3/tsqbzs/EN/Y2016/V0/I3/80