DOCUMENTATION, INFORMATION & KNOWLEDGE ›› 2019, Vol. 0 ›› Issue (3): 101-112. doi: 10.13366/j.dik.2019.03.101


Status and Characteristics of Scientific Data Based on DataCite

  

  • Online: 2019-05-10 Published: 2019-05-10

Abstract:

[Purpose/Significance] This paper analyzes the characteristics of massive scientific data to provide a reference for the effective utilization and efficient management of such data. [Design/Methodology] A total of 14,835,029 metadata records for scientific data were collected from DataCite. Using statistical analysis, social network analysis, and text analysis, the status and characteristics of the collected data were explored along six dimensions: time, space, topic, author, version, and utilization. [Findings/Conclusion] The volume of scientific data grows exponentially. Science and engineering data account for the majority, while humanities and social science data make up a relatively small share. There is severe polarization among scientific data centers. European and American countries hold advantages in the field of open data. The development of data centers in China cannot meet scholars' needs. Author collaboration patterns vary considerably across disciplines. The number of dataset versions follows a power-law distribution. Opening and sharing data can help increase scholars' impact. [Originality/Value] This study examines the characteristics of massive scientific data comprehensively and in depth from several perspectives, summarizes the practical experience of leading scientific data centers, and explores development approaches for scientific data management in China.