Documentation, Information & Knowledge (图书情报知识), 2024, Vol. 41, Issue (2): 150-160. doi: 10.13366/j.dik.2024.02.150

• Knowledge, Learning and Management •


A Preliminary Study on the FAIRification Characteristics of China's National Scientific Data Center from the Perspective of Policy Text

YANG Heng1,2, LIU Fenghong1,2,3

  1. National Science Library, Chinese Academy of Sciences, Beijing, 100191;
  2. Department of Information Resources Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing, 100191;
  3. School of Information Management, Nanjing University, Nanjing, 210023
  • Online: 2024-03-10 Published: 2024-05-14
  • Contact: Correspondence should be addressed to LIU Fenghong (ORCID: 0000-0002-3633-1464), PhD, research librarian/senior editor; research interests: data publishing, scientific data management; Email: liufh@mail.las.ac.cn
  • About the author: YANG Heng (ORCID: 0000-0002-8549-3945), PhD candidate; research interests: digital publishing and communication, scientific data management; Email: yangheng@mail.las.ac.cn
  • Supported by:
    This work is an outcome of the project "Effects of Data Papers on Data Sharing and Reuse" (2019M651797), supported by the Post-doctoral Science Foundation of China, and the project "Publishing Model linked Research Paper and Scientific Data and its application in FAIR-compliant Manner" (23BXW097), supported by the National Social Science Foundation of China.

Abstract: [Purpose/Significance] This paper explores the FAIRification characteristics of the data policies of China's National Scientific Data Centers, aiming to provide a preliminary reference for their data management policy formulation and work optimization. [Design/Methodology] Combining web-based investigation and text mining, the study used the content mining software KH Coder to conduct a quantitative text analysis of 79 data policies from 20 data centers. By analyzing how frequently the FAIR principles appear in the policy texts and which words are highly similar to them, it reveals differences in attention to the FAIR principles, and their semantic features, across data centers and across types of policy text. [Findings/Conclusion] The data policies of the data centers already reflect the FAIR principles to some extent, but attention to the individual principles is unbalanced. Different types of data policy focus on different aspects of the FAIR principles; what they share is comparatively strong attention to the Findable and Interoperable principles and a strong emphasis on metadata. [Originality/Value] The paper suggests that, when formulating data policies, data centers should highlight the role of "metadata" in full data lifecycle management, promote the construction of a data policy system driven by "data value-added", and introduce the FAIR principles to an appropriate degree, grounded in the realities of scientific data management in China.

Key words: Scientific data management, FAIR principles, National Scientific Data Center, Text mining