Documentation, Information & Knowledge ›› 2024, Vol. 41 ›› Issue (2): 110-120. doi: 10.13366/j.dik.2024.02.110

• Intelligence, Information and Sharing •

Sensitivity Identification and Privacy Measurement of Personal Communication Data

ZANG Guoquan1,2, ZHANG Panpan1, CHAI Wenke1, LIANG Yaodi1

  1. School of Information Management, Zhengzhou University, Zhengzhou, 450001;
  2. Zhengzhou Data Science Research Center, Zhengzhou, 450001
  • Online: 2024-03-10 Published: 2024-05-14
  • Corresponding author: CHAI Wenke (ORCID: 0009-0000-0883-0277), master's student, research interest: data privacy, Email: chaiwenke2022@163.com.
  • About the authors: ZANG Guoquan (ORCID: 0000-0002-9606-6455), PhD, professor, research interest: data privacy, Email: zangguoquan@zzu.edu.cn; ZHANG Panpan (ORCID: 0009-0008-9164-5440), master's student, research interest: data privacy, Email: zhangpanpan@gs.zzu.edu.cn; LIANG Yaodi (ORCID: 0009-0009-8758-3176), master's student, research interest: data privacy, Email: 2810581264@qq.com.
  • Supported by:
    This paper is an outcome of the Major Project "Research on the Innovation of Risk Measurement and Protection Mechanism of Government Data Privacy" (21&ZD338), supported by the National Social Science Foundation of China.

Abstract: [Purpose/Significance] Relevant laws, regulations, and communication-industry data standards divide personal communication data into four levels, but this classification lacks quantitative support. This paper addresses that gap by quantitatively measuring communication privacy values. [Design/Methodology] First, we summarize the types of communication privacy texts and build a communication privacy text database. Second, we construct a communication sensitive vocabulary and use it to identify the sensitivity of communication data. Finally, we design a privacy measurement model to measure communication privacy. [Findings/Conclusion] Ranked from highest to lowest privacy value, the categories of personal communication data are: communication content data, statistical analysis data, personal related data, communication derived data, and communication address data. [Originality/Value] Based on communication privacy texts, this study identifies communication sensitive data and measures communication privacy values. It supplements measurement methods that rely on privacy subjects' subjective perception of sensitivity, and provides a quantitative basis for the graded protection of personal communication data.

Key words: Communication sensitive data, Communication sensitive vocabulary, Communication privacy measurement, Communication privacy text, Communication sensitive data unit