Documentation, Information & Knowledge ›› 2024, Vol. 41 ›› Issue (6): 141-154,165. doi: 10.13366/j.dik.2024.06.141

• Intelligence, Information & Sharing •

Construction of Health Information Portrait and the Identification of False Health Information by Integrating Social Sensing Data with Publisher's Prior Knowledge

ZHAO Youlin1,2, PANG Hangyuan1, SHI Yanqing3

  1. Business School, Hohai University, Nanjing, 211100;
  2. School of Information Management, Nanjing University, Nanjing, 210023;
  3. School of Information Management, Nanjing Agricultural University, Nanjing, 210095
  • Online: 2024-11-10 Published: 2025-01-04
  • Supported by:
    This is an outcome of the Special Funded Project "Research on the Construction of a Spatiotemporal Data Semantic Model for Emergency Management and the Mechanisms of Innovative Applications" (2021T140311) and the project "Research on Spatiotemporal Data Mining and Collaborative Governance Mechanisms for Environmental Pollution Incidents" (2019M650108), both supported by the China Postdoctoral Science Foundation.

Abstract: [Purpose/Significance] This study explores how to integrate social sensing data, which contain rich information on individual emotions, behaviors, and interactions, with publisher's prior knowledge to improve the accuracy of false health information identification. [Design/Methodology] Based on social sensing data and the text of historically published information, the paper characterizes the publisher's prior knowledge about the information to be detected. Incorporating this prior knowledge, the study extracts health information features along three dimensions: publisher features, content features, and receiver behavior features. Health information portraits are then constructed, and a Stacking ensemble learning model is used to build FHIR_SSD&PPK, a False Health Information Recognition model that integrates social sensing data and publisher's prior knowledge. [Findings/Conclusion] The FHIR_SSD&PPK model performs best in identifying false health information, with an accuracy of 92.35%. Publisher features account for the largest share of total feature importance weight, at 51.59%, among which the publisher's prior knowledge features have a feature importance weight of 44.01%. The F-Measure is 2.26% higher than that of the model that does not consider the publisher's prior knowledge, indicating that the publisher's prior knowledge proposed in this article is a key feature for building an identification model. [Originality/Value] The FHIR_SSD&PPK model integrates social sensing data with publisher's prior knowledge and identifies false health information using Stacking ensemble learning, refining research on this problem in both granularity and depth.

Keywords: Health information portrait, False health information, Social sensing data, Prior knowledge, Stacking ensemble learning
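
The Stacking ensemble learning step named in the abstract can be illustrated with a minimal sketch. This is not the authors' FHIR_SSD&PPK implementation: the abstract does not specify the base learners, the meta-learner, or the feature encoding, so the choices below (random forest and k-nearest neighbors as base learners, logistic regression as meta-learner, 12 synthetic numeric features standing in for the publisher, content, and receiver behavior features) are assumptions made only for illustration, and the printed scores say nothing about the paper's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for health information portraits (assumption):
# 12 numeric columns collectively represent publisher, content, and
# receiver behavior features; label 1 = false health information.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners (hypothetical choices, not taken from the paper).
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# Stacking: out-of-fold predictions of the base learners (5-fold CV)
# become the input features of the logistic regression meta-learner.
model = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
f_measure = f1_score(y_test, y_pred)
print(f"accuracy={accuracy:.4f}  F-measure={f_measure:.4f}")
```

Under the same assumptions, the contribution of a feature group such as the publisher's prior knowledge could be estimated by refitting the model with those columns removed and comparing the F-Measure of the two runs.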