This repository contains the code for the paper:
J. Kim, B. Tabibian, A. Oh, B. Schölkopf, M. Gomez-Rodriguez. Leveraging the Crowd to Detect and Reduce the Spread of Fake News and Misinformation. In Proceedings of the 11th ACM International Conference on Web Search and Data Mining (WSDM), 2018.
This code is developed under Python 3 and requires the following packages: numpy, scipy, matplotlib, and seaborn. It also uses pickle, which is part of the Python standard library.
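A quick way to verify that the dependencies are available (each of these packages exposes a standard `__version__` attribute):

```python
# Sanity check for the dependencies; pickle ships with Python itself.
import pickle
import numpy, scipy, matplotlib, seaborn

for pkg in (numpy, scipy, matplotlib, seaborn):
    print(pkg.__name__, pkg.__version__)
```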
The repository contains the code for executing the model (Curb) and several baseline methods, Jupyter notebook files for generating the figures in the paper, and the user exposure data for the Twitter and Weibo datasets used in the paper.
- `code` directory contains the code for executing the model and the baselines.
  - `generate_results.py`: Given the user exposure data in the `twitter` and `weibo` directories, it runs the models (Curb and the baseline methods) and saves the results in `pkl` files (see the sketch after this list).
  - `curb.py`: API for Curb and the Oracle baseline.
  - `flagratio.py`: API for the Flag Ratio baseline.
  - `baseline.py`: API for the Exposure baseline.
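As a minimal sketch of inspecting those results, one could unpickle a saved file. The file name below is hypothetical, and the structure of the unpickled object depends on how `generate_results.py` stores it:

```python
import pickle

# Hypothetical path: generate_results.py saves its output as pkl files,
# but the exact naming scheme is not documented here.
with open("twitter/results/curb.pkl", "rb") as f:
    results = pickle.load(f)

print(type(results))  # inspect the structure of the stored object
```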
- `notebook` contains Jupyter notebook files for generating the figures in the paper. These notebooks use the results generated by the scripts in the `code` directory.
- `twitter` and `weibo`:
  - `reshare_data` contains user reshare logs for each story. In each txt file, each line consists of a user id and the timestamp of the reshare event, separated by a tab (see the parsing sketch after this list).
  - `results` contains pre-computed results for Curb, the Oracle baseline, the Flag Ratio baseline, and the Exposure baseline.
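A minimal sketch of parsing one reshare log, based on the tab-separated (user id, timestamp) format described above. The file name is hypothetical, and treating the timestamp as numeric is an assumption:

```python
def load_reshares(path):
    """Parse one reshare log: one tab-separated (user id, timestamp) pair per line."""
    events = []
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines, if any
            user_id, timestamp = line.rstrip("\n").split("\t")
            events.append((user_id, float(timestamp)))  # numeric timestamp is an assumption
    return events

# Hypothetical file name under twitter/reshare_data/
events = load_reshares("twitter/reshare_data/story_001.txt")
```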
We use data from Twitter and Weibo, which includes users' networks and sharing logs, the stories, and labels indicating whether each story is fake or genuine. The data was released together with the following paper:
S. Kwon, M. Cha, K. Jung. Rumor Detection over Varying Time Windows. PLOS ONE 12(1): e0168344, 2017.
and it can be downloaded from the following link:
For further inquiries, please contact Jooyeon Kim (jooyeon.kim@kaist.ac.kr).