By Victor Le Pochat, Tom Van Goethem, Samaneh Tajalizadehkhoob, Maciej Korczyński and Wouter Joosen
This repository contains the source code driving the generation of the Tranco ranking provided at https://tranco-list.eu/. This new top websites ranking was proposed in our paper Tranco: A Research-Oriented Top Sites Ranking Hardened Against Manipulation.
- combined_lists.py contains the core code for generating new lists based on a configuration passed to combined_lists.generate_combined_list.
- shared.py and global_config.py contain several configuration variables; shared.DEFAULT_TRANCO_CONFIG gives the configuration of the default (daily updated) Tranco list.
- generate_daily_list.py runs daily to generate the default Tranco list.
- job_handler.py contains code either for submitting jobs to an rq queue for processing or for relaying list generation requests to a remote host.
- job_server.py accepts requests for list generation on a remote host.
- notify_email.py contains code to notify users when their list has been generated.
- generate_domain_parts.py preprocesses rankings to extract the different components of domains.
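
To give a feel for what generating a combined list can involve, here is a minimal sketch that merges several input rankings by summing 1/rank per domain (a Dowdall-style rule). It is an illustration only: the function name `combine_rankings_dowdall`, its parameters, and the sample domains are hypothetical and are not taken from combined_lists.py, whose actual configuration format and interface may differ.

```python
from collections import defaultdict


def combine_rankings_dowdall(rankings, list_size=None):
    """Combine ranked domain lists into one ranking (hypothetical helper).

    rankings: iterable of lists of domains, best-ranked first.
    list_size: optional cap on the length of the combined list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # A domain at position p in a list contributes 1/p to its score.
        for position, domain in enumerate(ranking, start=1):
            scores[domain] += 1.0 / position

    # Highest score first; ties are broken alphabetically so the
    # output is deterministic.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return ordered[:list_size] if list_size else ordered


# Made-up input rankings, purely for illustration.
example = [
    ["a.com", "b.com", "c.com"],
    ["b.com", "a.com", "d.com"],
    ["a.com", "c.com", "b.com"],
]
combined = combine_rankings_dowdall(example)
print(combined)
```

A domain that appears near the top of several input lists accumulates a high score, while a domain that appears in only one list is pushed down, which is the intuition behind aggregating multiple rankings.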