This prototype predicts missing protein-protein interactions (PPIs) by:
- finding k-defective cliques of size >= q in a PPI network, and
- treating the missing edges inside those cliques as "noisy/missing" interactions.
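
To make the idea concrete, here is a minimal sketch (not the WODC enumeration algorithm) that checks whether a given node set is a k-defective clique of size at least q and lists the missing edges that would be reported as predictions. The function names `missing_edges` and `is_k_defective_clique` and the toy graph are illustrative assumptions, not part of this repository.

```python
from itertools import combinations

def missing_edges(nodes, edges):
    """Return the node pairs within `nodes` that have no edge in `edges`."""
    edge_set = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(sorted(nodes), 2)
            if frozenset((u, v)) not in edge_set]

def is_k_defective_clique(nodes, edges, k, q):
    """True if `nodes` has at least q members and is missing at most k edges."""
    return len(set(nodes)) >= q and len(missing_edges(nodes, edges)) <= k

# Toy graph (assumed data): a 4-clique on {0, 1, 2, 3} with edge (2, 3) removed.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3)]

print(missing_edges({0, 1, 2, 3}, toy_edges))                    # [(2, 3)]
print(is_k_defective_clique({0, 1, 2, 3}, toy_edges, k=1, q=4))  # True
```

In this toy case the pair (2, 3) is the candidate missing interaction.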
- Python 3
- `bin/main_edgelist.out` (binary of WODC, used to enumerate k-defective cliques)
- PPI edgelist: space-separated `u v` pairs, zero-based integer node IDs.
- Mapping file: one protein name per line, where line i corresponds to node i.

      YAL001C
      YBR123C
      YDR362C
      ...
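
The two input formats can be read with a few lines of Python. This is a sketch under the assumption that node IDs index directly into the mapping list (first line is node 0). The helper names `parse_edgelist` and `parse_mapping` are hypothetical and are not taken from `predictPPI.py`.

```python
def parse_edgelist(lines):
    """Parse space-separated 'u v' pairs into a list of integer tuples."""
    edges = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        edges.append((int(parts[0]), int(parts[1])))
    return edges

def parse_mapping(lines):
    """Return protein names so that names[i] is the name of node i."""
    # Every line is kept so that list positions stay aligned with node IDs.
    return [line.strip() for line in lines]

# In-memory stand-ins for the two files (assumed example data).
edges = parse_edgelist(["0 1\n", "1 2\n"])
names = parse_mapping(["YAL001C\n", "YBR123C\n", "YDR362C\n"])

print([(names[u], names[v]) for u, v in edges])
# [('YAL001C', 'YBR123C'), ('YBR123C', 'YDR362C')]
```

With real files, pass the open file objects instead of the lists, for example `parse_edgelist(open("PPI/edgelist.txt"))`.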
From the repository root:
python3 predictPPI.py PPI/edgelist.txt PPI/mapping.txt 1 11
- `ppi_edgelist`: PPI network edgelist
- `mapping`: mapping file (line i -> protein name for node i)
- `k`: allowed number of missing edges per clique
- `q`: minimum clique size
- `--out-dir`: output directory (default: `output`)
- `--out-cliques`: output cliques file (default: `output/defective_cliques.txt`)
- `--out-pred`: output predicted edges file (default: `output/predicted_missing_edges.txt`)
- `--bin`: path to `main_edgelist.out` (default: `./bin/main_edgelist.out`)
- `--keep-tmp`: keep temporary files for debugging
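
If the script is driven from another Python program, the command line can be assembled as a list and passed to `subprocess.run`. The sketch below only builds the argument list. The helper `build_command` is a hypothetical name, and the flags it uses are the ones listed above.

```python
import sys

def build_command(edgelist, mapping, k, q, out_dir=None, keep_tmp=False):
    """Assemble the argument list for predictPPI.py (positional args, then options)."""
    cmd = [sys.executable, "predictPPI.py", edgelist, mapping, str(k), str(q)]
    if out_dir is not None:
        cmd += ["--out-dir", out_dir]
    if keep_tmp:
        cmd.append("--keep-tmp")
    return cmd

cmd = build_command("PPI/edgelist.txt", "PPI/mapping.txt", 1, 11,
                    out_dir="results", keep_tmp=True)
print(" ".join(cmd[1:]))
# predictPPI.py PPI/edgelist.txt PPI/mapping.txt 1 11 --out-dir results --keep-tmp

# To actually run it from the repository root:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing a list rather than a shell string avoids quoting problems when paths contain spaces.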
- `output/defective_cliques.txt`
  - Each line lists the protein names in a defective clique, followed by its missing edges.
  - Format: `protein1 protein2 ... proteinN<TAB>missing:p1-p2,p3-p4,...`
- `output/predicted_missing_edges.txt`
  - Each line is a predicted missing edge: `proteinA proteinB`
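
For downstream analysis, a line of the cliques file can be split back into its members and missing edges. This is a rough sketch based only on the format described above. The function name `parse_clique_line` is hypothetical, and the code assumes protein names contain no hyphens, since the hyphen separates the two ends of a missing edge. Names that do contain a hyphen would need a different splitting rule.

```python
def parse_clique_line(line):
    """Split one line into (protein names, list of missing-edge pairs)."""
    proteins_part, _, missing_part = line.rstrip("\n").partition("\t")
    proteins = proteins_part.split()
    missing = []
    prefix = "missing:"
    if missing_part.startswith(prefix):
        for pair in missing_part[len(prefix):].split(","):
            if pair:
                a, _, b = pair.partition("-")  # assumes no hyphen inside names
                missing.append((a, b))
    return proteins, missing

# Placeholder names following the documented format (assumed example line).
proteins, missing = parse_clique_line("p1 p2 p3 p4\tmissing:p1-p2,p3-p4\n")
print(proteins)  # ['p1', 'p2', 'p3', 'p4']
print(missing)   # [('p1', 'p2'), ('p3', 'p4')]
```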
- The script cleans up temporary files by default (`tmp/` and `graph.txt`).
- If `main_edgelist.out` is missing or fails, the script raises an error.
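
The behaviour described in these notes could be implemented along the following lines. This is an illustrative sketch, not the code in `predictPPI.py`. The helper names `require_binary` and `cleanup` and the exception types are assumptions.

```python
import os
import shutil

def require_binary(path):
    """Raise if the enumerator binary is absent or not executable."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"binary not found: {path}")
    if not os.access(path, os.X_OK):
        raise PermissionError(f"binary is not executable: {path}")

def cleanup(tmp_dir, graph_file, keep_tmp=False):
    """Remove temporary artifacts unless the caller asked to keep them."""
    if keep_tmp:
        return
    shutil.rmtree(tmp_dir, ignore_errors=True)
    if os.path.exists(graph_file):
        os.remove(graph_file)

# A deliberately non-existent path, to show the failure mode.
try:
    require_binary("./bin/does_not_exist.out")
except FileNotFoundError as err:
    print(err)  # binary not found: ./bin/does_not_exist.out
```

Calling `cleanup` from a `finally` block would ensure temporary files are removed even when the binary fails partway through.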