ADvisor

The ADvisor tool lets users explore different applicability domain (AD) strategies and identify the most suitable one for each use case. Once the preferred AD methodology has been identified, it can be applied to any query dataset. This repository provides the Python scripts needed for AD search and application, together with a predefined environment and its dependencies. A notebook is also included as an example of how to analyze the results.

Citation

If you use ADvisor, please cite:

Piazza L., Poles C., Bononi G., Granchi C., Di Stefano M., Poli G., Macchia M., Tuccinardi T. ADvisor: an open-source tool for Applicability Domain definition and optimization in molecular predictive modeling. J. Chem. Inf. Model. 2025, ASAP. http://pubs.acs.org/doi/abs/10.1021/acs.jcim.5c01672

Installation

To set up the required environment, use the provided YAML file:

conda env create -f env.yml

Then, activate the environment:

conda activate env

Usage

1) AD search

Run the AD search script with Python, specifying the required inputs:

python Compare_AD_Strategies.py -train Train.csv -test Test.csv -repres RDKit-descriptors -mt regressor -test_tvc True -train_tvc True -test_pvc Pred -nj 4 -out Out1.csv

Note that, within the ADvisor AD strategy, the similarity formula used for regressors and the one used for classifiers are those that performed best on average for each model type (see the paper for further details).
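The AD search can also be launched from Python, for example when running it over several datasets. The following is a minimal sketch, not part of the repository: the helper names `build_search_command` and `run_search` and their keyword arguments are hypothetical, and only the command-line flags shown in the example command are used. It assumes the call is made from the repository root with the ADvisor environment active.

```python
import subprocess
import sys


def build_search_command(train, test, out, repres="RDKit-descriptors",
                         model_type="regressor", true_col="True",
                         pred_col="Pred", n_jobs=4):
    """Return the argument list for Compare_AD_Strategies.py.

    Hypothetical helper: it only mirrors the flags of the example command.
    """
    return [
        sys.executable, "Compare_AD_Strategies.py",
        "-train", train,
        "-test", test,
        "-repres", repres,
        "-mt", model_type,
        "-test_tvc", true_col,
        "-train_tvc", true_col,
        "-test_pvc", pred_col,
        "-nj", str(n_jobs),
        "-out", out,
    ]


def run_search(train, test, out, **options):
    """Run the AD search; raises CalledProcessError if the script fails."""
    subprocess.run(build_search_command(train, test, out, **options), check=True)
```

Calling `run_search("Train.csv", "Test.csv", "Out1.csv")` is then equivalent to the shell command above. Passing the arguments as a list avoids shell quoting issues with file names.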

2) AD application

Run the AD calculation script with Python, specifying the required inputs:

python Calculate_AD.py -train Train.csv -test Test.csv -query Query.csv -repres RDKit-descriptors -ad ComAD_AD_th-0.6_a-1_b-0.5_c-0.5_d-0.5 -mt regressor -test_tvc True -train_tvc True -test_pvc Pred -query_pvc Pred -nj 4 -out Out2.csv

Note that the AD strategy to apply must be written in exactly the format returned by the AD search.

Input

All input files must be in CSV format and contain a column named SMILES, which holds the molecular structures.
The AD search requires the train and test sets used to derive and validate the model. The AD application requires the same train and test sets, together with the query set containing the compounds to label as IN or OUT of the AD according to the selected strategy.
The train set must contain a column storing the experimental value or class of each compound. The test set must contain a column storing the experimental value or class and a column storing the predicted value or class. The query set must contain a column storing the predicted value or class.
Three CSV files (Train.csv, Test.csv, Query.csv) are included in the repository for testing purposes.
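The column requirements above can be checked before launching a run. The sketch below is an illustration, not part of the repository: the function `missing_columns` and the `REQUIRED` mapping are hypothetical, and the column names "True" and "Pred" are assumptions taken from the values passed to the column flags in the example commands. The sample data is a toy stand-in for a real file.

```python
import csv
import io


def missing_columns(csv_file, required):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [name for name in required if name not in header]


# Columns each input needs, following the description above.
# "True" and "Pred" are assumed names; adapt them to your own files.
REQUIRED = {
    "train": ["SMILES", "True"],
    "test": ["SMILES", "True", "Pred"],
    "query": ["SMILES", "Pred"],
}

# Toy in-memory test set that lacks its prediction column.
sample_test = io.StringIO("SMILES,True\nCCO,1.2\nc1ccccc1,0.7\n")
print(missing_columns(sample_test, REQUIRED["test"]))  # ['Pred']
```

For a file on disk, open it with `open("Test.csv", newline="")` and pass the file object in place of the `StringIO` sample.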

Output

The AD search script generates a CSV file listing the evaluated AD methodologies in ranked order, together with their performance on the IN and OUT subsets.
The AD application script generates a CSV file containing the query SMILES, the prediction for each query compound, and a column storing the IN/OUT AD label according to the selected strategy.
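As a quick check on the AD application output, the IN and OUT labels can be tallied. This is a minimal sketch under stated assumptions: the function `count_ad_labels` is hypothetical, and the label column header "AD" is a placeholder, since the actual header written by Calculate_AD.py is not specified here. The sample rows are illustrative only.

```python
import csv
import io
from collections import Counter


def count_ad_labels(csv_file, label_col):
    """Count how many rows carry each AD label (for example IN and OUT)."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_col] for row in reader)


# Toy stand-in for an AD application output file.
# The header "AD" is hypothetical; use the header found in your own output.
sample = io.StringIO("SMILES,Pred,AD\nCCO,1.2,IN\nCCN,0.9,OUT\nCCC,1.1,IN\n")
counts = count_ad_labels(sample, label_col="AD")
print(counts["IN"], counts["OUT"])  # 2 1
```

For a real run, pass a file object opened on the output CSV (for example Out2.csv) and the name of its label column.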

Dependencies

All necessary dependencies are included in env.yml.
