PyTorch implementation of the paper "AI-accelerated Discovery of Altermagnetic Materials".
Required packages:
- pytorch: 2.0.1
- accelerate: 0.20.0
- pymatgen: 2023.5.10
- PyYAML: 6.0
- tqdm: 4.64.0
Other required packages are listed in the requirements.txt file.
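
A quick way to compare an environment against the versions listed above is sketched below. It is not part of this repository. It assumes the packages are installed under the PyPI distribution names `torch`, `accelerate`, `pymatgen`, `PyYAML`, and `tqdm`, and the helper names `installed_version` and `check_versions` are hypothetical.

```python
from importlib import metadata

# Versions listed above; the keys are PyPI distribution names
# (assumption: "pytorch" is installed as the "torch" distribution).
PINNED = {
    "torch": "2.0.1",
    "accelerate": "0.20.0",
    "pymatgen": "2023.5.10",
    "PyYAML": "6.0",
    "tqdm": "4.64.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(pinned, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        # Local build tags such as "2.0.1+cu118" still count as a match.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_versions(PINNED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result means every listed package matches. Other versions may still work; the list above only records the versions that were used.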
- Run the files in the preprocess directory, then move the output files label0.csv and candidate.csv to the root_dir directory.
- Check the required data files in root_dir:
  - atom_init.json: a JSON file that stores the initialization vector for each element.
  - label0.csv: a CSV file that stores the IDs of the non-altermagnetic crystals.
  - label1.csv: a CSV file that stores the IDs of the altermagnetic crystals.
  - candidate.csv: a CSV file that stores the IDs of the crystals in the candidate datasets.
- Run download.py to download the CIF files of all crystals in the three CSV files from Materials Project. Download time depends on your internet speed. Once completed, the structure under root_dir will be:
root_dir
├── atom_init.json
├── label0.csv
├── label1.csv
├── candidate.csv
├── id_prop_0.csv
├── id_prop_1.csv
├── id_prop_-1.csv
├── id0.cif
├── id1.cif
├── ...

- Set the configuration of accelerate in a proper path, for example:
    accelerate config --config_file yamls/accelerate.yaml
- Check and update yamls/pretrain.yaml, then run pretrain.py:
    nohup sh pretrain.sh &
  or
    accelerate launch --config_file yamls/accelerate.yaml pretrain.py --file yamls/pretrain.yaml
- Check and update yamls/train.yaml, then run train.py:
    nohup sh train.sh &
  or
    accelerate launch --config_file yamls/accelerate.yaml train.py --file yamls/train.yaml
- Check and update yamls/predict.yaml, then run predict.py:
    python predict.py --file yamls/predict.yaml

Downloading the CIF files of all crystals using the download.py script takes about 1 hour, depending on your network speed. Pretraining the auto-encoder model for 10 epochs with a batch size of 64 on 2 NVIDIA A100 GPUs takes over 2 days. Predicting the candidate datasets of over 42,000 samples takes about 10 seconds with a batch size of 512.
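
Since pretraining runs for days, it may be worth verifying the root_dir layout before launching anything. The sketch below is not part of this repository. It assumes each CSV stores the crystal ID in its first column without a header row, and that each CIF file is named after its ID (as in id0.cif). The helper names `read_ids` and `check_root_dir` are hypothetical.

```python
import csv
from pathlib import Path

# Data files that should be present in root_dir.
REQUIRED_FILES = ["atom_init.json", "label0.csv", "label1.csv", "candidate.csv"]
# CSV files whose IDs should each have a downloaded CIF file.
ID_FILES = ["label0.csv", "label1.csv", "candidate.csv"]


def read_ids(csv_path):
    """Read IDs from the first column (assumption), skipping blank rows."""
    with open(csv_path, newline="") as handle:
        return [row[0].strip() for row in csv.reader(handle) if row and row[0].strip()]


def check_root_dir(root_dir):
    """Return (missing required files, IDs without a matching CIF file)."""
    root = Path(root_dir)
    missing_files = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing_cifs = []
    for name in ID_FILES:
        path = root / name
        if not path.is_file():
            continue  # already reported in missing_files
        for crystal_id in read_ids(path):
            if not (root / f"{crystal_id}.cif").is_file():
                missing_cifs.append(crystal_id)
    return missing_files, missing_cifs
```

Calling `check_root_dir("root_dir")` returns two lists, both empty when nothing is missing. If the CSV files turn out to have a header row, the header text will show up as a missing ID and `read_ids` would need to skip the first row.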
You can also load our trained model directly, without pretraining and training it yourself. This model has gone through multiple rounds of iterative training. The weights of our classifier model can be downloaded from Google Drive. The corresponding output is in out/output.csv. In addition, the proposed AI search engine predicted over three hundred further candidate materials that have not yet been confirmed by DFT calculations. These candidates are listed in out/Candidate_for_DFT_validate.csv.