LLM-guided Evolution for MAterials Design
Official implementation of “LLEMA: Evolutionary Search with LLMs for Multi-Objective Materials Discovery”
LLEMA is a unified framework that uses large language models (LLMs) + chemistry-informed evolutionary rules + surrogate predictors to discover novel, stable, synthesizable materials faster. It tackles the challenge of balancing conflicting objectives (e.g., bandgap vs. stability, conductivity vs. transparency) by combining reasoning, evolution, and prediction.
- LLM-driven candidate generation under property constraints
- Evolutionary memory loop with chemistry-informed operators
- Multi-objective optimization using surrogate models
- Benchmark suite of 14 materials discovery tasks across electronics, energy, aerospace, coatings, and optics
- Empirical results: higher hit rates, stronger Pareto fronts, and broader diversity.
Requirements:
- Python 3.11+
Steps:
- Clone this repository

```bash
git clone https://github.com/your-org/LLEMA.git
cd LLEMA
```

- Create and activate an environment

```bash
conda env create -f environment.yml  # creates the env defined in environment.yml
conda activate llema
```

- Install Python dependencies

```bash
pip install --upgrade pip
pip install -r requirements.txt
```

To run surrogate models locally, clone their repos (see below).
You must provide API keys before running the agent:
- OpenAI, for LLM calls: `OPENAI_API_KEY`
- Materials Project, for structure/property queries: `MATERIALS_PROJECT_API_KEY`
Recommended: Copy the example environment file and fill in the values.
```bash
cp env.example .env
# edit .env and set OPENAI_API_KEY and MATERIALS_PROJECT_API_KEY
```

Environment variables read by LLEMA (subset):

- `OPENAI_API_KEY` – used by the agent LLM interface
- `LLM_MODEL` – optional, defaults to `gpt-4o-mini`
- `MATERIALS_PROJECT_API_KEY` – used by property extraction utilities
- `SURROGATE_MODELS_DIR` – optional, defaults to `src/surrogate_models`
If not using a dotenv loader, you can also export them in your shell before running:
```bash
export OPENAI_API_KEY=...
export MATERIALS_PROJECT_API_KEY=...
```

Note: The `src/agent/config.py` file contains run-specific settings such as iteration limits, memory settings, and multi-island configuration parameters.
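Before launching a run, it can help to confirm that the keys are actually visible to Python. A minimal, hypothetical sanity check (this helper is not part of LLEMA):

```python
import os

# The two keys LLEMA requires, per the setup notes above
REQUIRED = ["OPENAI_API_KEY", "MATERIALS_PROJECT_API_KEY"]

def check_env(required=REQUIRED):
    """Return the names of required variables that are missing or empty."""
    return [name for name in required if not os.environ.get(name)]

missing = check_env()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required API keys are set.")
```

Run this from the same shell (or `.env`-loaded process) you will launch the agent from, so it sees the same environment.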
LLEMA integrates fast surrogate models to estimate materials properties during the search loop.
- Pretrained models from JARVIS-DFT are downloaded and stored under `src/surrogate_models/alignn/alignn/` as `.zip` archives.
- For details on which archives are included and local customizations, see `src/surrogate_models/README.md`.
```bash
cd src/surrogate_models
git clone https://github.com/usnistgov/alignn.git
```

- CGCNN can be used as an alternative or complementary surrogate.
- LLEMA includes minor output-format changes for clearer, property-specific CLI output.
```bash
cd src/surrogate_models
git clone https://github.com/txie-93/cgcnn.git
```

See `src/surrogate_models/README.md` for more details on supported properties and output formats.
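A quick way to confirm the surrogate repos are in place is to check for the expected directories. A minimal sketch (this helper and its repo list are assumptions for illustration, not part of LLEMA):

```python
import os
from pathlib import Path

def surrogate_status(base_dir, repos=("alignn", "cgcnn")):
    """Map each expected surrogate repo name to whether its directory exists."""
    return {name: (Path(base_dir) / name).is_dir() for name in repos}

# SURROGATE_MODELS_DIR defaults to src/surrogate_models (see the env vars above)
base = os.environ.get("SURROGATE_MODELS_DIR", "src/surrogate_models")
for name, present in surrogate_status(base).items():
    print(f"{name}: {'found' if present else 'missing'}")
```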
Run the full benchmark suite via a bash script:
```bash
cd src
bash run_all_tasks.sh
```

LLEMA provides tools to evaluate CIF files for validity (property constraints) and stability analysis on the tasks in LLEMABench. This section describes how to use these evaluation scripts.
The calculate_validity.py script evaluates CIF files against task-specific property constraints to determine if they meet the requirements for a given materials discovery task.
Usage:
```bash
cd src
conda activate llema  # ensure the llema environment is activated
python calculate_validity.py --tasks <task_name> [options]
```

Examples:
```bash
# Evaluate CIF files for a specific task
python calculate_validity.py --tasks "Hard, Stiff Ceramics"

# Evaluate for all available tasks
python calculate_validity.py --tasks all
```

Arguments:
- `--tasks`: Task name(s) to evaluate. Use `"all"` to process all tasks, or specify one or more task names.
- `--cif-dir`: Directory containing CIF files to process (default: `example`)
- `--output-dir`: Output directory for results (default: auto-generated with timestamp in `validity_output/`)
Output:
- Results are saved in the `validity_output/property_output_<timestamp>/` directory
- Each task generates a `results_<task_name>.jsonl` file containing:
  - Compound formula
  - Calculated property values (band gap, formation energy, bulk modulus, etc.)
  - Categorical constraint results (earth_abundant, non_toxic, etc.)
  - Successful and failed constraint checks
  - Materials API usage flag
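The per-task JSONL files can be consumed with a few lines of Python. A minimal sketch — the field names `formula` and `failed_constraints` are illustrative assumptions; check the actual files under `validity_output/` for the exact schema:

```python
import json
from pathlib import Path

def summarize_results(path):
    """Count candidates with no failed constraint checks in a results JSONL file."""
    total = passed = 0
    with open(path) as fh:
        for line in fh:
            rec = json.loads(line)
            total += 1
            if not rec.get("failed_constraints"):  # assumed field name
                passed += 1
    return total, passed

# Tiny synthetic example file with the assumed schema
sample = Path("results_demo.jsonl")
sample.write_text(
    '{"formula": "SiC", "failed_constraints": []}\n'
    '{"formula": "PbO", "failed_constraints": ["non_toxic"]}\n'
)
print(summarize_results(sample))  # (2, 1)
```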
The calculate_stability.py script analyzes the thermodynamic stability of candidates from validity analysis results by calculating energy above hull and formation energy.
Usage:
```bash
cd src
conda activate llema  # ensure the llema environment is activated
python calculate_stability.py --task <task_name> [options]
```

Arguments:
- `--task` or `-t`: Specific task name to analyze (required)
- `--max-samples` or `-n`: Maximum number of samples to process per task
- `--quiet` or `-q`: Reduce output verbosity (only show summary statistics)
- `--output-dir`: Specific validity output directory to process (default: latest)
Output:
- Summary statistics are saved in `stability_output/stability_summary_<timestamp>.json`
- The JSON file contains:
  - Overall statistics: total candidates, valid/invalid counts, stability breakdown (stable/marginally stable/unstable/unknown)
  - Task-specific breakdown with detailed statistics
  - Energy above hull calculation success rates
  - Materials API and surrogate model usage statistics
Note: The stability analysis script automatically searches for results in `validity_output/property_output_*` directories and maps CIF files from the `example` directory (or the directory specified during validity analysis).
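The summary JSON can likewise be inspected programmatically. A minimal sketch — the key names (`stability_breakdown` and the category labels) are guesses mirroring the categories listed above, so verify them against an actual `stability_summary_<timestamp>.json`:

```python
import json
from pathlib import Path

def stability_breakdown(summary_path):
    """Return the per-category stability counts from a summary JSON file."""
    data = json.loads(Path(summary_path).read_text())
    return data.get("stability_breakdown", {})  # assumed key name

# Synthetic file mirroring the categories described above
demo = Path("stability_summary_demo.json")
demo.write_text(json.dumps({
    "total_candidates": 10,
    "stability_breakdown": {"stable": 4, "marginally_stable": 2,
                            "unstable": 3, "unknown": 1},
}))
print(stability_breakdown(demo))
```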
Citation:

```bibtex
@inproceedings{abhyankar2026llema,
  title={LLEMA: Evolutionary Search with LLMs for Multi-Objective Materials Discovery},
  author={Abhyankar, Nikhil and Kabra, Sanchit and Desai, Saaketh and Reddy, Chandan K},
  booktitle={The Fourteenth International Conference on Learning Representations (ICLR)},
  year={2026},
  url={https://openreview.net/forum?id=TIqzhBvCNB}
}
```
This repository is licensed under the MIT License.
For any questions or issues, you are welcome to open an issue in this repo or contact us at nikhilsa@vt.edu and sanchit23@vt.edu.