This repository contains code to reproduce all experiments in the paper. Each experiment is self-contained with its own dependencies and executable scripts.
Note on Dependencies: Different experiments require different library versions (e.g., conflicting versions of PyTorch or SciPy). We strongly recommend creating a fresh virtual environment (Conda or venv) for each experiment folder to avoid conflicts.
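For example, a per-experiment environment can be created with Python's built-in venv (the environment name `.venv` here is arbitrary); afterwards, install that folder's requirements as shown below:

```bash
cd distinction                # or any other experiment folder
python -m venv .venv          # fresh environment local to this experiment
source .venv/bin/activate     # on Windows: .venv\Scripts\activate
```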
Each experiment folder contains its own requirements.txt. Install dependencies for the specific experiment you want to run:
```bash
cd <folder_name>
pip install -r requirements.txt
```

For GPU-accelerated experiments (e.g., distinction/), also install:

```bash
pip install -r requirements-gpu.txt
```

Some experiments (specifically in drift/, steering/, and transfer_learning/) rely on gated models hosted on Hugging Face (e.g., Llama-3, Gemma). To run these scripts, you must provide a Hugging Face authentication token with the correct permissions.
Before generating a token, make sure your Hugging Face account has accepted the license terms for the gated models used by these experiments (e.g., Llama-3, Gemma): open each model card on Hugging Face and click "Agree". Then create a token:
- Log in to Hugging Face.
- Go to Settings > Access Tokens.
- Create a new token with READ permissions.
Do not hardcode your token. Instead, export it as an environment variable before running the scripts. The code is configured to automatically detect HF_TOKEN.
```bash
# Linux/macOS
export HF_TOKEN="your_huggingface_token_here"
```

```powershell
# Windows PowerShell
$env:HF_TOKEN = "your_huggingface_token_here"
```
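To confirm the token is visible before launching a long run, you can optionally check it with the Hugging Face CLI (this assumes the huggingface_hub CLI is installed in your environment; recent versions read HF_TOKEN automatically):

```bash
# Optional sanity check: prints your account name if the token is valid
huggingface-cli whoami
```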
transfer_learning/ and vision_architecture/ require the additional file LogME.py for the LogME scoring function, sourced from the official implementation (You et al., ICML 2021; You et al., JMLR 2022):

```bash
wget https://raw.githubusercontent.com/thuml/LogME/main/LogME.py
```
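If convenient, the file can be fetched directly into both folders in one step. This is only a sketch: it assumes the scripts import LogME.py from their own experiment directory, so adjust the destination if your layout differs:

```bash
# Assumption: each script expects LogME.py next to it in the experiment folder
for d in transfer_learning vision_architecture; do
  wget -O "$d/LogME.py" https://raw.githubusercontent.com/thuml/LogME/main/LogME.py
done
```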
| Folder | Description | Paper Section | Notes |
|---|---|---|---|
| metric_validation/ | Shesha metric validation on embeddings | Appendix 6 | must run shesha_validation_embeddings.py before running shesha_validation.py |
| distinction/ | Ground truth validation and metric dissociation | Section 2, Appendix 7 | |
| steering/ | Representation steering (synthetic and real tasks) | Section 3.1, Appendix 8 | |
| vision_architecture/ | Vision model architecture comparisons | Section 3.2, Appendix 9 | requires LogME (see Additional Dependencies) |
| drift/ | Representational drift in language models | Section 3.3, Appendix 10 | requires Hugging Face token (see Hugging Face Configuration) |
| transfer_learning/ | Transfer learning benchmarks | Appendix 11 | requires LogME (see Additional Dependencies) |
| crispr/ | CRISPR perturbation coherence analysis | Section 3.4, Appendix 12 | |
| neuroscience/ | Neural population stability analysis | Section 3.5, Appendix 13 | |
Each folder contains standalone scripts. For example:

```bash
cd distinction
python distinction_ground_truth.py
```
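Where the overview table notes an ordering constraint, run the scripts in that order. For metric_validation/, for instance:

```bash
cd metric_validation
python shesha_validation_embeddings.py   # must run before shesha_validation.py (see table note)
python shesha_validation.py
```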
Looking to use Geometric Stability (Shesha) in your own research or production models? You do not need to clone this repository. We maintain a production-ready, optimized Python library for that:
| Repository | Purpose | Link |
|---|---|---|
| shesha (Recommended) | 📦 The Library. Use this to measure stability in your own models (LLMs, Bio, Vision). | View on GitHub |
| geometric-stability | 📄 The Paper. Use this only to reproduce the specific figures/experiments from our arXiv paper. | You are here |
```bash
pip install shesha-geometry
```

If you use shesha-geometry, please cite:
```bibtex
@software{shesha2026,
  title     = {Shesha: Self-Consistency Metrics for Representational Stability},
  author    = {Raju, Prashant C.},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18227453},
  url       = {https://doi.org/10.5281/zenodo.18227453},
  copyright = {MIT License}
}

@article{raju2026geometric,
  title   = {Geometric Stability: The Missing Axis of Representations},
  author  = {Raju, Prashant C.},
  journal = {arXiv preprint arXiv:2601.09173},
  year    = {2026}
}
```