Concepts from Neurons: Building Interpretable Medical Image Diagnostic Models by Dissecting Opaque Neural Networks
Implementation for the IPMI 2025 paper Concepts from Neurons: Building Interpretable Medical Image Diagnostic Models by Dissecting Opaque Neural Networks by Shizhan Gong, Huayu Wang, Xiaofan Zhang, and Qi Dou.
Figure: Overview of the proposed framework.
Figure: Examples of explanations provided by our method. For each input image, we show the top-4 concepts and their contributions to the logits of the correct labels. We also present the corresponding reports from Harvard-FairVLMed and MIMIC-CXR for reference. Some descriptions of normal findings are omitted.
We recommend installing the dependencies via pip:
pip install -r requirements.txt
We use three datasets to evaluate our method:
- HAM10000: The dataset can be accessed via this link.
- Harvard-FairVLMed: The dataset can be accessed via this link.
- MIMIC-CXR: The dataset can be accessed via this link. The original dataset is of extremely large size. Therefore, we utilized a cleaned version provided in this link.
The train/val/test splits can be found in the folder /split. To use a split file, put it under the corresponding data directory.
To pre-train the opaque models on a dataset, use the following command:
python train_opaque.py --data ham --epoch 200 --backbone densenet --trial 1 --root address/to/data/folder --ckpt_dir address/to/ckpt/folder
--data specifies the dataset used for training; it can be one of ham, mimic, and fundus. --backbone specifies
the backbone of the network; it can be one of densenet, resnet, convnext, and vit. --epoch denotes the number
of training epochs. --trial is the index of the experimental trial. --root specifies the data directory.
--ckpt_dir specifies the directory in which to save the trained checkpoints.
First, run the following command to obtain the visual representations of the images in the training data:
python encode_embedding.py --data ham --backbone densenet --trial 1 --root path/to/data --result_dir /path/to/save/visual_representation
--data specifies the dataset used for training; it can be one of ham, mimic, and fundus. --backbone specifies
the backbone of the network; it can be one of densenet, resnet, convnext, and vit. --trial is the index of the experimental trial. --root specifies the data directory.
--result_dir specifies the directory in which to save the pre-calculated visual representations.
Then, train the SAE on the pre-calculated visual representations:
python SAE/train_sae.py --data ham --backbone densenet --trial 1 --save_dir path/to/visual_representation
--save_dir specifies the directory to save the trained checkpoints.
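To make the SAE step concrete, below is a minimal, self-contained sketch (in numpy) of the kind of sparse autoencoder train_sae.py fits: a ReLU encoder producing an overcomplete, nonnegative code, a linear decoder, and a reconstruction loss with an L1 sparsity penalty. The dimensions, weight initialization, and `l1_coeff` value here are illustrative assumptions; the repo's actual architecture and hyperparameters may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: d_model is the size of a visual embedding,
# d_dict is the (overcomplete) number of dictionary features / concepts.
d_model, d_dict = 64, 256

W_enc = rng.normal(0, 0.1, (d_model, d_dict))
W_dec = rng.normal(0, 0.1, (d_dict, d_model))
b_enc = np.zeros(d_dict)
b_dec = np.zeros(d_model)

def sae_forward(x, l1_coeff=1e-3):
    """One SAE forward pass: sparse codes, reconstruction, training loss."""
    h = np.maximum(x @ W_enc + b_enc, 0.0)   # ReLU -> nonnegative sparse codes
    x_hat = h @ W_dec + b_dec                # linear reconstruction
    recon = np.mean((x - x_hat) ** 2)        # reconstruction error
    sparsity = np.mean(np.abs(h))            # L1 penalty encourages sparsity
    return h, x_hat, recon + l1_coeff * sparsity

x = rng.normal(0, 1, (8, d_model))           # a batch of visual embeddings
h, x_hat, loss = sae_forward(x)
print(h.shape, x_hat.shape)                  # (8, 256) (8, 64)
```

Each dictionary feature (column of W_enc / row of W_dec) is a candidate concept; training minimizes the returned loss over the pre-calculated embeddings.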
Please refer to the Jupyter notebook name_concept.ipynb for the prompts used to name the concepts and the code to
obtain the CAV for each concept.
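For intuition, a concept activation vector (CAV) is a direction in embedding space associated with one concept; the sketch below estimates it as the normalized difference of mean activations between concept-positive and concept-negative examples. This difference-of-means estimator is an illustrative simplification (the notebook may instead use a learned linear probe or SAE decoder directions), and all data here is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

def compute_cav(pos_acts, neg_acts):
    """Simple CAV estimate: normalized difference of mean activations
    between concept-positive and concept-negative examples."""
    direction = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    return direction / np.linalg.norm(direction)

# Synthetic activations: positives are shifted along a known latent direction.
d = 64
true_dir = np.zeros(d)
true_dir[0] = 1.0
neg = rng.normal(0, 1, (100, d))
pos = rng.normal(0, 1, (100, d)) + 3.0 * true_dir

cav = compute_cav(pos, neg)
# The concept score of a new embedding is its projection onto the CAV.
score = rng.normal(0, 1, d) @ cav
print(cav.shape)  # (64,)
```

The recovered unit vector points predominantly along the planted direction, which is what lets projections onto it act as per-image concept scores downstream.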
Run the following command:
python submodule_optimization.py --data ham --backbone densenet --trial 1 --num_concept 800 --result_dir /path/to/save
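As a rough illustration of what submodular concept selection does, the sketch below greedily maximizes a facility-location objective, F(S) = Σ_i max_{j∈S} sim[i, j], to pick k concepts that jointly cover the training embeddings. Facility location is monotone submodular, so greedy selection is within (1 − 1/e) of optimal; the actual objective and similarity measure in submodule_optimization.py may differ, and the similarity matrix here is random.

```python
import numpy as np

rng = np.random.default_rng(0)

def greedy_select(sim, k):
    """Greedy maximization of the facility-location objective
    F(S) = sum_i max_{j in S} sim[i, j] over candidate columns j."""
    n_items, n_candidates = sim.shape
    selected = []
    best = np.zeros(n_items)  # per-item best coverage under current S
    for _ in range(k):
        # Marginal gain of adding each candidate to the current selection.
        gains = np.maximum(sim, best[:, None]).sum(axis=0) - best.sum()
        gains[selected] = -np.inf  # never re-pick a selected candidate
        j = int(np.argmax(gains))
        selected.append(j)
        best = np.maximum(best, sim[:, j])
    return selected

# sim[i, j]: similarity between image embedding i and candidate concept j
sim = rng.random((200, 50))
chosen = greedy_select(sim, k=5)
print(chosen)
```

In the repo, the candidate pool is the set of named SAE concepts and k corresponds to --num_concept.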
Please run the following command:
python train_CBM.py --data ham --backbone densenet --trial 1 --num_concept 800 --result_dir /path/to/save/visual_representation
--num_concept specifies the number of concepts used to train the CBM.
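The sketch below shows the structure of a concept bottleneck model of the kind train_CBM.py fits, and why it yields the per-concept logit contributions shown in the figure above: embeddings are projected onto frozen CAVs to get concept scores, a linear layer maps scores to logits, and each logit decomposes additively over concepts. All weights and data here are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n_concepts, n_classes, n = 64, 8, 3, 32

# Hypothetical frozen pieces: visual embeddings and one unit-norm CAV per concept.
embeddings = rng.normal(0, 1, (n, d))
cavs = rng.normal(0, 1, (n_concepts, d))
cavs /= np.linalg.norm(cavs, axis=1, keepdims=True)

# Concept bottleneck: embeddings -> concept scores -> linear classifier.
concept_scores = embeddings @ cavs.T             # (n, n_concepts)
W = rng.normal(0, 0.1, (n_concepts, n_classes))  # the only trained weights
logits = concept_scores @ W                      # (n, n_classes)

# Each logit is a sum of per-concept terms, which is what makes the
# top-k concept contributions directly readable from the model.
contributions = concept_scores[:, :, None] * W[None, :, :]  # (n, n_concepts, n_classes)
print(logits.shape)  # (32, 3)
```

Ranking `contributions[i, :, y]` for an image i and its correct label y recovers explanations of the form shown in the figure.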
If you find this work helpful, you can cite our paper as follows:
@article{gong2025concepts,
title={Concepts from Neurons: Building Interpretable Medical Image Diagnostic Models by Dissecting Opaque Neural Networks},
author={Gong, Shizhan and Wang, Huayu and Zhang, Xiaofan and Dou, Qi},
journal={Information Processing in Medical Imaging},
year={2025}
}
For any questions, please contact szgong22@cse.cuhk.edu.hk

