# Enhancing Efficiency and Performance in Deepfake Audio Detection through Neuron-level dropin & Neuroplasticity Mechanisms

This repository contains the official implementation of all experiments described in *Enhancing Efficiency and Performance in Deepfake Audio Detection through Neuron-level dropin & Neuroplasticity Mechanisms*; more details can be found in the paper.
## Repository Structure

```
Dropin/
├── README.md
├── 0. Dataset/
│   ├── ASVspoof2019LA.py      # Dataset script for ASVspoof 2019 LA dataset
│   ├── ASVspoof2019PA.py      # Dataset script for ASVspoof 2019 PA dataset
│   └── FoR.py                 # Dataset script for FoR dataset
├── 1. ResNet/
│   ├── 1.Baseline+Dropin frozen+Dropin unfrozen/
│   │   ├── main.py            # Training script for all ResNet experiments
│   │   └── resnet.py          # ResNet18 model with DropIn support
│   └── 2.Plasticity/
│       ├── main.py            # Plasticity analysis experiments
│       └── resnet.py          # ResNet model for plasticity tests
├── 2. GRNN/
│   ├── 1. Baseline+Dropin frozen+Dropin unfrozen/
│   │   ├── main.py            # Main training script
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   └── lc_grnn.py     # LC-GRNN model with DropIn support
│   │   └── utils/
│   │       ├── __init__.py
│   │       ├── dataset.py     # Dataset script used, ASVspoof 2019 LA by default
│   │       └── training.py    # Training utilities
│   └── 2. Plasticity/
│       ├── main.py            # Plasticity experiments
│       ├── lc_grnn.py         # GRNN model
│       ├── dataset.py         # Dataset script
│       └── training.py        # Training functions
├── 3. Wav2Vec 2.0/
│   ├── 1. Baseline/
│   │   ├── main.py            # Baseline Wav2Vec 2.0 training
│   │   └── dataset.py         # Dataset script
│   ├── 2. Dropin frozen+Dropin unfrozen/
│   │   ├── main.py            # DropIn training script
│   │   ├── dropin.py          # DropIn Wav2Vec 2.0 model
│   │   └── dataset.py         # Dataset script
│   ├── 3. Plasticity/
│   │   ├── main.py            # Plasticity experiments
│   │   ├── model.py           # Custom Wav2Vec 2.0 model
│   │   ├── dropin.py          # DropIn implementation
│   │   └── dataset.py         # Dataset script
│   └── 4. LoRA/
│       ├── main.py            # LoRA baseline comparison
│       └── dataset.py         # Dataset script
└── 4. Wav2Vec 2.0 Small/
    ├── 1. Baseline/
    │   ├── main.py            # Baseline training
    │   └── dataset.py         # Dataset script
    ├── 2. Dropin frozen+Dropin unfrozen/
    │   ├── main.py            # DropIn training
    │   ├── dropin.py          # DropIn model
    │   └── dataset.py         # Dataset script
    ├── 3. Plasticity/
    │   ├── main.py            # Plasticity experiments
    │   └── dataset.py         # Dataset script
    └── 4. LoRA/
        ├── main.py            # LoRA comparison
        └── dataset.py         # Dataset script
```
## Installation

Install the dependencies (version specifiers are quoted so the shell does not interpret `>` as a redirect):

```bash
pip install "torch>=1.10.0"
pip install "torchaudio>=0.10.0"
pip install "transformers>=4.20.0"
pip install "peft>=0.4.0"
pip install "scikit-learn>=1.0.0"
pip install "numpy>=1.21.0"
pip install "pandas>=1.3.0"
pip install "matplotlib>=3.4.0"
pip install "seaborn>=0.11.0"
pip install "tqdm>=4.62.0"
pip install "pynvml>=11.0.0"
```
## Requirements

- GPU with CUDA support (recommended: NVIDIA GPU with ≥16 GB VRAM)

The minimum GPU VRAM required for each model is as follows:

| Model | Minimum GPU VRAM |
|---|---|
| ResNet | 1 GB |
| GRNN | 8 GB |
| Wav2Vec 2.0 | 12 GB |
| Wav2Vec 2.0 Small | 10 GB |
## Datasets

All experiments are conducted on the ASVspoof 2019 LA (Logical Access), ASVspoof 2019 PA (Physical Access), and FoR (Fake or Real) datasets for audio deepfake detection. To set up a dataset:

- Download and extract the dataset from its official website
- Update the `data_root` path in the respective `main.py` files
## Running an Experiment

To run a specific experiment:

1. `cd` into the folder of the chosen model and experiment.
2. Copy the corresponding dataset script from `0. Dataset` into `dataset.py` and/or the corresponding function inside `main.py`.
3. Set the parameters in `main.py` to match your requirements; these may include the dataset root, learning rate, number of epochs, etc.
4. Run `python main.py`.
## Evaluation Metrics

- **Accuracy**: Classification accuracy
- **EER (Equal Error Rate)**: Standard metric for spoofing detection and speaker verification systems
- **Confusion Matrix**: Detailed per-class classification performance
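For reference, the EER is the operating point at which the false-acceptance rate equals the false-rejection rate. A minimal NumPy sketch (illustrative only, not the repository's own implementation; label convention assumed: 1 = bonafide, 0 = spoof, higher score = more bonafide):

```python
import numpy as np

def compute_eer(labels, scores):
    """Return the Equal Error Rate for a binary detector.

    labels: 1 for bonafide, 0 for spoof; scores: higher = more bonafide.
    """
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    thresholds = np.unique(scores)
    # False-acceptance rate: spoof scored at or above the threshold.
    far = np.array([(scores[labels == 0] >= t).mean() for t in thresholds])
    # False-rejection rate: bonafide scored below the threshold.
    frr = np.array([(scores[labels == 1] < t).mean() for t in thresholds])
    # EER is where the two curves cross (averaged at the closest threshold).
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2
```

With perfectly separated scores the two error curves meet at zero, so `compute_eer([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])` yields an EER of 0.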
## Results

Results are automatically saved to the `results/` directory in each experiment folder, including:

- Training/validation loss curves
- EER progression over epochs
- Confusion matrices
- Model checkpoints
- Timing and memory usage statistics
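The memory statistics rely on `pynvml` (listed in the dependencies). A hedged sketch of how GPU memory can be sampled with NVML — `gpu_memory_mb` is a hypothetical helper for illustration, not a function from this repository:

```python
def gpu_memory_mb(device_index=0):
    """Return (used_mb, total_mb) for one GPU, or None if NVML is unavailable.

    Hypothetical helper: requires an NVIDIA driver and the pynvml package;
    falls back to None on machines without either.
    """
    try:
        import pynvml
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        info = pynvml.nvmlDeviceGetMemoryInfo(handle)  # .used/.total in bytes
        pynvml.nvmlShutdown()
        return info.used / 2**20, info.total / 2**20
    except Exception:
        return None
```

Calling such a helper once per epoch and appending the result to the experiment's statistics is enough to reproduce the kind of memory log saved under `results/`.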