Comprehensive benchmarking of 10 state-of-the-art SSL algorithms on ELPV solar panel defect detection
Dataset • Algorithms • Quick Start • Results
This repository provides a comprehensive benchmark of 10 semi-supervised learning (SSL) algorithms on the ELPV (Electroluminescence Photovoltaic) solar panel defect detection dataset. Our framework enables systematic evaluation of SSL techniques across 9 label-efficiency points (10 to 2099 labels), providing insights into algorithm performance under extreme label scarcity.
- 10 SSL Algorithms: 6 consistency-based methods + 4 imbalanced-aware techniques
- 9-Point Label Efficiency Analysis: From 10 labels (0.48%) to 2099 labels (fully supervised)
- Automated Benchmarking: Single-command execution for comprehensive experiments
- GPU/CPU Auto-Detection: Seamless execution on available hardware
- Professional Logging: Detailed tracking of experiments, metrics, and checkpoints
- Industrial Application: Real-world solar panel quality inspection task
The Electroluminescence (EL) imaging dataset contains solar panel cell images captured under EL conditions to detect manufacturing defects and degradation patterns.
Dataset Statistics:
- Total Images: 2,624 solar panel cells
- Training Set: 2,099 images
- Test Set: 525 images
- Classes: 2 (Functional vs. Defective)
- Class Distribution: 68.7% Functional, 31.3% Defective
- Image Size: 96×96 pixels (grayscale)
- Task: Binary classification for quality inspection
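For reference, a minimal loading sketch using the elpv-dataset package. The import path below is an assumption and may differ between package versions (the original repository exposes the helper as elpv_reader.load_dataset); check the installed package.

```python
# Minimal sketch: load ELPV and binarize the per-cell defect probabilities
# into the two classes used by this benchmark.
import numpy as np
from elpv_dataset import load_dataset  # assumption: may be elpv_reader in some versions

images, probas, types = load_dataset()    # cell images, defect probabilities, cell types
labels = (probas > 0.5).astype(np.int64)  # 0 = functional, 1 = defective
print(images.shape, f"defective fraction: {labels.mean():.3f}")  # expect roughly 0.31
```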
Why SSL for Solar Panels?
Manual labeling of solar panel defects requires expert knowledge and is expensive. SSL techniques can leverage the abundant unlabeled data from manufacturing lines, reducing annotation costs while maintaining high accuracy for automated quality control.
Our benchmark evaluates 10 state-of-the-art semi-supervised learning algorithms:
- FixMatch (Sohn et al., NeurIPS 2020)
  - Combines consistency regularization with pseudo-labeling
  - Uses weak-to-strong augmentation with confidence thresholding (a minimal sketch follows this list)
  - Standard: T=0.5, p_cutoff=0.95
- FlexMatch (Zhang et al., NeurIPS 2021)
  - Adaptive thresholds for pseudo-labels based on per-class learning status
  - Curriculum pseudo-labeling with per-class flexibility
  - Features: dynamic threshold warmup
- FreeMatch (Wang et al., ICLR 2023)
  - Self-adaptive confidence thresholding without manual tuning
  - Entropy minimization for better decision boundaries
  - Key params: ema_p=0.999, ent_loss_ratio=0.001
- SoftMatch (Chen et al., ICLR 2023)
  - Weights soft pseudo-labels with a truncated-Gaussian confidence function
  - Distribution alignment for label efficiency
  - Features: uniform-prior distribution alignment
- UDA (Unsupervised Data Augmentation) (Xie et al., NeurIPS 2020)
  - Consistency training with strong augmentations
  - Training Signal Annealing (TSA) for balanced learning
  - Config: T=0.4, p_cutoff=0.8
- Mean Teacher (Tarvainen & Valpola, NeurIPS 2017)
  - Teacher-student framework with EMA weight updates
  - Consistency between student and teacher predictions
  - High unlabeled loss ratio: 50.0
- ABC (Auxiliary Balanced Classifier) (Lee et al., NeurIPS 2021)
  - Attaches an auxiliary balanced classifier to mitigate class-imbalance bias
  - Adjustable confidence threshold per class
  - Builds on FixMatch with bias calibration
- DARP (Distribution Aligning Refinery of Pseudo-label) (Kim et al., NeurIPS 2020)
  - Refines biased pseudo-labels via convex optimization
  - Re-balancing through distribution alignment
  - Features: 200-epoch warmup period
- DASO (Distribution-Aware Semantics-Oriented) (Oh et al., CVPR 2022)
  - Semantic alignment with queue-based memory
  - Distribution-aware pseudo-labeling
  - Queue length: 128 samples
- CReST (Class-Rebalancing Self-Training) (Wei et al., CVPR 2021)
  - Iterative self-training with progressive pseudo-labeling
  - Re-sampling strategy favoring minority-class pseudo-labels
  - Optional: progressive distribution alignment (CReST+)
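To make the weak-to-strong consistency idea concrete, here is a minimal sketch of a FixMatch-style unlabeled loss with hard pseudo-labels. It is illustrative only, not the USB implementation this benchmark actually runs; with hard_label=True the temperature T is typically not applied.

```python
import torch
import torch.nn.functional as F

def fixmatch_unsup_loss(logits_weak, logits_strong, p_cutoff=0.95):
    """Hard pseudo-labels from the weakly augmented view supervise the strong view."""
    probs = torch.softmax(logits_weak.detach(), dim=-1)   # no gradient through pseudo-labels
    max_p, pseudo = probs.max(dim=-1)                     # confidence and predicted class
    mask = (max_p >= p_cutoff).float()                    # keep only confident predictions
    per_sample = F.cross_entropy(logits_strong, pseudo, reduction='none')
    return (per_sample * mask).mean(), mask.mean()        # loss and the logged mask ratio
```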
Prerequisites:
- Python 3.8+
- CUDA 11.0+ (for GPU training)
- 8GB+ RAM (16GB recommended for larger experiments)
```bash
# Clone the repository
git clone https://github.com/YaqoobAnsari/SSL-Thermal-Benchmarking.git
cd SSL-Thermal-Benchmarking/Semi-supervised-learning

# Create virtual environment
python -m venv ssl_env
source ssl_env/bin/activate  # On Windows: ssl_env\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install PyTorch with CUDA support (adjust for your CUDA version)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# Install the ELPV dataset package
pip install elpv-dataset

# Test GPU availability
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
```

Run a quick test with FixMatch (500 iterations):
```bash
cd Semi-supervised-learning/scripts
python ssl_benchmark_elpv.py --algorithm fixmatch --labels 10 --test
```

Run FixMatch across 4 label amounts with auto-scaled iterations:

```bash
python ssl_benchmark_elpv.py -a fixmatch -l 10,400,1600,2099 -y
```

Estimated time: 3-3.5 hours on an RTX 3060.
Run the comprehensive label-efficiency analysis with FixMatch:

```bash
python ssl_benchmark_elpv.py -a fixmatch -l 10,25,50,100,200,400,800,1600,2099 -y
```

Estimated time: 7-8 hours (well suited to an overnight run).

This generates a complete label-efficiency curve from extreme scarcity (10 labels = 0.48%) to full supervision (2099 labels = 100%).
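Once such a sweep finishes, the per-run results.pkl files can be combined into that curve. A minimal sketch, assuming the run-directory naming shown under the output structure below:

```python
# Collect the best top-1 accuracy per label amount for one algorithm.
import pickle
from pathlib import Path

root = Path('saved_models/elpv_benchmark')
for n in [10, 25, 50, 100, 200, 400, 800, 1600, 2099]:
    run_dir = next(root.glob(f'fixmatch_elpv_{n}_*'), None)  # e.g. fixmatch_elpv_10_20k_0
    if run_dir is None:
        continue  # skip label amounts that have not been run yet
    with open(run_dir / 'results.pkl', 'rb') as f:
        results = pickle.load(f)
    print(f"{n:5d} labels -> {results['eval/top-1-acc']:.2f}% top-1")
```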
Run all 10 SSL algorithms across selected label amounts:

```bash
python ssl_benchmark_elpv.py -a all -l 10,400,1600 -y
```

Estimated time: 15-20 hours (recommended as a multi-day run).
| Labels | % of Data | Iterations | Est. Time | Epochs |
|---|---|---|---|---|
| 10 | 0.48% | 20,480 | ~45 min | ~200 |
| 25 | 1.19% | 20,480 | ~45 min | ~200 |
| 50 | 2.38% | 20,480 | ~45 min | ~200 |
| 100 | 4.76% | 10,240 | ~25 min | ~100 |
| 200 | 9.53% | 10,240 | ~25 min | ~100 |
| 400 | 19.06% | 10,240 | ~25 min | ~100 |
| 800 | 38.11% | 5,120 | ~15 min | ~50 |
| 1600 | 76.23% | 5,120 | ~15 min | ~50 |
| 2099 | 100.00% | 5,120 | ~15 min | ~50 |
Total time per algorithm: ~7-8 hours
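The iteration budget is auto-scaled from the label count. A hypothetical helper reproducing the schedule in the table above (the actual rule lives in ssl_benchmark_elpv.py and may differ):

```python
def auto_iterations(num_labels: int) -> int:
    """Iteration budget per the table: fewer labels get a longer training run."""
    if num_labels <= 50:
        return 20_480  # ~200 epochs under extreme scarcity
    if num_labels <= 400:
        return 10_240  # ~100 epochs
    return 5_120       # ~50 epochs once labels are plentiful
```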
```
python ssl_benchmark_elpv.py [OPTIONS]

Options:
  -a, --algorithm TEXT  Algorithm(s) to run: fixmatch, flexmatch, freematch,
                        softmatch, uda, meanteacher, abc, darp, daso, crest, or 'all'
  -l, --labels TEXT     Comma-separated label amounts (e.g., '10,400,1600')
                        or 'all' for [10,50,100,400,1600,2099]
  -i, --iterations INT  Custom training iterations (default: auto-scaled)
  -t, --test            Quick test mode (500 iterations only)
  -y, --yes             Skip confirmation prompt
```

```bash
# Single algorithm with custom iterations
python ssl_benchmark_elpv.py -a flexmatch -l 400 -i 50000
# Multiple algorithms
python ssl_benchmark_elpv.py -a fixmatch,flexmatch,freematch -l 10,400 -y
# Imbalanced-aware algorithms
python ssl_benchmark_elpv.py -a abc,darp,daso,crest -l 400 -y
# Full benchmark (all algorithms, all standard label amounts)
python ssl_benchmark_elpv.py -a all -l all -y
```

Output Structure:

```
Semi-supervised-learning/
├── config/elpv/benchmark/              # Generated YAML configs
│   ├── fixmatch_elpv_10_20k_0.yaml
│   ├── fixmatch_elpv_400_10k_0.yaml
│   └── ...
├── saved_models/elpv_benchmark/        # Trained models and metrics
│   ├── fixmatch_elpv_10_20k_0/
│   │   ├── latest_model.pth            # Latest checkpoint
│   │   ├── model_best.pth              # Best accuracy checkpoint
│   │   ├── log.txt                     # Training log
│   │   ├── results.pkl                 # Python dict with all metrics
│   │   └── tb_logs/                    # TensorBoard logs
│   └── ...
└── experiment_logs/elpv_benchmark/     # Experiment tracking
    ├── benchmark_tracker.json          # Master experiment tracker
    ├── fixmatch_elpv_10_20k_0.log      # Detailed execution log
    └── ...
```
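To inspect the master tracker without assuming its schema, you can simply pretty-print it:

```python
# Pretty-print the experiment tracker; no assumptions are made about its keys.
import json

with open('experiment_logs/elpv_benchmark/benchmark_tracker.json') as f:
    tracker = json.load(f)
print(json.dumps(tracker, indent=2))
```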
Real-time Console Output:

```
[2025-11-10 18:43:42] 50 iterations - train_loss: 0.6108, run_time: 0.14s/iter
[2025-11-10 18:43:53] 100 iterations - eval_acc: 65.14%
```
Check Best Accuracy:

```bash
# View training log
cat saved_models/elpv_benchmark/fixmatch_elpv_10_20k_0/log.txt | grep "eval_acc"

# Check experiment tracker
cat experiment_logs/elpv_benchmark/benchmark_tracker.json
```

TensorBoard Visualization:

```bash
tensorboard --logdir=saved_models/elpv_benchmark/fixmatch_elpv_10_20k_0/tb_logs
```

The results.pkl file contains:
- eval/top-1-acc: Best test accuracy
- eval/balanced_acc: Balanced accuracy (important for imbalanced data)
- train/sup_loss: Supervised loss
- train/unsup_loss: Unsupervised loss
- train/total_loss: Combined loss
- train/mask_ratio: Percentage of pseudo-labels above the confidence threshold
Load Results in Python:

```python
import pickle

with open('saved_models/elpv_benchmark/fixmatch_elpv_10_20k_0/results.pkl', 'rb') as f:
    results = pickle.load(f)

print(f"Best Accuracy: {results['eval/top-1-acc']:.2f}%")
print(f"Balanced Accuracy: {results['eval/balanced_acc']:.2f}%")
```

Repository Structure:

```
SSL-Thermal-Benchmarking/
├── Semi-supervised-learning/
│   ├── train.py                        # Core training script (called by benchmark)
│   ├── scripts/
│   │   └── ssl_benchmark_elpv.py       # Master benchmark runner
│   ├── semilearn/
│   │   ├── algorithms/                 # SSL algorithm implementations
│   │   │   ├── fixmatch/
│   │   │   ├── flexmatch/
│   │   │   ├── freematch/
│   │   │   ├── softmatch/
│   │   │   ├── uda/
│   │   │   ├── meanteacher/
│   │   │   └── ...
│   │   ├── imb_algorithms/             # Imbalanced SSL algorithms
│   │   │   ├── abc/
│   │   │   ├── darp/
│   │   │   ├── daso/
│   │   │   └── crest/
│   │   ├── core/                       # Core training infrastructure
│   │   ├── datasets/                   # Dataset loaders
│   │   │   └── cv_datasets/
│   │   │       └── elpv.py             # ELPV dataset implementation
│   │   └── nets/                       # Network architectures
│   │       └── wrn.py                  # WideResNet backbone
│   ├── config/                         # Configuration files
│   └── data/                           # Dataset storage
└── README.md                           # This file
```
Modify the network in the config or in the benchmark script:

```python
# In ssl_benchmark_elpv.py, update _get_base_config():
config['net'] = 'resnet18'  # Options: wrn_28_2, wrn_28_8, resnet18, resnet50
```

Edit algorithm parameters in ssl_benchmark_elpv.py:
```python
SSL_ALGORITHMS = {
    'fixmatch': {
        'hard_label': True,
        'T': 0.5,          # Temperature for sharpening
        'p_cutoff': 0.95,  # Confidence threshold
    },
}
```

For multi-GPU setups, modify the config:
```python
config['gpu'] = '0,1,2,3'  # Use 4 GPUs
```

Training automatically saves checkpoints. To resume:

```bash
# The framework auto-detects and loads 'latest_model.pth'
python train.py --c config/elpv/benchmark/fixmatch_elpv_400_10k_0.yaml
```

Issue: DataLoader errors on Windows
Solution: The benchmark script automatically sets num_workers=0 for Windows compatibility. If using custom configs, ensure:
```yaml
num_workers: 0  # Critical for Windows
```

Issue: CUDA out of memory
Solution: Reduce the batch size in ssl_benchmark_elpv.py:

```python
# _get_base_config() method
batch_size = 8  # Reduce from 16 or 32
```

Solution: Enable mixed-precision training:

```python
config['amp'] = True  # Automatic Mixed Precision
```

Issue: Slow training
Check GPU utilization:

```bash
nvidia-smi -l 1  # Monitor GPU usage
```

Increase the batch size if the GPU is underutilized:

```python
batch_size = 64  # If you have 12GB+ VRAM
```

Issue: Low accuracy with very few labels
This is expected with extreme label scarcity (10-50 labels). Consider:
- Increasing training iterations
- Adjusting confidence threshold (p_cutoff)
- Trying different augmentation strengths
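For example, a lower FixMatch threshold in SSL_ALGORITHMS admits more pseudo-labels at small label counts (the value below is illustrative, not a tuned recommendation):

```python
SSL_ALGORITHMS = {
    'fixmatch': {
        'hard_label': True,
        'T': 0.5,
        'p_cutoff': 0.85,  # illustrative: below the 0.95 default, so more pseudo-labels pass
    },
}
```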
If you use this benchmark in your research, please cite:
```bibtex
@misc{elpv_ssl_benchmark2025,
  title={Semi-Supervised Learning Benchmark for Solar Panel Defect Detection},
  author={Ansari, Yaqoob},
  year={2025},
  howpublished={\url{https://github.com/YaqoobAnsari/SSL-Thermal-Benchmarking}}
}
```

Corresponding Author: Yaqoob Ansari (yansari@student.unimelb.edu.au)
Original ELPV Dataset:
```bibtex
@article{buerhop2018reliability,
  title={A benchmark dataset for defect classification in photovoltaic modules},
  author={Buerhop-Lutz, Claudia and Deitsch, Sergiu and Maier, Andreas and others},
  journal={Solar Energy},
  volume={161},
  pages={87--94},
  year={2018}
}
```

USB Framework (Base Implementation):
```bibtex
@article{wang2022usb,
  title={USB: A Unified Semi-supervised Learning Benchmark for Classification},
  author={Wang, Yidong and Chen, Hao and Heng, Quan and others},
  journal={Neural Information Processing Systems (NeurIPS)},
  year={2022}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
Original USB framework: Copyright (c) Microsoft Corporation. Licensed under the MIT License.
- USB Framework: Built on the Unified Semi-supervised Learning Benchmark by Microsoft Research
- ELPV Dataset: From the ELPV-Dataset repository
- SSL Community: For the pioneering work on FixMatch, FlexMatch, FreeMatch, SoftMatch, and other algorithms
Ready to benchmark? Start with the 9-point overnight run:
```bash
python ssl_benchmark_elpv.py -a fixmatch -l 10,25,50,100,200,400,800,1600,2099 -y
```