Detect and defend against data poisoning attacks on machine learning models.
Protect your ML training pipelines from malicious data injection and model manipulation.
- Multi-Vector Detection: Detect backdoor, label-flip, gradient, feature, and data-injection poisoning
- Training Data Analysis: Analyze datasets for suspicious samples
- Defense Mechanisms: Apply multiple defense strategies
- Risk Scoring: Calculate poisoning risk scores
- Real-time Protection: Fast detection for production use
Install from source:

```bash
git clone https://github.com/hallucinaut/modelpoison.git
cd modelpoison
go build -o modelpoison ./cmd/modelpoison
sudo mv modelpoison /usr/local/bin/
```

Or install directly with Go:

```bash
go install github.com/hallucinaut/modelpoison/cmd/modelpoison@latest
```

Basic CLI usage:

```bash
# Detect poisoning in training data
modelpoison detect training_data.csv

# Analyze security
modelpoison analyze

# Defend model against poisoning
modelpoison defend training_data.csv

# Get recommendations
modelpoison recommend
```

Use the packages directly from Go:

```go
package main

import (
	"fmt"

	"github.com/hallucinaut/modelpoison/pkg/defend"
	"github.com/hallucinaut/modelpoison/pkg/detect"
)

func main() {
	// Training samples to analyze (loading elided; a []detect.Sample
	// slice is assumed here)
	var samples []detect.Sample

	// Create detector
	detector := detect.NewDetector()

	// Detect poisoning
	result := detector.Detect(samples)
	fmt.Printf("Poisoned samples: %d\n", result.PoisonedCount)
	fmt.Printf("Risk Score: %.0f%%\n", result.RiskScore*100)

	// Apply defense
	defender := defend.NewDefender()
	defense := defender.Defend(result.RiskScore, "Data Cleaning")
	fmt.Printf("Defense Success: %v\n", defense.Success)
	fmt.Printf("Risk Reduction: %.0f%%\n", defense.RiskReduction*100)
}
```

Backdoor attacks inject malicious triggers into training samples:
- Visual patterns in images
- Specific words in text
- Trigger sequences in time series
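One cheap screen for stamp-style image triggers is to look for patches that are both saturated and nearly constant. The sketch below is illustrative only; the function names and thresholds are not part of the modelpoison API:

```go
package main

import "fmt"

// patchStats returns the mean and variance of a pixel patch.
func patchStats(patch []float64) (mean, variance float64) {
	for _, v := range patch {
		mean += v
	}
	mean /= float64(len(patch))
	for _, v := range patch {
		d := v - mean
		variance += d * d
	}
	variance /= float64(len(patch))
	return mean, variance
}

// looksLikeStampTrigger flags a patch that is both saturated and
// near-constant, a common signature of stamp-style backdoor triggers.
func looksLikeStampTrigger(patch []float64) bool {
	mean, variance := patchStats(patch)
	return mean > 0.95 && variance < 1e-4 // illustrative thresholds
}

func main() {
	clean := []float64{0.21, 0.35, 0.28, 0.31} // natural pixel values
	stamped := []float64{1.0, 1.0, 1.0, 1.0}   // saturated white stamp
	fmt.Println(looksLikeStampTrigger(clean))   // false
	fmt.Println(looksLikeStampTrigger(stamped)) // true
}
```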
Label-flipping attacks corrupt training labels:
- Random label noise
- Targeted label changes
- Consistent mislabeling
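Flipped labels are often visible as local disagreement: a sample whose label conflicts with most of its nearest neighbors is suspect. A minimal sketch of this idea, not the library's implementation:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

type sample struct {
	features []float64
	label    int
}

// dist is the Euclidean distance between two feature vectors.
func dist(a, b []float64) float64 {
	var s float64
	for i := range a {
		d := a[i] - b[i]
		s += d * d
	}
	return math.Sqrt(s)
}

// suspectFlips flags samples whose label disagrees with the majority
// label among their k nearest neighbors.
func suspectFlips(data []sample, k int) []int {
	var flagged []int
	for i, s := range data {
		type neighbor struct {
			d     float64
			label int
		}
		var nbs []neighbor
		for j, t := range data {
			if j != i {
				nbs = append(nbs, neighbor{dist(s.features, t.features), t.label})
			}
		}
		sort.Slice(nbs, func(a, b int) bool { return nbs[a].d < nbs[b].d })
		agree := 0
		for _, n := range nbs[:k] {
			if n.label == s.label {
				agree++
			}
		}
		if agree*2 < k { // sample's label is in the minority locally
			flagged = append(flagged, i)
		}
	}
	return flagged
}

func main() {
	data := []sample{
		{[]float64{0.10, 0.10}, 0},
		{[]float64{0.20, 0.10}, 0},
		{[]float64{0.12, 0.18}, 0},
		{[]float64{0.15, 0.15}, 1}, // likely a flipped label
		{[]float64{0.90, 0.90}, 1},
		{[]float64{0.80, 0.95}, 1},
		{[]float64{0.85, 0.80}, 1},
	}
	fmt.Println(suspectFlips(data, 3)) // [3]
}
```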
Gradient-poisoning attacks manipulate gradients during training:
- Byzantine attacks
- Coordinate poisoning
- Gradient compression attacks
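The usual countermeasure is robust aggregation. As an illustration (not the library's API), a coordinate-wise median tolerates a minority of Byzantine workers, whereas a plain mean can be dragged arbitrarily far by a single extreme gradient:

```go
package main

import (
	"fmt"
	"sort"
)

// coordinateMedian aggregates worker gradients by taking the median
// of each coordinate, so a minority of extreme (Byzantine) gradients
// cannot dominate the result the way they would with a mean.
func coordinateMedian(grads [][]float64) []float64 {
	dim := len(grads[0])
	agg := make([]float64, dim)
	col := make([]float64, len(grads))
	for d := 0; d < dim; d++ {
		for i, g := range grads {
			col[i] = g[d]
		}
		sort.Float64s(col)
		mid := len(col) / 2
		if len(col)%2 == 1 {
			agg[d] = col[mid]
		} else {
			agg[d] = (col[mid-1] + col[mid]) / 2
		}
	}
	return agg
}

func main() {
	grads := [][]float64{
		{0.10, -0.20},
		{0.12, -0.18},
		{0.11, -0.22},
		{50.0, 40.0}, // Byzantine worker sends an extreme gradient
	}
	// Prints [0.115 -0.19]; the attacker's gradient is neutralized.
	fmt.Println(coordinateMedian(grads))
}
```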
Feature-poisoning attacks corrupt input features:
- Feature manipulation
- Statistical outliers
- Distribution shifts
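Statistical outliers are best screened with robust statistics, since poisoned samples inflate a plain mean and standard deviation. A minimal sketch using the median absolute deviation with the conventional 3.5 modified-z cutoff (illustrative, not the library's detector):

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// median returns the median of a slice without modifying it.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	mid := len(s) / 2
	if len(s)%2 == 1 {
		return s[mid]
	}
	return (s[mid-1] + s[mid]) / 2
}

// madOutliers flags samples whose modified z-score, based on the
// median and median absolute deviation (which poisoned samples cannot
// easily skew), exceeds the threshold in any feature.
func madOutliers(features [][]float64, threshold float64) []int {
	dim := len(features[0])
	flaggedSet := map[int]bool{}
	col := make([]float64, len(features))
	for d := 0; d < dim; d++ {
		for i, f := range features {
			col[i] = f[d]
		}
		med := median(col)
		devs := make([]float64, len(col))
		for i, v := range col {
			devs[i] = math.Abs(v - med)
		}
		mad := median(devs)
		if mad == 0 {
			continue
		}
		for i, v := range col {
			if 0.6745*math.Abs(v-med)/mad > threshold {
				flaggedSet[i] = true
			}
		}
	}
	var flagged []int
	for i := range features {
		if flaggedSet[i] {
			flagged = append(flagged, i)
		}
	}
	return flagged
}

func main() {
	features := [][]float64{
		{1.00, 2.00}, {1.10, 2.10}, {0.90, 1.90}, {1.05, 2.05},
		{9.00, 2.00}, // first feature manipulated far outside the distribution
	}
	fmt.Println(madOutliers(features, 3.5)) // [4]
}
```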
Data-injection attacks insert malicious data into the training set:
- Malicious samples
- Distribution poisoning
- Concept drift attacks
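Injected samples often betray themselves by shifting batch statistics away from a trusted reference. A cheap screen, sketched below under the assumption that a clean reference set is available, compares an incoming batch's mean against the reference distribution:

```go
package main

import (
	"fmt"
	"math"
)

// meanStd returns the mean and standard deviation of a sample.
func meanStd(xs []float64) (mean, std float64) {
	for _, v := range xs {
		mean += v
	}
	mean /= float64(len(xs))
	for _, v := range xs {
		d := v - mean
		std += d * d
	}
	std = math.Sqrt(std / float64(len(xs)))
	return
}

// shifted reports whether an incoming batch's mean has drifted more
// than tol reference standard deviations from the reference mean,
// a cheap screen for injected samples that skew the distribution.
func shifted(reference, batch []float64, tol float64) bool {
	refMean, refStd := meanStd(reference)
	batchMean, _ := meanStd(batch)
	return math.Abs(batchMean-refMean) > tol*refStd
}

func main() {
	reference := []float64{0.90, 1.00, 1.10, 1.00, 0.95, 1.05}
	clean := []float64{1.00, 0.98, 1.02}
	poisoned := []float64{1.00, 5.00, 4.50} // injected extreme values
	fmt.Println(shifted(reference, clean, 2))    // false
	fmt.Println(shifted(reference, poisoned, 2)) // true
}
```

Available defense strategies compare as follows: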
| Strategy | Effectiveness | Overhead | Use Case |
|---|---|---|---|
| Adversarial Training | 85% | 40% | High-security training |
| Ensemble Defense | 90% | 50% | Critical applications |
| Robust Aggregation | 80% | 15% | Distributed training |
| Data Cleaning | 75% | 20% | General use |
| Input Filtering | 70% | 10% | Real-time protection |
| Outlier Detection | 65% | 12% | Quick defense |
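One way to act on this table is to escalate the defense with the measured risk. In the sketch below the strategy strings are the table's names and the `Defend` call follows the quick-start example; the `chooseStrategy` helper is hypothetical:

```go
// chooseStrategy (hypothetical) escalates the defense with risk:
// cheap filtering at low risk, heavier defenses as risk grows.
func chooseStrategy(riskScore float64) string {
	switch {
	case riskScore < 0.30:
		return "Input Filtering"
	case riskScore < 0.50:
		return "Data Cleaning"
	case riskScore < 0.70:
		return "Robust Aggregation"
	default:
		return "Ensemble Defense"
	}
}
```

With the quick-start objects in scope: `defense := defender.Defend(result.RiskScore, chooseStrategy(result.RiskScore))`.

Detected risk maps to levels and recommended actions: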
| Score | Level | Action |
|---|---|---|
| 0-10% | MINIMAL | Monitor |
| 10-30% | LOW | Review data |
| 30-50% | MEDIUM | Clean data |
| 50-70% | HIGH | Investigate |
| 70-100% | CRITICAL | Block training |
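The bands translate directly into code. A hypothetical helper mirroring the table rows (score in [0, 1]):

```go
// riskAction (hypothetical) returns the level and recommended action
// for a risk score, following the table above.
func riskAction(score float64) (level, action string) {
	switch {
	case score < 0.10:
		return "MINIMAL", "Monitor"
	case score < 0.30:
		return "LOW", "Review data"
	case score < 0.50:
		return "MEDIUM", "Clean data"
	case score < 0.70:
		return "HIGH", "Investigate"
	default:
		return "CRITICAL", "Block training"
	}
}
```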
Run the test suite:

```bash
# Run all tests
go test ./...

# Run with coverage
go test -cover ./...

# Run specific test
go test -v ./pkg/detect -run TestDetectPoisoning
```

Example output of `modelpoison detect`:

```
Detecting poisoning in: training_data.csv

=== Model Poisoning Detection Report ===
Total Samples: 1000
Poisoned Samples: 15
Risk Score: 15%
Method: ensemble_detection
Detected Poisoned Samples:
  [1] backdoor
      ID: sample_001
      Type: backdoor
      Score: 78%
      Description: Potential backdoor trigger detected
      Evidence: Unusual feature pattern

⚠️ POISONING DETECTED
Recommendation: Clean training data before training
```
Use modelpoison for:

- ML Pipeline Security: Protect training data from poisoning
- Model Integrity: Ensure trained models are clean
- Data Quality Assurance: Validate training datasets
- AI Supply Chain Security: Secure ML data pipelines
- Compliance: Meet AI security requirements
Best practices:

- Validate training data before training (see the gating sketch after this list)
- Monitor for poisoning during training
- Use multiple defenses for critical systems
- Test models for backdoor behavior
- Regular security audits of ML pipelines
- Implement data versioning for reproducibility
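For the first practice, a pre-training gate can refuse to start training once the risk enters the CRITICAL band. A sketch built on the quick-start API; the `detect.Detector` and `detect.Sample` types are assumed from that example, and `gateTraining` is hypothetical:

```go
// gateTraining (hypothetical) blocks training when the poisoning risk
// reaches the CRITICAL band (>= 70%) from the risk table above.
func gateTraining(detector *detect.Detector, samples []detect.Sample) error {
	result := detector.Detect(samples)
	if result.RiskScore >= 0.70 {
		return fmt.Errorf("training blocked: poisoning risk %.0f%%", result.RiskScore*100)
	}
	return nil
}
```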
Project layout:

```
modelpoison/
├── cmd/
│   └── modelpoison/
│       └── main.go            # CLI entry point
├── pkg/
│   ├── detect/
│   │   ├── detect.go          # Detection logic
│   │   └── detect_test.go     # Unit tests
│   └── defend/
│       ├── defend.go          # Defense mechanisms
│       └── defend_test.go     # Unit tests
└── README.md
```
MIT License
- Machine learning security research community
- Adversarial machine learning researchers
- AI safety practitioners
Built with Go by hallucinaut