Machine-verified integrity for every fixed model artifact — weights, vocabularies, quantization tables, and tokenizers.
Model Asset Guard provides formal verification and runtime validation for machine learning model artifacts, ensuring their integrity, correctness, and reliability throughout the model lifecycle.
- Overview
- Features
- Architecture
- Quick Start
- Installation
- Usage
- API Reference
- Development
- Contributing
- License
Model Asset Guard addresses critical challenges in machine learning model deployment by providing:
- Formal Verification: Lean 4 specifications for mathematical correctness
- Runtime Validation: High-performance Rust sidecar for integrity checks
- Cross-Platform Support: Python and Node.js bindings for seamless integration
- Production Ready: Zero unsafe code, comprehensive testing, and CI/CD integration
Modern ML pipelines face significant challenges with model artifact integrity:
- Bit Corruption: Silent failures from storage or transmission errors
- Quantization Errors: Unbounded error accumulation in quantized models
- Tokenizer Inconsistencies: Non-deterministic encoding/decoding behavior
- Version Drift: Undetected changes in model artifacts
Model Asset Guard provides a comprehensive solution through:
- SHA-256 Integrity Validation with formal proofs
- Quantization Error Bounds with mathematical guarantees (an example bound is sketched after this list)
- Tokenizer Determinism Verification with fuzz testing
- High-Performance Runtime Validation via Rust sidecar
- Seamless Integration with existing ML frameworks
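The exact statements live in the Lean 4 specifications; as a sketch of the kind of guarantee involved (not necessarily the precise theorem proved there), symmetric int8 quantization of a weight $w$ with scale $s = \max_i |w_i| / 127$ keeps the elementwise round-off error within half a quantization step,

$$\bigl|\, w - s \cdot \operatorname{round}(w / s) \,\bigr| \;\le\; \frac{s}{2},$$

so a dot product over `fan_in` inputs of magnitude at most $\lVert x \rVert_\infty$ accumulates an error of at most $\mathrm{fan\_in} \cdot \lVert x \rVert_\infty \cdot s / 2$.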
| Feature | Description | Technology |
|---|---|---|
| Weight Integrity | SHA-256 validation with formal proofs | Lean 4 + Rust |
| Quantization Bounds | Mathematical error bounds for int8/FP16 | Lean 4 |
| Tokenizer Verification | Determinism and surjectivity proofs | Lean 4 |
| Runtime Validation | High-performance integrity checks | Rust |
| Framework Integration | HuggingFace, PyTorch, Node.js support | Python/Node.js |
- Startup Overhead: < 10ms for integrity checks
- Memory Footprint: < 1MB binary size
- Zero Heap Allocations: Deterministic memory usage
- 100% Bit-Flip Detection: Comprehensive corruption detection
```
┌─────────────────┐    ┌─────────────────┐    ┌──────────────────┐
│   Lean 4 Specs  │    │   Rust Sidecar  │    │ Language Bindings│
│                 │    │                 │    │                  │
│ • Formal Proofs │◄──►│ • High Perf     │◄──►│ • Python         │
│ • Math Bounds   │    │ • Zero Unsafe   │    │ • Node.js        │
│ • Verification  │    │ • FFI Interface │    │ • HuggingFace    │
└─────────────────┘    └─────────────────┘    └──────────────────┘
```
```
model-asset-guard/
├── src/                      # Source code
│   ├── lean/                 # Lean 4 specifications
│   │   ├── ModelAssetGuard/  # Core library
│   │   └── cli/              # Command-line tools
│   ├── rust/                 # Rust sidecar
│   │   └── guardd/           # High-performance library
│   └── python/               # Python source code
├── bindings/                 # Language bindings
│   ├── python/               # Python bindings
│   └── nodejs/               # Node.js bindings
├── tests/                    # Test suite
│   ├── unit/                 # Unit tests
│   ├── integration/          # Integration tests
│   ├── e2e/                  # End-to-end tests
│   └── performance/          # Performance tests
├── scripts/                  # Build and utility scripts
```
- Lean 4: `nightly-2024-01-15` or later
- Rust: `1.70+` with Cargo
- Python: `3.8+` (for bindings and testing)
- Node.js: `16.0+` (for Node.js bindings)
```bash
# Clone the repository
git clone https://github.com/SentinelOps-CI/model-asset-guard.git
cd model-asset-guard
# Build Lean specifications
lake build
# Build Rust sidecar
cargo build --release --manifest-path src/rust/guardd/Cargo.toml
# Install Python bindings (optional)
pip install -e bindings/python/
# Run verification tests
lake test
cargo test --manifest-path src/rust/guardd/Cargo.toml
```

```bash
# Verify a checkpoint file
lake exe verifyweights /path/to/checkpoint.bin
# Verify with expected digest
lake exe verifyweights /path/to/checkpoint.bin --digest abc123...
```

```bash
# Generate bounds for ONNX model
lake exe quantbound /path/to/model.onnx --output bounds.json
# Verify quantization error
lake exe quantbound /path/to/model.onnx --verify --tolerance 1e-6
```

```bash
# Test BPE tokenizer
lake exe tokenizertest /path/to/tokenizer.json --type bpe
# Fuzz test with 1M random strings
lake exe tokenizertest /path/to/tokenizer.json --fuzz 1000000
```

```python
from bindings.python.pytorch_guard import ModelAssetGuard, checked_load_pretrained
# Initialize guard
guard = ModelAssetGuard()
# Verify file integrity
verified = guard.verify_digest("model.bin", expected_digest)
# Load model with verification
model_info = guard.checked_load("model.bin", expected_digest)
# HuggingFace integration
from bindings.python.pytorch_guard import HuggingFaceGuard
hf_guard = HuggingFaceGuard(guard)
model, tokenizer, info = hf_guard.checked_load_pretrained(
"gpt2",
verify_quantization=True
)
```

```javascript
const { ModelAssetGuard } = require("./bindings/nodejs/node_guard.js");
// Initialize guard
const guard = new ModelAssetGuard();
// Verify model integrity
const verified = guard.verifyDigest("model.bin", expectedDigest);
// Load model with verification
const modelInfo = guard.checkedLoad("model.bin", expectedDigest);
```

```rust
use guardd::{checked_load, verify_digest};
// Verify file integrity
let verified = verify_digest("model.bin", &expected_digest)?;
// Load model with verification
let model_info = checked_load("model.bin", Some(&expected_digest))?;
```

`verify_digest` verifies the SHA-256 digest of a file.
Parameters:
- `path`: Path to the file to verify
- `expected_digest`: Expected SHA-256 digest (32 bytes)

Returns: `true` if the digest matches, `false` otherwise
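The expected digest is computed ahead of time with any standard SHA-256 implementation and pinned alongside the artifact. A minimal sketch using the `sha2` crate (chosen here for illustration; it is not necessarily a guardd dependency):

```rust
use sha2::{Digest, Sha256};
use std::{fs, io};

// Compute the reference SHA-256 digest of a model artifact once (e.g. at
// export time) so it can be pinned and checked later with verify_digest.
fn reference_digest(path: &str) -> io::Result<[u8; 32]> {
    let bytes = fs::read(path)?; // fine for moderate files; stream for very large artifacts
    Ok(Sha256::digest(&bytes).into()) // 32-byte SHA-256 digest
}

fn main() -> io::Result<()> {
    let digest = reference_digest("model.bin")?;
    // Lowercase hex form, convenient for storing next to the artifact.
    println!("{}", digest.iter().map(|b| format!("{b:02x}")).collect::<String>());
    Ok(())
}
```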
`checked_load` loads and validates a model file.
Parameters:
- `path`: Path to the model file
- `expected_digest`: Optional expected digest for validation
Returns: Model information including path, size, and digest
`verify_quantization_128_vectors(layer_name: &str, weights: &[f32], fan_in: u32, fan_out: u32, quant_type: &str) -> Result<QuantizationResult, GuarddError>`
Verifies quantization bounds for a layer.
Parameters:
- `layer_name`: Name of the layer
- `weights`: Weight matrix as a flat array
- `fan_in`: Input dimension
- `fan_out`: Output dimension
- `quant_type`: Quantization type (`"int8"`, `"fp16"`)
Returns: Quantization verification results
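A sketch of calling it from Rust with a toy 4×2 layer. It assumes the function is exported from `guardd` alongside `checked_load` and `verify_digest`; the layer name and the way results are reported here are illustrative only:

```rust
use guardd::{verify_quantization_128_vectors, GuarddError};

fn main() {
    // Toy layer: fan_in = 4, fan_out = 2, weights flattened to a single slice.
    let weights: Vec<f32> = vec![0.01, -0.02, 0.03, 0.04, -0.05, 0.06, 0.07, -0.08];

    match verify_quantization_128_vectors("encoder.layer0.dense", &weights, 4, 2, "int8") {
        Ok(_result) => println!("int8 quantization bounds hold for encoder.layer0.dense"),
        Err(GuarddError::QuantizationError(msg)) => eprintln!("bound violated: {msg}"),
        Err(other) => eprintln!("verification failed: {other:?}"),
    }
}
```

Failures from any of the fallible calls are reported through the `GuarddError` enum: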
```rust
#[derive(Debug)]
pub enum GuarddError {
    FileNotFound(String),
    InvalidDigest(String),
    QuantizationError(String),
    IoError(std::io::Error),
}
```

```bash
# Clone with submodules
git clone --recursive https://github.com/fraware/model-asset-guard.git
cd model-asset-guard
# Build all components
make build
# Run all tests
make test
# Run benchmarks
make benchmark
```

```bash
# Setup development environment
make setup-dev
# Run linting
make lint
# Run security audit
make security-audit
```
### Testing
```bash
# Run unit tests
lake test
cargo test --manifest-path src/rust/guardd/Cargo.toml
# Run integration tests
python tests/e2e/test_huggingface_integration.py
# Run performance tests
lake exe benchmarks
```

The project includes comprehensive CI/CD pipelines:
- Multi-platform Testing: Ubuntu, macOS, Windows
- Security Scanning: Cargo audit, unsafe code detection
- Performance Benchmarks: Automated performance regression testing
- Documentation Generation: Automated API documentation
We welcome contributions from the community! Please see our Contributing Guidelines for details.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Lean 4: Follow Lean 4 style guidelines
- Rust: Follow Rust coding standards, zero unsafe code
- Python: Follow PEP 8, type hints required
- Tests: Maintain >90% test coverage
- All new features must include unit tests
- Integration tests for cross-language functionality
- Performance benchmarks for critical paths
- Security tests for all external interfaces
This project is licensed under the MIT License - see the LICENSE file for details.
- Lean Community: For the formal verification framework
- Rust Community: For the high-performance systems programming language
- HuggingFace: For the transformers library integration
- Academic Contributors: For mathematical foundations and proofs
If you use Model Asset Guard in your research, please cite:
```bibtex
@software{model_asset_guard,
  title={Model Asset Guard: Machine-verified integrity for ML artifacts},
  author={Model Asset Guard Contributors},
  year={2025},
  url={https://github.com/fraware/model-asset-guard}
}
```

- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Security: Security Policy
Model Asset Guard - Ensuring the integrity of machine learning artifacts through formal verification and runtime validation.