- Fork the repository and clone your fork
- Install dependencies: `pip install -r requirements.txt`
- Tests run in Docker only (see below)
- Python: PEP 8, type hints in function signatures
- No unnecessary comments: comment only where logic is not self-evident
- Hard asserts: no silent recovery or defensive error handling
- No `.get()`, `getattr()`, or `hasattr()`: let missing keys and attributes raise `KeyError`/`AttributeError`
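A toy illustration of the hard-fail rule (the `config` dict here is invented for illustration):

```python
# Hypothetical config dict, for illustration only
config = {"hidden_size": 768}

# Good: a missing key fails loudly with KeyError at the access site
hidden_size = config["hidden_size"]

# Bad (avoided in this repo): config.get("num_layers", 12) would silently
# substitute a default and hide the incomplete config
print(hidden_size)
```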
- Standard library: `import xyz`
- Third-party: `from xyz import q`
- Local/repo: `from models.foo import Bar`

Within each group, `import x` lines come before `from x import y` lines.
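For example, a module's import block under these rules might look like the following (the third-party and local lines are illustrative and commented out so the snippet stays self-contained):

```python
# Standard library: import-x lines first, then from-x-import-y lines
import json
import os
from pathlib import Path

# Third-party group would follow, e.g.:
# import torch
# from torch import nn, optim

# Local/repo group comes last, e.g.:
# from models.foo import Bar
print(json.dumps({"ok": True}))
```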
Never import from the same package on separate lines:

```python
# Bad
from torch import nn
from torch import optim

# Good
from torch import nn, optim

# For many names, use parenthesized form
from transformers import (
    AutoConfig,
    AutoModel,
    AutoModelForMaskedLM,
    PreTrainedModel,
)
```

A new model uses the following layout:

```
fastplms/new_model/
    __init__.py
    modeling_new_model.py         # PreTrainedModel + PretrainedConfig
    get_new_model_weights.py      # Weight conversion from official checkpoint
    README.md                     # HuggingFace model card README
    LICENSE                       # Model license
testing/official/new_model.py     # Load official model for compliance testing
```
Your `modeling_*.py` should:

- Subclass `PreTrainedModel` and `EmbeddingMixin`
- Define a `PretrainedConfig` subclass with an `attn_backend` attribute
- Implement the `AttentionBackend` enum and backend resolution
- Implement `_embed(input_ids, attention_mask)` returning last hidden states
- Register in `config.json` via `auto_map`:
```json
{
  "auto_map": {
    "AutoConfig": "modeling_new_model.NewModelConfig",
    "AutoModelForMaskedLM": "modeling_new_model.NewModelForMaskedLM"
  }
}
```

Your `get_*_weights.py` should:
- Load the official checkpoint
- Remap parameter names to match your architecture
- Export `config.json`, `pytorch_model.bin`, and the modeling source files
- The output directory can be pushed to HuggingFace
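The remapping step can be sketched as follows. Every parameter name here is invented for illustration, and real code would move `torch` tensors rather than the string stand-ins used to keep the sketch dependency-free:

```python
# Hypothetical official -> fastplms parameter-name mapping (names are made up)
OFFICIAL_TO_LOCAL = {
    "encoder.layer.0.attn.query.weight": "layers.0.attn.q_proj.weight",
    "encoder.layer.0.attn.key.weight": "layers.0.attn.k_proj.weight",
}


def remap_state_dict(official_state_dict: dict) -> dict:
    """Rename official checkpoint parameters to match the local architecture."""
    remapped = {}
    for name, tensor in official_state_dict.items():
        # Indexing (not .get) so an unexpected key raises KeyError,
        # matching the repo's hard-fail style
        remapped[OFFICIAL_TO_LOCAL[name]] = tensor
    return remapped


official = {
    "encoder.layer.0.attn.query.weight": "W_q",
    "encoder.layer.0.attn.key.weight": "W_k",
}
print(sorted(remap_state_dict(official)))
```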
`testing/official/new_model.py` should expose:

```python
def load_official_model(reference_repo_id: str, device: torch.device, dtype: torch.dtype):
    # Load and wrap the official model
    # Return (wrapped_model, tokenizer) where wrapped_model has .logits and .hidden_states outputs
    ...
```

Add your model to `testing/conftest.py`:
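One way to satisfy the wrapper contract, sketched with a stdlib stand-in for the official model (everything here is hypothetical; a real implementation loads the checkpoint with its official library):

```python
from types import SimpleNamespace


class OfficialModelWrapper:
    """Hypothetical wrapper normalizing an official model's outputs."""

    def __init__(self, model):
        self.model = model

    def __call__(self, input_ids, attention_mask=None):
        logits, hidden_states = self.model(input_ids, attention_mask)
        # Expose the (.logits, .hidden_states) interface the tests expect
        return SimpleNamespace(logits=logits, hidden_states=hidden_states)


# Toy stand-in: an "official model" returning (logits, hidden_states)
toy_official = lambda ids, mask=None: ([0.1, 0.9], [[1.0, 2.0]])
wrapped = OfficialModelWrapper(toy_official)
out = wrapped([1, 2, 3])
print(out.logits, out.hidden_states)
```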
```python
# In MODEL_REGISTRY (for fast CI, pick the smallest checkpoint)
"new_model": {
    "fast_path": "Synthyra/NewModel-150M",
    "official_path": "org/official-model",
    "load_official": "testing.official.new_model",
    "model_type": "NewModel",
    "uses_tokenizer": True,
},

# In FULL_MODEL_REGISTRY (all checkpoints, with size_category)
"new_model_150m": {
    "fast_path": "Synthyra/NewModel-150M",
    "official_path": "org/official-model-150m",
    "load_official": "testing.official.new_model",
    "model_type": "NewModel",
    "uses_tokenizer": True,
    "size_category": "small",
},
```

Create `fastplms/new_model/README.md` with the HuggingFace model card content and `fastplms/new_model/LICENSE` with the model license.
Add entries for pushing your model's files to the Hub.
All tests must run in Docker. Never run tests natively on Windows (missing Triton, flash-attention, CUDA kernels).

```shell
# Build
docker build -t fastplms .

# Run your model's tests
docker run --gpus all fastplms python -m pytest /app/testing/ -k new_model -v

# Run all tests
docker run --gpus all fastplms python -m pytest /app/testing/ -v
```

Before submitting a PR for a new model, ensure:
- `test_automodel_loads` and `test_automodel_forward_pass` pass
- `test_backend_consistency` passes for all available backends
- `test_nan_stability` passes
- `test_batch_single_match` passes (tokenizer-mode models)
- `test_weight_compliance` passes (if compliance deps are available)
- `test_forward_compliance` passes (if compliance deps are available)
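`test_backend_consistency` exercises the backend resolution required of each `modeling_*.py`. A stdlib-only sketch of that resolution, where the enum members and fallback order are assumptions rather than the repo's actual values:

```python
from enum import Enum


class AttentionBackend(Enum):
    """Hypothetical backend enum; real member names may differ."""
    SDPA = "sdpa"
    FLASH_ATTN = "flash_attn"
    EAGER = "eager"


def resolve_backend(attn_backend: str, flash_available: bool) -> AttentionBackend:
    # Unknown names raise ValueError here, per the no-silent-recovery rule
    backend = AttentionBackend(attn_backend)
    if backend is AttentionBackend.FLASH_ATTN and not flash_available:
        # Fall back to SDPA when flash-attention kernels are absent
        backend = AttentionBackend.SDPA
    return backend


print(resolve_backend("flash_attn", flash_available=False).value)
```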
Found a bug or have a feature request? Open a GitHub Issue.