Colab-friendly BitNet distillation engine: collect knowledge-distillation (KD) traces from a teacher model, train a ternary Mini-BitNet student, and dry-run the memory footprint of a 7B model. Multi-provider teacher support plus Google Drive/S3 storage.
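A minimal sketch of the trace-collection step, assuming the engine captures top-k teacher logits per token and serializes them to Parquet (the `kd-traces` and `parquet` tags below suggest as much). The teacher name, `TOP_K`, `collect_trace`, and the column layout are illustrative, not the repo's actual API:

```python
# Hypothetical KD-trace collection: keep only top-k teacher logits per
# token and write them to Parquet. Requires pyarrow (or fastparquet).
import torch
import pandas as pd
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER = "gpt2"  # stand-in teacher; the repo targets larger multi-provider teachers
TOP_K = 16        # hypothetical trace width

tokenizer = AutoTokenizer.from_pretrained(TEACHER)
teacher = AutoModelForCausalLM.from_pretrained(TEACHER).eval()

def collect_trace(text: str) -> pd.DataFrame:
    """Run the teacher once and keep only the top-k logits per position."""
    ids = tokenizer(text, return_tensors="pt").input_ids[0]
    with torch.no_grad():
        logits = teacher(ids.unsqueeze(0)).logits[0]   # (seq_len, vocab_size)
    values, indices = logits.topk(TOP_K, dim=-1)       # sparse per-token trace
    return pd.DataFrame({
        "position": range(len(ids)),
        "token_id": ids.tolist(),
        "topk_ids": indices.tolist(),
        "topk_logits": values.tolist(),
    })

collect_trace("BitNet distillation demo").to_parquet("kd_trace.parquet")
```

Storing only the top-k logits keeps traces small enough for Drive/S3 while preserving most of the soft-label signal the student needs.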
Topics: transformers, pytorch, parquet, knowledge-distillation, model-compression, bitnet, colab-notebook, mixed-precision, ternary-quantization, large-language-model, efficient-llm, efficient-llm-inference, activation-quantization, kd-traces
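The `ternary-quantization` tag points at BitNet b1.58-style weights in {-1, 0, +1}. A minimal sketch of an absmean ternary quantizer with a straight-through estimator (STE), the standard way BitNet-style layers are trained; `BitLinear` and `absmean_ternary` are illustrative names, not the repo's:

```python
# b1.58-style ternary weight quantization with a straight-through
# estimator. The forward pass sees ternary weights; gradients flow
# to the latent full-precision weight unchanged.
import torch
import torch.nn as nn
import torch.nn.functional as F

def absmean_ternary(w: torch.Tensor, eps: float = 1e-5):
    """Quantize weights to {-1, 0, +1} with a per-tensor absmean scale."""
    scale = w.abs().mean().clamp(min=eps)
    w_q = (w / scale).round().clamp(-1, 1)
    return w_q, scale

class BitLinear(nn.Module):
    """Linear layer whose forward pass uses ternary weights.

    The STE trick: w + (w_q * scale - w).detach() equals the quantized
    weight in the forward pass but behaves like the identity in the
    backward pass, so the full-precision latent weight keeps training.
    """
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_q, scale = absmean_ternary(self.weight)
        w_ste = self.weight + (w_q * scale - self.weight).detach()
        return F.linear(x, w_ste)
```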
Updated Sep 5, 2025 - Python
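Finally, the "dry-run 7B memory" step from the description can be approximated with back-of-envelope arithmetic, without allocating anything. A sketch; the 2-bit packing figure is an assumption (ternary weights can in principle be packed down to ~1.58 bits per weight):

```python
# Estimate weight memory for a 7B-parameter model at several
# precisions. Pure arithmetic: nothing is loaded or allocated.
N_PARAMS = 7e9

def weight_gib(bits_per_weight: float) -> float:
    return N_PARAMS * bits_per_weight / 8 / 2**30

for name, bits in [("fp16", 16), ("int8", 8), ("ternary (2-bit packed)", 2)]:
    print(f"{name:>24}: {weight_gib(bits):6.2f} GiB")
# fp16 ~13.04 GiB, int8 ~6.52 GiB, 2-bit packed ~1.63 GiB
```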