Each `.jsonl` file contains prompt-response pairs, one JSON object per line.
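
For illustration, here is a minimal Python sketch of loading one of the files. The `prompt`/`response` field names and the file path are assumptions; check the actual files for the exact schema.

```python
# Minimal sketch: load one dataset file and inspect a pair.
# The path and the "prompt"/"response" field names are assumptions.
import json

with open("data/your_subcategory.jsonl") as f:  # hypothetical path
    pairs = [json.loads(line) for line in f if line.strip()]

print(len(pairs), "pairs loaded")
print(pairs[0]["prompt"], "->", pairs[0]["response"])
```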
If you're using this as a base for your own fine-tuning:
- Clone the repo
- Select or customize subcategories
- Feed the `.jsonl` files into your LoRA training pipeline (e.g., `peft`, Axolotl, QLoRA); see the training sketch after this list
- Validate results with your model using structured tests or interactive review; see the spot-check sketch after this list
- Include test coverage datasets (e.g., expected model completions)
- Add platform-specific code distinctions (e.g., ESP32 vs. Raspberry Pi Pico)
- Link to model outputs and benchmarks for comparison
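
As a rough illustration of the training step above, here is a minimal `peft` LoRA sketch; it is not an official pipeline for this repo. The base model name, file path, field names, and LoRA hyperparameters are all placeholders.

```python
# Minimal LoRA sketch (assumptions: a causal LM base model, and that each
# .jsonl line has "prompt" and "response" fields).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

dataset = load_dataset("json", data_files="data/your_subcategory.jsonl", split="train")

model_name = "your-base-model"  # placeholder: any causal LM you plan to fine-tune
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the base model with LoRA adapters (rank/alpha are illustrative defaults;
# some architectures also need target_modules set explicitly).
lora_config = LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

def format_example(example):
    # Join prompt and response into one training string (assumed field names).
    return {"text": example["prompt"] + "\n" + example["response"]}

dataset = dataset.map(format_example)
# From here, tokenize dataset["text"] and train with transformers.Trainer,
# or hand the formatted data to Axolotl / a QLoRA script instead.
```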
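
And a similarly hedged sketch of an interactive spot-check after training, assuming the adapter was saved to `./lora-out` and using the same placeholder names as above.

```python
# Spot-check sketch: generate completions for a few prompts from the dataset.
# Paths, model name, and field names are assumptions.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("your-base-model")
tokenizer = AutoTokenizer.from_pretrained("your-base-model")
model = PeftModel.from_pretrained(base, "./lora-out")  # load the trained adapter

with open("data/your_subcategory.jsonl") as f:
    for line in list(f)[:5]:  # review a handful of prompts by hand
        prompt = json.loads(line)["prompt"]  # assumed field name
        inputs = tokenizer(prompt, return_tensors="pt")
        output = model.generate(**inputs, max_new_tokens=128)
        print(tokenizer.decode(output[0], skip_special_tokens=True))
```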
## Contributions

This is starting as a personal project, but it would be cool if this became a public endeavor. I'd love for the effort to help more than just myself.
I welcome pull requests to improve this repo. I only ask for your patience as I'm still learning; at this time, ChatGPT gets a lot of credit as the brains behind this operation!