Datasets for Instruction Tuning of Large Language Models
Updated
Nov 30, 2023
Agentic data generation (under refactor).
A proxy server that automatically stores messages exchanged between any OpenAI-compatible frontend and backend as a ShareGPT dataset for use in training/fine-tuning.
Exports a chat as a ShareGPT dataset
Genshin Impact character chat models, fine-tuned on LLMs with LoRA.
A fork of GeoAnima's Claude.ai chat exporter userscript, with improved button UI and direct export to ShareGPT-format JSON.
Deepseek-Dataset-Generator creates conversational datasets for LLM fine-tuning via the DeepSeek API. Supports multiple formats (ChatML, ShareGPT, Alpaca, JSON, CSV), easy YAML configuration, and detailed logging. Ideal for quickly generating realistic, customized data.
A JSON viewer/editor for multi-line string values that lets you render and edit strings in plain mode (handling escaping/unescaping). Ideal for editing ShareGPT- or Alpaca-style LLM training examples.
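Several of the projects above read or write the ShareGPT format. A minimal sketch of what such a record looks like, assuming the commonly used structure (a "conversations" list of turns, each with "from" and "value" fields; exact field names can vary between tools):

```python
import json

# One ShareGPT-style training record: alternating human/gpt turns.
# The role labels "human" and "gpt" follow the common convention.
record = {
    "conversations": [
        {"from": "human", "value": "What is instruction tuning?"},
        {"from": "gpt", "value": "Fine-tuning a language model on instruction/response pairs."},
    ]
}

# A dataset file is typically a JSON array of such records
# (or JSON Lines, one record per line).
dataset = [record]
serialized = json.dumps(dataset, ensure_ascii=False, indent=2)

# Round-trip to verify the structure survives serialization.
loaded = json.loads(serialized)
print(loaded[0]["conversations"][0]["from"])
```

Tools that convert between ShareGPT, Alpaca, and ChatML generally just remap these role and content fields.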