VS Code extension for viewing and handling large datasets. Opens JSONL, Parquet, and CSV files without crashing, and includes 16 production LLM tokenizers for chat completion data.
Updated Feb 5, 2026 - TypeScript
Fine-tuning DistilGPT2 on the EmpatheticDialogues dataset to create an emotionally intelligent chatbot. Features custom attention calibration and a Streamlit-based interface for wellness support.
Production-style PySpark ETL pipeline processing 100K+ e-commerce records with optimized joins, feature engineering, and scalable Parquet outputs.