Updated Apr 27, 2025 - Python
#c4-dataset
Here are 2 public repositories matching this topic...
A from-scratch implementation of a T5 model modified with Rotary Position Embeddings (RoPE). This project includes the code for pre-training on the C4 dataset in streaming mode with Flash Attention 2.
nlp pytorch sequence-to-sequence language-model from-scratch rope pre-training huggingface t5 evaluation-benchmark llm rotary-position-embedding flash-attention c4-dataset span-corruption
Updated Jul 9, 2025 - Python
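The repository description above mentions Rotary Position Embeddings (RoPE) in place of T5's usual position handling. As a rough illustration of the idea only, here is a dependency-free sketch. It is not taken from the listed repository: the function names `rope` and `dot`, the interleaved pairing of dimensions, and the default `base` of 10000 are assumptions made for this example, and a real implementation would operate on batched PyTorch tensors.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive (even, odd) pairs of `vec` by position-dependent angles.

    Hypothetical helper for illustration; assumes the interleaved pairing
    convention and an even-length vector.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even dimension")
    out = []
    for i in range(0, dim, 2):
        # i is already twice the pair index, so this is base^(-2k/dim).
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by angle theta.
        out.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [0.5, -1.0, 2.0, 0.25]
    # Both pairs of positions are 4 apart, so the two scores should agree
    # up to floating-point error.
    print(dot(rope(q, 3), rope(k, 7)))
    print(dot(rope(q, 13), rope(k, 17)))
```

The property this is meant to show is that the attention score between a rotated query and a rotated key depends only on the distance between their positions, not on the absolute positions themselves.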