Minenik2 / deepseek-v3-from-scratch-in-python

DeepSeek-V3 was reported to be the best non-reasoning model as of February 2025. This repository is a from-scratch Python recreation based on the paper.

Paper: DeepSeek-V3 Technical Report, https://arxiv.org/pdf/2412.19437

Learning:

- Multi-head Latent Attention
  - Attention basics
  - RoPE
  - MLA
- Mixture of Experts
  - Gate
  - Expert
- Parallelization across GPUs
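
To make the "Gate" and "Expert" items concrete, here is a minimal sketch of top-k routing for a single token, written in plain Python rather than taken from this repository's code. It follows the paper's description of the gate (sigmoid affinity scores, top-k selection, normalization over the selected scores), but it leaves out shared experts and the load-balancing bias. The names `route_token`, `moe_forward`, and `expert_centroids`, and the toy experts in the usage lines, are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def route_token(token, expert_centroids, top_k):
    """Return {expert_index: gate_weight} for one token (hypothetical helper).

    token: list of floats (the hidden vector of one token)
    expert_centroids: one list of floats per routed expert
    """
    # Affinity of the token to each expert: sigmoid of a dot product.
    scores = [
        sigmoid(sum(t * c for t, c in zip(token, centroid)))
        for centroid in expert_centroids
    ]
    # Keep the top_k experts with the highest affinity.
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalize over the selected scores so the gate weights sum to 1.
    total = sum(scores[i] for i in chosen)
    return {i: scores[i] / total for i in chosen}


def moe_forward(token, expert_centroids, experts, top_k):
    """Gate-weighted sum of the selected experts' outputs for one token."""
    gates = route_token(token, expert_centroids, top_k)
    out = [0.0] * len(token)
    for i, gate in gates.items():
        expert_out = experts[i](token)  # each expert is any callable: vector -> vector
        out = [o + gate * v for o, v in zip(out, expert_out)]
    return out


# Toy usage: four "experts" that just scale the input (assumption for the demo;
# real experts would be feed-forward networks).
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
centroids = [[2.0, 0.0], [-1.0, 0.0], [0.5, 0.0], [0.0, 3.0]]
gates = route_token([1.0, 0.0], centroids, top_k=2)
output = moe_forward([1.0, 0.0], centroids, experts, top_k=2)
```

Only the `top_k` selected experts are evaluated, which is what keeps the per-token compute small even when the total number of experts is large.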