🎓 Focus: Architecture-aware TPU programming
This repository emphasizes how TPU hardware architecture, XLA/HLO compilation, and kernel-level design interact, rather than only API usage.
A comprehensive tutorial repository covering TPU (Tensor Processing Unit) programming and architecture, providing educational materials, hands-on tutorials, code examples, and resources for developers, researchers, and students interested in learning TPU programming.
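The architecture-and-compiler focus described above can be previewed in a few lines of JAX (a sketch, assuming a recent JAX install; it runs on CPU when no TPU is attached): `jax.jit(...).lower(...)` exposes the StableHLO module that XLA then optimizes for the target backend.

```python
import jax
import jax.numpy as jnp

# Peek at the compiler's view of a function: jax.jit(...).lower(...)
# returns the StableHLO module that XLA optimizes for the target
# backend (TPU, GPU, or CPU), before backend-specific code generation.
def f(x):
    return jnp.sin(x) @ x.T

lowered = jax.jit(f).lower(jnp.ones((8, 8)))
print(lowered.as_text()[:300])  # StableHLO text with sine and dot ops
```

Reading this lowered form is often the fastest way to see what a framework-level line of code actually asks the TPU to do.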
- An in-depth look at Google’s first Tensor Processing Unit (TPU)
- Google supercharges machine learning tasks with custom chip
- TPU Deep Dive
- Google TPU Architecture: Complete Guide to 7 Generations
- Google’s Training Chips Revealed: TPUv2 and TPUv3
- Ten Lessons: 4 TPU Generations
- TPU Datacenter Performance Analysis
- Ten Lessons From Three Generations Shaped Google’s TPUv4i (EPFL CS723)
- A Machine Learning Supercomputer With An Optically Reconfigurable Interconnect and Embeddings Support
- In-Datacenter Performance Analysis of a Tensor Processing Unit
- The Design Process for Google’s Training Chips: TPUv2 and TPUv3
- The Design Process for Google’s Training Chips: TPUv2 and TPUv3 (IEEE)
- Ten Lessons From Three Generations Shaped Google’s TPUv4i Industrial Product
- TPU v4: An Optically Reconfigurable Supercomputer for Machine Learning with Hardware Support for Embeddings
- Attention is All You Need
- Tensor Processing Units (TPU): A Technical Analysis and Their Impact on Artificial Intelligence
- Google Cloud TPU Documentation
- TensorFlow TPU Guide
- PyTorch XLA Documentation
- JAX on TPU
- Pallas: a JAX kernel language
- How to Scale Your Model | Google DeepMind
- Total Cost of Ownership - Wikipedia
- TPU Architecture: Understanding TPU hardware design, components, and performance characteristics
- TPU Programming: Programming models, APIs, and frameworks for TPU development
- Optimization Techniques: Best practices for optimizing TPU performance
- Practical Examples: Real-world applications and use cases
- Cloud TPU & Edge TPU: Working with Google Cloud TPU and Edge TPU devices
- Resources
- Getting Started
- Repository Structure
- Tutorials
- Documentation
- Examples
- Contributing
- License
- Community
- Basic understanding of machine learning and deep learning concepts
- Familiarity with Python programming
- Knowledge of TensorFlow or PyTorch (recommended)
```bash
# Clone the repository
git clone https://github.com/Zhighway777/awesome-tpu-tutorial.git

# Navigate to the repository
cd awesome-tpu-tutorial

# Explore tutorials
cd tutorials/
```

```
awesome-tpu-tutorial/
├── README.md               # This file
├── CONTRIBUTING.md         # Contribution guidelines
├── LICENSE                 # License information
├── CODE_OF_CONDUCT.md      # Community guidelines
├── docs/                   # Documentation
│   ├── architecture/       # TPU architecture documentation
│   ├── programming-guides/ # Programming guides
│   └── api-reference/      # API references
├── tutorials/              # Step-by-step tutorials
│   ├── beginner/           # Beginner-level tutorials
│   ├── intermediate/       # Intermediate-level tutorials
│   └── advanced/           # Advanced tutorials
├── examples/               # Code examples
│   ├── tensorflow/         # TensorFlow examples
│   ├── pytorch/            # PyTorch examples
│   └── jax/                # JAX examples
└── resources/              # Additional resources
    ├── papers/             # Research papers
    ├── presentations/      # Slides and presentations
    └── references/         # External references
```
Beginner:
- Introduction to TPU and its advantages
- Setting up the TPU development environment
- Your first TPU program
- Basic tensor operations on TPU

Intermediate:
- TPU memory management
- Data pipeline optimization
- Model parallelism on TPU
- Training neural networks on TPU

Advanced:
- Custom TPU kernels
- Performance profiling and optimization
- Large-scale distributed training
- TPU research and cutting-edge techniques
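As a flavor of the beginner track, a minimal "first TPU program" in JAX might look like the sketch below (an illustrative example, not taken from the tutorials themselves). On a Cloud TPU VM, `jax.devices()` reports TPU devices; elsewhere the identical code falls back to CPU or GPU.

```python
import jax
import jax.numpy as jnp

# On a Cloud TPU VM this lists TPU devices; elsewhere JAX falls
# back to CPU/GPU, so the same program runs everywhere.
print(jax.devices())

# A jit-compiled matmul: on TPU, XLA maps this onto the MXU
# (the systolic matrix-multiply unit), here fed with bfloat16,
# the TPU-native floating-point format.
@jax.jit
def matmul(a, b):
    return a @ b

a = jnp.ones((128, 128), dtype=jnp.bfloat16)
b = jnp.ones((128, 128), dtype=jnp.bfloat16)
c = matmul(a, b)
print(c.shape)  # (128, 128); every entry is 128.0 (a row of 128 ones dotted with ones)
```

The 128×128 shape is deliberate: it matches the MXU tile size on most TPU generations, so no padding is wasted.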
- TPU Architecture Guide: Deep dive into TPU hardware design
- Programming Guides: Comprehensive programming tutorials
- API Reference: Detailed API documentation
Browse our collection of practical examples:
- Image classification models
- Natural language processing
- Recommendation systems
- Reinforcement learning
- Custom training loops
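To give a taste of the custom-training-loop examples, here is a hedged sketch of a minimal JAX training step (linear regression, made up for illustration; the real examples live under `examples/jax/`). `jax.grad` differentiates the loss and `jax.jit` compiles the whole update through XLA, so the same step runs unmodified on CPU, GPU, or TPU.

```python
import jax
import jax.numpy as jnp

# Mean-squared-error loss for a linear model y = x @ w + b.
def loss(params, x, y):
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)

# One SGD step, compiled end-to-end by XLA via jax.jit.
@jax.jit
def train_step(params, x, y, lr=0.1):
    grads = jax.grad(loss)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (64, 3))
y = x @ jnp.array([1.0, -2.0, 0.5]) + 0.3   # noiseless synthetic targets

params = {"w": jnp.zeros(3), "b": jnp.zeros(())}
for _ in range(200):
    params = train_step(params, x, y)
print(params["w"], params["b"])  # converges toward [1.0, -2.0, 0.5] and 0.3
```

Keeping the update rule in plain JAX (rather than a framework `fit()` call) is what makes the later profiling and kernel tutorials possible: every array movement is visible in the code.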
We welcome contributions! Please see our Contributing Guide for details on:
- How to submit issues
- How to propose new tutorials
- Code style guidelines
- Pull request process
This project is licensed under the Apache License 2.0; see the LICENSE file for details.
Special thanks to all contributors and the TPU community for their valuable input and support.
For readers from the Chinese-language web, ZOMI's AI Infra series on the evolution of Google's TPU architecture is a recommended starting point, along with these Chinese-language resources:
- The development history and architectural evolution of Google's TPU | ZOMI
- TPU usage tutorial
- TPU Deep Dive (Chinese translation)
- SemiAnalysis deep dive on the TPU: Google's challenge to the "Nvidia empire"
- TPU Architecture: Complete Guide to Google's 7 Processor Generations (Chinese translation)
Note: This repository is continuously updated with new tutorials and resources. Star ⭐ this repository to stay updated!