
jorditorresBCN/supercomputing-for-ai


Supercomputing for Artificial Intelligence: Foundations, Architectures, and Scaling Deep Learning
Author: Jordi Torres
ISBN: 979-831932835-9
Series: WATCH THIS SPACE Book Series – Barcelona
Publisher: Amazon KDP, 2025

Welcome to the official GitHub repository for the book Supercomputing for Artificial Intelligence.

This book is designed as a practical and accessible guide to High Performance Computing for Artificial Intelligence, covering the foundations of supercomputing, the tools and architectures required, and how to scale deep learning workloads efficiently.

This repository provides access to the companion materials referenced throughout the book.

When code is cheap, performance is expensive.

Supercomputing for Artificial Intelligence is a practical, systems-oriented guide to training modern AI models at scale.

This book is not about writing code faster.

It is about understanding what happens when that code runs — on GPUs, across nodes, under real resource constraints.

In an era where AI tools can generate entire training pipelines in minutes, the real engineering challenge has shifted toward performance, scalability, efficiency, and informed trade-offs.
HPC for AI is about judgment, not recipes.

What this book is about

Supercomputing for Artificial Intelligence provides a rigorous yet hands-on introduction to High Performance Computing as it applies to modern AI workloads.

The focus is explicitly on training, not inference.

Readers are guided from foundational supercomputing concepts to the efficient and scalable training of deep learning models on real supercomputing platforms.

The book integrates:

  • computer architecture and modern GPU systems
  • parallel and distributed execution models
  • deep learning frameworks (TensorFlow and PyTorch)
  • performance analysis and scalability metrics
  • reproducible experiments on production-grade infrastructures

Rather than presenting isolated techniques, the book is structured as a learning path whose technical culmination is the ability to reason about and execute large-scale AI training workloads.
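The scalability metrics mentioned above can be made concrete with a small sketch. Speedup, parallel efficiency, and Amdahl's law are standard definitions; the function names and example numbers below are illustrative, not taken from the book:

```python
# Illustrative sketch of common scalability metrics: speedup S(p) = T(1)/T(p),
# parallel efficiency E(p) = S(p)/p, and the upper bound given by Amdahl's law
# for a fixed serial fraction f. Names and numbers are illustrative only.

def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S(p) = T(1) / T(p)."""
    return t_serial / t_parallel

def efficiency(t_serial: float, t_parallel: float, p: int) -> float:
    """Parallel efficiency E(p) = S(p) / p; 1.0 means perfect scaling."""
    return speedup(t_serial, t_parallel) / p

def amdahl_speedup(serial_fraction: float, p: int) -> float:
    """Amdahl's law: S(p) = 1 / (f + (1 - f) / p)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)

if __name__ == "__main__":
    # A hypothetical run taking 100 s on 1 GPU and 30 s on 4 GPUs:
    print(speedup(100, 30))        # ~3.33x, not the ideal 4x
    print(efficiency(100, 30, 4))  # ~0.83
    # Even with only 5% serial work, 64 workers stay below ~15.5x:
    print(amdahl_speedup(0.05, 64))
```

Measuring these quantities, rather than assuming linear scaling, is precisely the kind of judgment the book emphasizes.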

Why this book exists (and why now)

AI-assisted coding tools are changing how software is written. They are not changing the fundamental physics of computation.

Generating code is becoming trivial. Understanding bottlenecks, overheads, scaling limits, and cost–performance trade-offs is not. This book addresses that gap.

It is written for readers who want to:

  • understand why performance behaves the way it does
  • measure instead of guess
  • scale only when it makes sense
  • avoid mistaking “more GPUs” for “better systems”

In short: to develop engineering judgment in AI systems.

Technical scope and structure

The primary technical focus of the book is the training of AI models on high-performance computing systems. Later chapters guide the reader through:

  • efficient single-node training
  • data parallelism and distributed training
  • scalability analysis and diminishing returns
  • end-to-end workflows for modern deep learning models

This includes preparation for training contemporary Large Language Models on distributed GPU-based supercomputers.
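As a minimal illustration of the data-parallel idea covered in those chapters (shard a batch across workers, average their gradients, apply the same update everywhere), here is a framework-free sketch in plain Python. Real training would use a library such as PyTorch's DistributedDataParallel; every name below is illustrative:

```python
# Toy data-parallel training step for a 1-D linear model y = w * x fitted by
# mean-squared-error gradient descent. Each "worker" computes the gradient on
# its own shard of the batch; an all-reduce-style average then yields one
# gradient that every worker applies identically. Purely illustrative.

def local_gradient(w: float, shard: list[tuple[float, float]]) -> float:
    """dL/dw for L = mean((w*x - y)^2) over this worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w: float, batch: list[tuple[float, float]],
                       n_workers: int, lr: float = 0.01) -> float:
    # Shard the global batch round-robin across workers (equal-size shards).
    shards = [batch[i::n_workers] for i in range(n_workers)]
    # Each worker computes a gradient on its shard (concurrently, in reality).
    grads = [local_gradient(w, s) for s in shards]
    # "All-reduce": average the gradients so every worker sees the same value.
    g = sum(grads) / n_workers
    return w - lr * g

if __name__ == "__main__":
    # Data generated from y = 3x; gradient descent should drive w toward 3.
    data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]
    w = 0.0
    for _ in range(200):
        w = data_parallel_step(w, data, n_workers=2)
    print(round(w, 3))  # → 3.0
```

With equal-size shards, the average of per-shard gradients equals the gradient over the full batch, which is why data parallelism reproduces single-worker training while splitting the compute.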

Earlier chapters cover:

  • supercomputing fundamentals
  • system architecture
  • software environments and schedulers
  • classical parallel programming models

These sections can also be read independently as a rigorous introduction to High Performance Computing.

Topics such as inference optimization, deployment, and edge execution are introduced only where needed for system-level context.


Used in real courses, on real supercomputers

This book is currently used as core reference material in master’s-level courses on supercomputing and artificial intelligence, including the
HPC for AI course (MEI master, FIB–UPC).

All examples and experiments are designed to run on real supercomputing platforms, not simplified toy environments. Companion code and fully reproducible experiments are provided.


Open HTML edition (January 2026)

The second edition of Supercomputing for Artificial Intelligence will be published openly in HTML format at the end of January 2026, once the current publishing agreement concludes.

This open edition will make the full content freely accessible to students, researchers, and practitioners worldwide.


Get the book

The book is currently available in digital and print editions via Amazon.


How to cite this book

Please use the following reference format when citing this book in academic or teaching materials:

Torres, J. (2025). Supercomputing for Artificial Intelligence: Foundations, Architectures, and Scaling Deep Learning. WATCH THIS SPACE Book Series – Barcelona. Amazon KDP. ISBN: 979-831932835-9.

BibTeX entry

@book{HPC4AI2025,
  author    = {Jordi Torres},
  title     = {Supercomputing for Artificial Intelligence: Foundations, Architectures, and Scaling Deep Learning},
  year      = {2025},
  publisher = {Amazon KDP},
  series    = {WATCH THIS SPACE Book Series – Barcelona},
  isbn      = {979-831932835-9},
}
