Docs fix #2301

pggPL · 2025-10-24T10:30:11Z

Description

Our documentation returned a lot of warnings and it seems that some of them were rational. It turned out that half of our PyTorch API was not rendered. This PR fixes all the warnings and forces Github workflow to error out if any docs warning will appear.

The most problematic error was related to cyclic imports - this resulted in part of our PyTorch API not being rendered. Other were mostly related to wrong formatting and some warnings didn't result in anything wrong.

Our docs for 2.9 contains only part of PyTorch API exposed in 2.8, so this will need urgent fix I will update in separate PR.

Type of change

Documentation change (change only to the documentation, either a fix or a new content)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Infra/Build change
Code refactoring

Checklist:

[ x] I have read and followed the contributing guidelines
The functionality is complete
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps

Greptile Overview

Greptile Summary

This PR resolves critical documentation rendering issues where approximately half of the PyTorch API was not being rendered in the generated documentation. The root causes were cyclic import dependencies and reStructuredText formatting violations. The fix involves restructuring imports across JAX and PyTorch modules to break circular dependencies (primarily moving QuantizeLayout and torch_version imports to their source modules), correcting RST section underlines to match header text lengths, converting docstring formatting to proper RST syntax, and enforcing strict documentation builds in CI by adding the -W flag to treat warnings as errors.

Important Files Changed

Filename	Score	Overview
`.github/workflows/docs.yml`	5/5	Added `-W` flag to Sphinx builds to make documentation warnings fatal in CI
`transformer_engine/pytorch/utils.py`	5/5	Converted `torch_version` from module-level import to cached function to break cyclic dependencies
`transformer_engine/pytorch/cross_entropy.py`	5/5	Converted `parallel_cross_entropy` from bare function reference to documented wrapper function
`transformer_engine/jax/quantize/quantizer.py`	5/5	Removed `QuantizeLayout` from `__all__` to break circular dependency chain
`transformer_engine/jax/quantize/hadamard.py`	5/5	Changed `QuantizeLayout` import from delayed local import to top-level C++ extension import
`transformer_engine/jax/cpp_extensions/misc.py`	5/5	Moved `QuantizeLayout` import from relative path to C++ extension to break cycle
`transformer_engine/jax/cpp_extensions/activation.py`	5/5	Moved `QuantizeLayout` import from quantize module to C++ extension module
`transformer_engine/jax/cpp_extensions/quantization.py`	5/5	Moved `QuantizeLayout` import from quantize module to C++ extension module
`transformer_engine/jax/cpp_extensions/gemm.py`	5/5	Moved `QuantizeLayout` import from quantize module to C++ extension module
`transformer_engine/jax/cpp_extensions/normalization.py`	5/5	Moved `QuantizeLayout` import from quantize module to C++ extension module
`transformer_engine/jax/dense.py`	5/5	Moved `QuantizeLayout` import from local module to top-level package
`transformer_engine/pytorch/jit.py`	5/5	Changed `torch_version` import from current module to utils submodule
`transformer_engine/pytorch/distributed.py`	5/5	Changed `torch_version` import from package-level to explicit utils module
`transformer_engine/pytorch/quantization.py`	5/5	Reorganized imports with proper blank line separation for clarity
`transformer_engine/pytorch/module/linear.py`	5/5	Changed `torch_version` import to come from utils submodule
`transformer_engine/pytorch/module/layernorm_linear.py`	5/5	Changed `torch_version` import to come from utils submodule
`transformer_engine/pytorch/module/layernorm_mlp.py`	4/5	Changed `torch_version` import and improved docstring RST formatting
`transformer_engine/pytorch/ops/_common.py`	5/5	Changed `torch_version` import to come from utils submodule
`transformer_engine/pytorch/ops/basic/l2normalization.py`	5/5	Changed `torch_version` import to come from utils submodule
`transformer_engine/pytorch/transformer.py`	5/5	Fixed `torch_version` import and improved docstring RST formatting
`transformer_engine/pytorch/module/base.py`	5/5	Converted docstring code blocks from markdown to RST syntax
`transformer_engine/pytorch/attention/dot_product_attention/dot_product_attention.py`	5/5	Improved RST formatting of docstring with proper inline code and bullet lists
`transformer_engine/pytorch/attention/multi_head_attention.py`	5/5	Improved RST formatting of docstring with proper inline code and bullet lists
`transformer_engine/jax/flax/transformer.py`	5/5	Converted ASCII table in docstring to RST table directive
`docs/conf.py`	4/5	Added custom logging filter to suppress unavoidable duplicate namespace warnings
`docs/api/pytorch.rst`	4/5	Reorganized API documentation with new sections and deprecated functions section
`docs/debug.rst`	5/5	Added blank line after copyright header to fix RST parsing
`docs/debug/api.rst`	5/5	Added blank line after copyright header to fix RST parsing
`docs/debug/3_api_features.rst`	5/5	Extended section underline to match header length
`docs/debug/3_api_debug_setup.rst`	5/5	Extended section underlines to match header lengths
`docs/debug/4_distributed.rst`	5/5	Extended section underlines to match header lengths
`docs/debug/2_config_file_structure.rst`	5/5	Extended section underlines to match header lengths
`docs/debug/1_getting_started.rst`	0/5	Section underlines still don't match header text lengths exactly - will cause RST warnings
`docs/examples/attention/attention.ipynb`	5/5	Updated API reference links to lowercase format for Sphinx compatibility
`docs/examples/te_gemma/tutorial_generation_gemma_with_te.ipynb`	5/5	Fixed broken internal documentation link

Confidence score: 2/5

This PR contains one file (docs/debug/1_getting_started.rst) with incorrect RST section underline lengths that will cause Sphinx warnings, directly contradicting the PR's goal of eliminating all warnings
All other changes are formatting and import refactoring with minimal risk, but the incomplete fix in 1_getting_started.rst will cause the new CI check (which treats warnings as errors) to fail
Pay close attention to docs/debug/1_getting_started.rst - the underlines need to match the exact character length of each section header, not use a fixed-width approach

_{35 files reviewed, 8 comments}

_{Edit Code Review Agent Settings | Greptile}

docs/api/pytorch.rst

docs/debug/1_getting_started.rst

docs/conf.py

for more information, see https://pre-commit.ci

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps

Greptile Overview

Greptile Summary

This review covers only the most recent changes that address syntax and formatting issues identified in previous reviews. The developer has made targeted fixes to resolve trailing whitespace, RST heading underline mismatches, and missing newlines across several documentation files. These changes are part of the broader documentation quality enforcement effort.

The recent commits address:

Trailing whitespace in docs/conf.py (line 109)
RST heading underline length mismatches in docs/debug/1_getting_started.rst (6 sections)
Missing newline at end of file in docs/api/pytorch.rst

Important Files Changed

Filename	Score	Overview
docs/conf.py	5/5	Fixed trailing whitespace on line 109
docs/debug/1_getting_started.rst	3/5	Adjusted RST heading underlines but mismatches remain between title lengths and underline lengths
docs/api/pytorch.rst	5/5	Added missing newline at end of file

Confidence score: 3/5

This PR contains syntax fixes but incomplete RST formatting corrections that will still cause Sphinx warnings
Score reflects that while trailing whitespace and missing newlines are fixed, the RST heading underlines in docs/debug/1_getting_started.rst still don't match their title character counts (e.g., "Getting started" has 15 chars but 31 '=' underline chars), which will cause Sphinx build failures when the -W flag is active
Pay close attention to docs/debug/1_getting_started.rst - all six section heading underlines need to be adjusted to exactly match their title character counts before the CI will pass

Additional Comments (1)

docs/debug/3_api_debug_setup.rst, line 89 (link)

syntax: missing newline at end of file

_{34 files reviewed, 12 comments}

_{Edit Code Review Agent Settings | Greptile}

docs/api/pytorch.rst

transformer_engine/pytorch/quantization.py

docs/debug/2_config_file_structure.rst

docs/debug/3_api_debug_setup.rst

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps

Greptile Overview

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The developer has successfully resolved all remaining ReStructuredText (RST) formatting issues across five debug documentation files. These changes correct section underline mismatches where underline lengths did not match their corresponding title lengths—a strict RST syntax requirement that causes Sphinx documentation build warnings. The fixes affect files in docs/debug/ including 1_getting_started.rst, 3_api_debug_setup.rst, 3_api_features.rst, api.rst, and 4_distributed.rst. Each correction ensures that section titles and their underlines have exactly matching character counts (e.g., "API" with 3 '=' chars, "Getting started" with 15 '=' chars), which eliminates documentation warnings and ensures proper rendering of the PyTorch API documentation. These changes are part of a broader effort to fix all documentation warnings and enforce clean doc builds in CI.

Important Files Changed

Filename	Score	Overview
docs/debug/3_api_debug_setup.rst	5/5	Fixed RST underline lengths for three function headers (`initialize()`, `set_tensor_reduction_group()`, `set_weight_tensor_tp_group_reduce()`)
docs/debug/3_api_features.rst	5/5	Corrected section underline from 27 to 14 '=' chars to match "Debug features" title
docs/debug/api.rst	5/5	Fixed "API" section underline from 12 to 3 '=' chars
docs/debug/4_distributed.rst	5/5	Corrected underlines for main title and four subsections throughout the distributed training documentation
docs/debug/1_getting_started.rst	5/5	Fixed six section underlines to match title lengths across getting started guide

Confidence score: 5/5

This PR is safe to merge with minimal risk—all changes are pure RST formatting corrections with no functional impact
Score reflects that these are syntax-only corrections (underline length matching) with zero functional changes, no logic modifications, and no code execution paths affected
No files require special attention—all changes follow the same trivial pattern of correcting underline character counts to match section titles

_{5 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

docs/api/pytorch.rst

docs/examples/attention/attention.ipynb

transformer_engine/pytorch/cross_entropy.py

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps

Greptile Overview

Greptile Summary

This PR fixes critical documentation build warnings that prevented half of the PyTorch API from being rendered in v2.9 documentation. The changes resolve cyclic import issues, improve RST formatting, and enforce warning-free builds in CI.

Key changes:

Fixed cyclic import by moving torch_version from a cached function in __init__.py to a module-level variable in utils.py, with updated imports across affected modules
Reorganized docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions) for better API discoverability
Improved RST docstring formatting with proper code literals (backticks), bullet lists, and table syntax for better Sphinx rendering
Added SPHINXOPTS="-W" to GitHub workflow to fail builds on warnings, preventing future documentation regressions
Added custom logging filter in conf.py to suppress unavoidable duplicate C++ namespace warnings
Cleared Jupyter notebook execution counts to reduce version control noise

Confidence Score: 5/5

This PR is safe to merge with minimal risk - it fixes documentation issues without affecting runtime behavior
All changes are focused on documentation quality improvements and formatting fixes. The cyclic import fix is well-executed with consistent updates across all dependent modules. The workflow enforcement ensures future documentation quality. Only minor formatting issue found (missing newline)
No files require special attention - only a trivial missing newline in docs/api/pytorch.rst

Important Files Changed

File Analysis

Filename	Score	Overview
.github/workflows/docs.yml	5/5	Added `-W` flag to Sphinx build to error on warnings, enforcing documentation quality
docs/conf.py	5/5	Added custom logging filter to suppress duplicate C++ namespace warnings, updated exclude patterns
transformer_engine/pytorch/init.py	5/5	Moved `torch_version` from function to module-level variable, fixing cyclic import issues
docs/api/pytorch.rst	4/5	Reorganized API docs with sections, moved deprecated functions, fixed rendering issues

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GHA as GitHub Actions
    participant Sphinx as Sphinx Builder
    participant Code as Python Code
    participant Docs as Documentation

    Dev->>Code: Fix cyclic imports (torch_version)
    Dev->>Code: Update import paths
    Dev->>Docs: Reorganize API structure
    Dev->>Docs: Fix RST formatting issues
    Dev->>GHA: Add SPHINXOPTS="-W" flag
    
    Note over Dev,Docs: PR Changes Applied
    
    GHA->>Sphinx: Trigger docs build with -W
    Sphinx->>Code: Import Python modules
    Code-->>Sphinx: Successfully imported (no cycles)
    Sphinx->>Docs: Parse RST files
    Docs-->>Sphinx: Valid RST syntax
    Sphinx->>Sphinx: Apply custom warning filter
    Sphinx-->>GHA: Build succeeds (no warnings)
    
    Note over GHA: Future builds will error<br/>if warnings appear

_{9 files reviewed, 1 comment}

_{Edit Code Review Agent Settings | Greptile}

docs/api/pytorch.rst

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps

Greptile Overview

Greptile Summary

Fixed critical documentation build issues that prevented half of the PyTorch API from rendering in version 2.9. Resolved cyclic import in pytorch/__init__.py by moving torch_version() to utils.py, and enforced documentation quality by adding -W flag to Sphinx build.

Cyclic import resolution: Moved torch_version() from pytorch/__init__.py to pytorch/utils.py to break import cycle that prevented API documentation generation
Sphinx enforcement: Added -W flag to GitHub workflow to treat documentation warnings as build errors
RST formatting fixes: Corrected underline mismatches in debug documentation files (15+ instances across multiple files)
Docstring improvements: Enhanced formatting with proper RST syntax (backticks, bullet lists) in attention, cross_entropy, and other modules
API documentation reorganization: Restructured pytorch.rst with logical sections and moved deprecated functions to the end
Import corrections: Fixed several import issues in JAX and PyTorch modules (e.g., QuantizeLayout in JAX, torch_version references)
Warning suppression: Added custom logging filter for unavoidable duplicate C++ declaration warnings

Confidence Score: 5/5

This PR is safe to merge with minimal risk - it fixes critical documentation issues without changing runtime behavior
All changes are documentation-focused (RST formatting, docstrings, import reorganization). The cyclic import fix is a proper refactoring that maintains the same API surface. The workflow change enforces quality going forward. No logic changes to core functionality.
No files require special attention

Important Files Changed

File Analysis

Filename	Score	Overview
.github/workflows/docs.yml	5/5	Added `-W` flag to make Sphinx treat warnings as errors, enforcing documentation quality
docs/conf.py	5/5	Added custom logging filter to suppress duplicate C++ declaration warnings for `transformer_engine` namespace
transformer_engine/pytorch/init.py	5/5	Fixed cyclic import by removing `torch_version()` function from `__init__.py` (moved to `utils.py`)
transformer_engine/pytorch/utils.py	5/5	Added `torch_version()` function here to resolve cyclic import issues
docs/api/pytorch.rst	5/5	Reorganized API documentation with sections, moved deprecated functions to the end, added missing newline
transformer_engine/pytorch/cross_entropy.py	5/5	Added explicit function wrapper with full docstring for `parallel_cross_entropy` to improve API documentation

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub Workflow
    participant Sphinx as Sphinx Builder
    participant Init as pytorch/__init__.py
    participant Utils as pytorch/utils.py
    
    Dev->>GH: Push docs changes
    GH->>Sphinx: Run build with -W flag
    
    Note over Sphinx: Before: Warnings ignored
    Note over Sphinx: After: Warnings = Errors
    
    Sphinx->>Sphinx: Check RST formatting
    Sphinx->>Sphinx: Check docstring syntax
    
    Note over Init,Utils: Cyclic Import Fix
    Init->>Utils: torch_version() moved here
    Utils->>Utils: Function callable via import
    
    Sphinx->>Sphinx: Parse API docs successfully
    Sphinx->>GH: Build succeeds (no warnings)
    GH->>Dev: CI passes

_{35 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps

Greptile Overview

Greptile Summary

Fixed critical documentation build warnings that prevented half of the PyTorch API from being rendered in version 2.9 docs. The PR resolves cyclic import issues and enforces documentation quality going forward.

Key Changes:

Cyclic Import Fix: Moved torch_version() function from pytorch/__init__.py to pytorch/utils.py to break circular dependency that prevented Sphinx from properly importing and documenting modules
Workflow Enforcement: Added -W flag to Sphinx build to error out on any documentation warnings, preventing future documentation regressions
RST Formatting: Fixed hundreds of reStructuredText formatting issues including:
- Section underline length mismatches in debug documentation
- Code block syntax (converted markdown triple backticks to RST :: syntax)
- Improved docstring formatting for mathematical expressions and bullet lists
- Proper use of inline code with double backticks
API Documentation Reorganization: Restructured docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions)
Function Wrappers: Added proper function wrappers with complete docstrings (e.g., parallel_cross_entropy) to replace bare Function.apply assignments that Sphinx couldn't document
Warning Suppression: Added custom logging filter in docs/conf.py to suppress unavoidable duplicate C++ declaration warnings for the transformer_engine namespace

Confidence Score: 5/5

This PR is safe to merge - it only fixes documentation and doesn't change runtime behavior
All changes are documentation-focused: RST formatting fixes, docstring improvements, reorganization of docs structure, and a critical cyclic import fix that only affects Sphinx's ability to introspect the codebase. The workflow change ensures future documentation quality. No runtime logic changes.
No files require special attention - all changes are straightforward documentation fixes

Important Files Changed

File Analysis

Filename	Score	Overview
.github/workflows/docs.yml	5/5	Added `-W` flag to error out on documentation warnings
transformer_engine/pytorch/init.py	5/5	Removed `torch_version()` function to break cyclic import; moved to utils.py
transformer_engine/pytorch/utils.py	5/5	Added `torch_version()` function moved from init.py to resolve cyclic imports
transformer_engine/pytorch/cross_entropy.py	5/5	Added proper wrapper function with full docstring for `parallel_cross_entropy` for documentation
docs/conf.py	5/5	Added custom logging filter to suppress duplicate C++ declaration warnings
docs/api/pytorch.rst	5/5	Reorganized API documentation sections and moved deprecated functions to dedicated section

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant Sphinx as Sphinx Build
    participant PyInit as pytorch/__init__.py
    participant Utils as pytorch/utils.py
    participant Docs as Documentation Output

    Note over Dev,Docs: Before: Cyclic Import Issue
    Sphinx->>PyInit: Import module
    PyInit->>PyInit: torch_version() function
    PyInit->>Utils: Import utils
    Utils->>PyInit: Import torch_version (CYCLIC!)
    Note over Sphinx: API not rendered properly

    Note over Dev,Docs: After: Fixed Import Chain
    Sphinx->>PyInit: Import module
    PyInit->>PyInit: torch_version as variable
    PyInit->>Utils: Import utils
    Utils->>Utils: torch_version() function
    Sphinx->>Docs: All APIs rendered correctly
    
    Note over Dev,Docs: Workflow Enforcement
    Dev->>Sphinx: Build docs with -W flag
    alt Warnings Found
        Sphinx-->>Dev: Build FAILS
    else No Warnings
        Sphinx-->>Docs: Build succeeds
    end

_{38 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

for more information, see https://pre-commit.ci

greptile-apps

Greptile Overview

Greptile Summary

This PR fixes critical documentation issues that prevented half of the PyTorch API from being rendered in Sphinx documentation and ensures future documentation quality through CI enforcement.

Key Changes

Cyclic import fix: Removed @functools.lru_cache decorator from torch_version() in transformer_engine/pytorch/__init__.py, converting it to a direct variable assignment to resolve Sphinx import issues
Missing API documentation: Converted parallel_cross_entropy from a simple function alias to a proper documented function with NumPy-style docstring, making it visible in generated docs
RST formatting fixes: Corrected underline length mismatches across all debug documentation files (15+ instances) and fixed code block syntax from markdown triple backticks to RST double-colon (::) format
API reorganization: Restructured docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions) and moved deprecated functions to the bottom
CI enforcement: Added SPHINXOPTS="-W" to GitHub workflow to fail builds on any Sphinx warnings, preventing future documentation regressions
Warning suppression: Added custom Sphinx setup to filter unavoidable duplicate C++ namespace warnings

Impact

This addresses the urgent issue where v2.9 docs were missing significant portions of the PyTorch API that were present in v2.8. The changes are purely documentation-focused with backward compatibility maintained (e.g., _input parameter deprecation warning in parallel_cross_entropy).

Confidence Score: 5/5

This PR is safe to merge - it fixes critical documentation issues without changing runtime behavior
All changes are documentation-focused (RST formatting, docstrings, Sphinx configuration) or maintain backward compatibility (cross_entropy parameter renaming with deprecation warning). The cyclic import fix simplifies code without changing functionality. CI enforcement ensures no regressions.
No files require special attention - all changes are well-structured documentation fixes

Important Files Changed

File Analysis

Filename	Score	Overview
.github/workflows/docs.yml	5/5	Added `-W` flag to make Sphinx error on warnings, enforcing documentation quality
docs/conf.py	4/5	Added warning filter for duplicate C++ declarations, excluded sphinx_rtd_theme from patterns
docs/api/pytorch.rst	5/5	Reorganized API docs with sections, moved deprecated functions to bottom, added missing newline
transformer_engine/pytorch/init.py	5/5	Removed `torch_version()` function decorator to fix cyclic import issues with documentation
transformer_engine/pytorch/cross_entropy.py	5/5	Renamed `_input` to `input`, added proper function wrapper with NumPy-style docstring

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant CI as GitHub CI
    participant Sphinx as Sphinx Builder
    participant Docs as Documentation
    participant PyAPI as PyTorch API
    
    Note over Dev,PyAPI: Documentation Fix PR Flow
    
    Dev->>CI: Push PR with doc fixes
    CI->>Sphinx: Run with SPHINXOPTS="-W"
    
    Note over Sphinx: Critical Fix: Cyclic Import
    Sphinx->>PyAPI: Import PyTorch modules
    PyAPI-->>Sphinx: ✓ No cyclic imports (torch_version simplified)
    
    Note over Sphinx: Fix: RST Formatting
    Sphinx->>Docs: Parse RST files
    Docs-->>Sphinx: ✓ Underlines match titles
    Docs-->>Sphinx: ✓ Code blocks use :: syntax
    
    Note over Sphinx: Fix: API Documentation
    Sphinx->>PyAPI: Generate API docs
    PyAPI-->>Sphinx: ✓ parallel_cross_entropy has docstring
    PyAPI-->>Sphinx: ✓ All functions now rendered
    
    Note over Sphinx: Enhancement: Warning Filter
    Sphinx->>Sphinx: Filter C++ namespace duplicates
    
    Sphinx-->>CI: Build success (no warnings)
    CI-->>Dev: ✓ All checks pass

_{38 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

pggPL · 2025-11-04T15:01:37Z

/te-ci

docs/api/jax.rst

transformer_engine/pytorch/attention/dot_product_attention/dot_product_attention.py

ksivaman · 2025-11-04T15:14:29Z

transformer_engine/pytorch/__init__.py

+torch_version = PkgVersion(str(torch.__version__)).release
+assert torch_version >= (2, 1), f"Minimum torch version 2.1 required. Found {torch_version}."


Please change the instances at the call site as well. For e.g. linear.py still calls torch_version()

ksivaman · 2025-11-04T15:18:22Z

transformer_engine/pytorch/utils.py

+@functools.lru_cache(maxsize=None)
+def torch_version() -> tuple[int, ...]:
+    """Get PyTorch version"""
+    return PkgVersion(str(torch.__version__)).release


Is this redefined here because of cyclic import issues when import directly from __init__.py? We should have it defined only in one location.

If the init one is causing issues with import, please remove it and change all instance to import from utils instead

Yes, that is the case. And I cannot import utils before dynamically loading c++ lib, because it imports some other torch classes. So I guess there are 2 ways of doing that - create some file torch_version.py with only that or do what I did. I do not want to add file for one line function, what do you think?

I think creating a standalone torch_version.py file would be the most straightforward way to untangle these issues with import ordering.

An alternative would be changing __init__.py to manually import in the right order:

from transformer_engine.common import load_framework_extension load_framework_extension("torch") from transformer_engine.pytorch.utils import torch_version assert torch_version() >= (2, 1), f"Minimum torch version 2.1 required. Found {torch_version()}."

However, this is not ideal because we don't check the PyTorch version before loading transformer_engine_torch, so we might get linker errors instead of a good error message.

I think we can create a torch_version.py even if it is a single function @pggPL

transformer_engine/pytorch/cross_entropy.py

greptile-apps

_{55 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

ksivaman · 2025-11-25T18:39:07Z

/te-ci

ksivaman

LGTM, will check artifacts after CI

greptile-apps

_{55 files reviewed, 1 comment}

_{Edit Code Review Agent Settings | Greptile}

docs/api/pytorch.rst

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

ksivaman · 2025-11-25T21:34:04Z

/te-ci

greptile-apps

_{55 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

ptrendx · 2025-11-25T22:21:25Z

transformer_engine/pytorch/cpu_offload.py

+        If provided, the offload stream is used for offloading and reloading.
+        Otherwise, a new stream is allocated internally. It can be other than None
+        only if manual_synchronization is True.


Why change in the indentation?

ptrendx · 2025-11-25T22:23:37Z

transformer_engine/pytorch/distributed.py

-        states: Dict[str, torch.Tensor]
+        Parameters
+        ----------
+        states : Dict[str, torch.Tensor]


There are other cases where there is no space between the parameter and the ':' - we should use one consistently (preferably the right one :-) ).

official numpy docs guide have space, so i changed this to have space everywhere

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

greptile-apps

_{55 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

- Standardize parameter documentation to use 'param : type' format (space before and after colon) per NumPy style guide - Fix inconsistent indentation in cpu_offload.py docstring - Modified 51 Python files across transformer_engine/pytorch Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps

_{86 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

for more information, see https://pre-commit.ci

greptile-apps

_{86 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

greptile-apps

_{86 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

ksivaman · 2025-11-26T14:51:23Z

/te-ci

greptile-apps

_{86 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

pggPL · 2025-11-26T16:10:05Z

/te-ci

greptile-apps

_{86 files reviewed, no comments}

_{Edit Code Review Agent Settings | Greptile}

* init Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * lines lenght Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * subtitle --- fix in many files: Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * cross entropy _input -> input rename Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * cross entropy _input -> input rename Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * a lot of small fixes Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * torch_version() change Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing module and fix warnings Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * fix Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * removed training whitespace: Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * Update docs/api/pytorch.rst Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: Paweł Gadziński <62263673+pggPL@users.noreply.github.com> * Fix import Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix more imports Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com> * Fix NumPy docstring parameter spacing and indentation - Standardize parameter documentation to use 'param : type' format (space before and after colon) per NumPy style guide - Fix inconsistent indentation in cpu_offload.py docstring - Modified 51 Python files across transformer_engine/pytorch Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> --------- Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com> Signed-off-by: Paweł Gadziński <62263673+pggPL@users.noreply.github.com> Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

init

8124d3d

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps bot reviewed Oct 24, 2025

View reviewed changes

pggPL and others added 3 commits October 24, 2025 13:11

Merge remote-tracking branch 'upstream/main' into docs_refactor

a23a365

[pre-commit.ci] auto fixes from pre-commit.com hooks

2413bcb

for more information, see https://pre-commit.ci

fix

8dec718

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps bot reviewed Oct 24, 2025

View reviewed changes

lines lenght

3ff3ca0

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps bot reviewed Oct 24, 2025

View reviewed changes

fix

8118005

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

pggPL mentioned this pull request Oct 24, 2025

Docs fix for 2.9 #2302

Closed

ksivaman self-requested a review October 27, 2025 23:30

ptrendx reviewed Oct 27, 2025

View reviewed changes

fix

1a9b993

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps bot reviewed Oct 30, 2025

View reviewed changes

docs/api/pytorch.rst Outdated Show resolved Hide resolved

pggPL added 2 commits October 30, 2025 16:11

Merge remote-tracking branch 'upstream/main' into docs_refactor

dd792f8

fix

0116d34

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps bot reviewed Oct 30, 2025

View reviewed changes

pggPL and others added 2 commits November 4, 2025 15:45

subtitle --- fix in many files:

20ba719

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

Merge branch 'main' into docs_refactor

f800a1f

greptile-apps bot reviewed Nov 4, 2025

View reviewed changes

pggPL and others added 3 commits November 4, 2025 15:54

cross entropy _input -> input rename

19da61b

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

cross entropy _input -> input rename

15c6741

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

556ab28

for more information, see https://pre-commit.ci

greptile-apps bot reviewed Nov 4, 2025

View reviewed changes

ksivaman reviewed Nov 4, 2025

View reviewed changes

docs/api/jax.rst Outdated Show resolved Hide resolved

ksivaman reviewed Nov 4, 2025

View reviewed changes

transformer_engine/pytorch/attention/dot_product_attention/dot_product_attention.py Outdated Show resolved Hide resolved

ksivaman reviewed Nov 4, 2025

View reviewed changes

transformer_engine/pytorch/cross_entropy.py Show resolved Hide resolved

greptile-apps bot reviewed Nov 20, 2025

View reviewed changes

Merge branch 'main' into docs_refactor

837c9ae

ksivaman previously approved these changes Nov 25, 2025

View reviewed changes

greptile-apps bot reviewed Nov 25, 2025

View reviewed changes

docs/api/pytorch.rst Show resolved Hide resolved

Fix import

d6639ea

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

ksivaman dismissed their stale review via d6639ea November 25, 2025 21:31

Merge branch 'main' into docs_refactor

0eed047

greptile-apps bot reviewed Nov 25, 2025

View reviewed changes

ksivaman previously approved these changes Nov 25, 2025

View reviewed changes

ptrendx reviewed Nov 25, 2025

View reviewed changes

Fix more imports

57d5fec

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

ksivaman dismissed their stale review via 57d5fec November 25, 2025 23:41

greptile-apps bot reviewed Nov 25, 2025

View reviewed changes

greptile-apps bot reviewed Nov 26, 2025

View reviewed changes

[pre-commit.ci] auto fixes from pre-commit.com hooks

19522d5

for more information, see https://pre-commit.ci

greptile-apps bot reviewed Nov 26, 2025

View reviewed changes

Merge branch 'main' into docs_refactor

5e0c0c7

greptile-apps bot reviewed Nov 26, 2025

View reviewed changes

Merge remote-tracking branch 'upstream/main' into docs_refactor

634a477

greptile-apps bot reviewed Nov 26, 2025

View reviewed changes

fix

78ce893

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

greptile-apps bot reviewed Nov 26, 2025

View reviewed changes

ksivaman approved these changes Nov 26, 2025

View reviewed changes

ksivaman merged commit df39a7c into NVIDIA:main Nov 26, 2025
11 of 13 checks passed

		torch_version = PkgVersion(str(torch.__version__)).release
		assert torch_version >= (2, 1), f"Minimum torch version 2.1 required. Found {torch_version}."

Docs fix #2301

Docs fix #2301

Uh oh!

Conversation

pggPL commented Oct 24, 2025

Description

Type of change

Checklist:

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Greptile Overview

Greptile Summary

Important Files Changed

Confidence score: 2/5

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

greptile-apps bot left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Greptile Overview

Greptile Summary

Important Files Changed

Confidence score: 3/5

Additional Comments (1)

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Greptile Overview

Greptile Summary

Important Files Changed

Confidence score: 5/5

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Greptile Overview

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Greptile Overview

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Greptile Overview

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment •

edited

Loading