Skip to content

Conversation

@pggPL
Copy link
Collaborator

@pggPL pggPL commented Oct 24, 2025

Description

Our documentation returned a lot of warnings and it seems that some of them were rational. It turned out that half of our PyTorch API was not rendered. This PR fixes all the warnings and forces Github workflow to error out if any docs warning will appear.

The most problematic error was related to cyclic imports - this resulted in part of our PyTorch API not being rendered. Other were mostly related to wrong formatting and some warnings didn't result in anything wrong.

Our docs for 2.9 contains only part of PyTorch API exposed in 2.8, so this will need urgent fix I will update in separate PR.

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Checklist:

  • [ x] I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This PR resolves critical documentation rendering issues where approximately half of the PyTorch API was not being rendered in the generated documentation. The root causes were cyclic import dependencies and reStructuredText formatting violations. The fix involves restructuring imports across JAX and PyTorch modules to break circular dependencies (primarily moving QuantizeLayout and torch_version imports to their source modules), correcting RST section underlines to match header text lengths, converting docstring formatting to proper RST syntax, and enforcing strict documentation builds in CI by adding the -W flag to treat warnings as errors.

Important Files Changed

Filename Score Overview
.github/workflows/docs.yml 5/5 Added -W flag to Sphinx builds to make documentation warnings fatal in CI
transformer_engine/pytorch/utils.py 5/5 Converted torch_version from module-level import to cached function to break cyclic dependencies
transformer_engine/pytorch/cross_entropy.py 5/5 Converted parallel_cross_entropy from bare function reference to documented wrapper function
transformer_engine/jax/quantize/quantizer.py 5/5 Removed QuantizeLayout from __all__ to break circular dependency chain
transformer_engine/jax/quantize/hadamard.py 5/5 Changed QuantizeLayout import from delayed local import to top-level C++ extension import
transformer_engine/jax/cpp_extensions/misc.py 5/5 Moved QuantizeLayout import from relative path to C++ extension to break cycle
transformer_engine/jax/cpp_extensions/activation.py 5/5 Moved QuantizeLayout import from quantize module to C++ extension module
transformer_engine/jax/cpp_extensions/quantization.py 5/5 Moved QuantizeLayout import from quantize module to C++ extension module
transformer_engine/jax/cpp_extensions/gemm.py 5/5 Moved QuantizeLayout import from quantize module to C++ extension module
transformer_engine/jax/cpp_extensions/normalization.py 5/5 Moved QuantizeLayout import from quantize module to C++ extension module
transformer_engine/jax/dense.py 5/5 Moved QuantizeLayout import from local module to top-level package
transformer_engine/pytorch/jit.py 5/5 Changed torch_version import from current module to utils submodule
transformer_engine/pytorch/distributed.py 5/5 Changed torch_version import from package-level to explicit utils module
transformer_engine/pytorch/quantization.py 5/5 Reorganized imports with proper blank line separation for clarity
transformer_engine/pytorch/module/linear.py 5/5 Changed torch_version import to come from utils submodule
transformer_engine/pytorch/module/layernorm_linear.py 5/5 Changed torch_version import to come from utils submodule
transformer_engine/pytorch/module/layernorm_mlp.py 4/5 Changed torch_version import and improved docstring RST formatting
transformer_engine/pytorch/ops/_common.py 5/5 Changed torch_version import to come from utils submodule
transformer_engine/pytorch/ops/basic/l2normalization.py 5/5 Changed torch_version import to come from utils submodule
transformer_engine/pytorch/transformer.py 5/5 Fixed torch_version import and improved docstring RST formatting
transformer_engine/pytorch/module/base.py 5/5 Converted docstring code blocks from markdown to RST syntax
transformer_engine/pytorch/attention/dot_product_attention/dot_product_attention.py 5/5 Improved RST formatting of docstring with proper inline code and bullet lists
transformer_engine/pytorch/attention/multi_head_attention.py 5/5 Improved RST formatting of docstring with proper inline code and bullet lists
transformer_engine/jax/flax/transformer.py 5/5 Converted ASCII table in docstring to RST table directive
docs/conf.py 4/5 Added custom logging filter to suppress unavoidable duplicate namespace warnings
docs/api/pytorch.rst 4/5 Reorganized API documentation with new sections and deprecated functions section
docs/debug.rst 5/5 Added blank line after copyright header to fix RST parsing
docs/debug/api.rst 5/5 Added blank line after copyright header to fix RST parsing
docs/debug/3_api_features.rst 5/5 Extended section underline to match header length
docs/debug/3_api_debug_setup.rst 5/5 Extended section underlines to match header lengths
docs/debug/4_distributed.rst 5/5 Extended section underlines to match header lengths
docs/debug/2_config_file_structure.rst 5/5 Extended section underlines to match header lengths
docs/debug/1_getting_started.rst 0/5 Section underlines still don't match header text lengths exactly - will cause RST warnings
docs/examples/attention/attention.ipynb 5/5 Updated API reference links to lowercase format for Sphinx compatibility
docs/examples/te_gemma/tutorial_generation_gemma_with_te.ipynb 5/5 Fixed broken internal documentation link

Confidence score: 2/5

  • This PR contains one file (docs/debug/1_getting_started.rst) with incorrect RST section underline lengths that will cause Sphinx warnings, directly contradicting the PR's goal of eliminating all warnings
  • All other changes are formatting and import refactoring with minimal risk, but the incomplete fix in 1_getting_started.rst will cause the new CI check (which treats warnings as errors) to fail
  • Pay close attention to docs/debug/1_getting_started.rst - the underlines need to match the exact character length of each section header, not use a fixed-width approach

35 files reviewed, 8 comments

Edit Code Review Agent Settings | Greptile

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This review covers only the most recent changes that address syntax and formatting issues identified in previous reviews. The developer has made targeted fixes to resolve trailing whitespace, RST heading underline mismatches, and missing newlines across several documentation files. These changes are part of the broader documentation quality enforcement effort.

The recent commits address:

  • Trailing whitespace in docs/conf.py (line 109)
  • RST heading underline length mismatches in docs/debug/1_getting_started.rst (6 sections)
  • Missing newline at end of file in docs/api/pytorch.rst

Important Files Changed

Filename Score Overview
docs/conf.py 5/5 Fixed trailing whitespace on line 109
docs/debug/1_getting_started.rst 3/5 Adjusted RST heading underlines but mismatches remain between title lengths and underline lengths
docs/api/pytorch.rst 5/5 Added missing newline at end of file

Confidence score: 3/5

  • This PR contains syntax fixes but incomplete RST formatting corrections that will still cause Sphinx warnings
  • Score reflects that while trailing whitespace and missing newlines are fixed, the RST heading underlines in docs/debug/1_getting_started.rst still don't match their title character counts (e.g., "Getting started" has 15 chars but 31 '=' underline chars), which will cause Sphinx build failures when the -W flag is active
  • Pay close attention to docs/debug/1_getting_started.rst - all six section heading underlines need to be adjusted to exactly match their title character counts before the CI will pass

Additional Comments (1)

  1. docs/debug/3_api_debug_setup.rst, line 89 (link)

    syntax: missing newline at end of file

34 files reviewed, 12 comments

Edit Code Review Agent Settings | Greptile

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This review covers only the changes made since the last review, not the entire PR. The developer has successfully resolved all remaining ReStructuredText (RST) formatting issues across five debug documentation files. These changes correct section underline mismatches where underline lengths did not match their corresponding title lengths—a strict RST syntax requirement that causes Sphinx documentation build warnings. The fixes affect files in docs/debug/ including 1_getting_started.rst, 3_api_debug_setup.rst, 3_api_features.rst, api.rst, and 4_distributed.rst. Each correction ensures that section titles and their underlines have exactly matching character counts (e.g., "API" with 3 '=' chars, "Getting started" with 15 '=' chars), which eliminates documentation warnings and ensures proper rendering of the PyTorch API documentation. These changes are part of a broader effort to fix all documentation warnings and enforce clean doc builds in CI.

Important Files Changed

Filename Score Overview
docs/debug/3_api_debug_setup.rst 5/5 Fixed RST underline lengths for three function headers (initialize(), set_tensor_reduction_group(), set_weight_tensor_tp_group_reduce())
docs/debug/3_api_features.rst 5/5 Corrected section underline from 27 to 14 '=' chars to match "Debug features" title
docs/debug/api.rst 5/5 Fixed "API" section underline from 12 to 3 '=' chars
docs/debug/4_distributed.rst 5/5 Corrected underlines for main title and four subsections throughout the distributed training documentation
docs/debug/1_getting_started.rst 5/5 Fixed six section underlines to match title lengths across getting started guide

Confidence score: 5/5

  • This PR is safe to merge with minimal risk—all changes are pure RST formatting corrections with no functional impact
  • Score reflects that these are syntax-only corrections (underline length matching) with zero functional changes, no logic modifications, and no code execution paths affected
  • No files require special attention—all changes follow the same trivial pattern of correcting underline character counts to match section titles

5 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
@pggPL pggPL mentioned this pull request Oct 24, 2025
@ksivaman ksivaman self-requested a review October 27, 2025 23:30
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This PR fixes critical documentation build warnings that prevented half of the PyTorch API from being rendered in v2.9 documentation. The changes resolve cyclic import issues, improve RST formatting, and enforce warning-free builds in CI.

Key changes:

  • Fixed cyclic import by moving torch_version from a cached function in __init__.py to a module-level variable in utils.py, with updated imports across affected modules
  • Reorganized docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions) for better API discoverability
  • Improved RST docstring formatting with proper code literals (backticks), bullet lists, and table syntax for better Sphinx rendering
  • Added SPHINXOPTS="-W" to GitHub workflow to fail builds on warnings, preventing future documentation regressions
  • Added custom logging filter in conf.py to suppress unavoidable duplicate C++ namespace warnings
  • Cleared Jupyter notebook execution counts to reduce version control noise

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - it fixes documentation issues without affecting runtime behavior
  • All changes are focused on documentation quality improvements and formatting fixes. The cyclic import fix is well-executed with consistent updates across all dependent modules. The workflow enforcement ensures future documentation quality. Only minor formatting issue found (missing newline)
  • No files require special attention - only a trivial missing newline in docs/api/pytorch.rst

Important Files Changed

File Analysis

Filename Score Overview
.github/workflows/docs.yml 5/5 Added -W flag to Sphinx build to error on warnings, enforcing documentation quality
docs/conf.py 5/5 Added custom logging filter to suppress duplicate C++ namespace warnings, updated exclude patterns
transformer_engine/pytorch/init.py 5/5 Moved torch_version from function to module-level variable, fixing cyclic import issues
docs/api/pytorch.rst 4/5 Reorganized API docs with sections, moved deprecated functions, fixed rendering issues

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GHA as GitHub Actions
    participant Sphinx as Sphinx Builder
    participant Code as Python Code
    participant Docs as Documentation

    Dev->>Code: Fix cyclic imports (torch_version)
    Dev->>Code: Update import paths
    Dev->>Docs: Reorganize API structure
    Dev->>Docs: Fix RST formatting issues
    Dev->>GHA: Add SPHINXOPTS="-W" flag
    
    Note over Dev,Docs: PR Changes Applied
    
    GHA->>Sphinx: Trigger docs build with -W
    Sphinx->>Code: Import Python modules
    Code-->>Sphinx: Successfully imported (no cycles)
    Sphinx->>Docs: Parse RST files
    Docs-->>Sphinx: Valid RST syntax
    Sphinx->>Sphinx: Apply custom warning filter
    Sphinx-->>GHA: Build succeeds (no warnings)
    
    Note over GHA: Future builds will error<br/>if warnings appear
Loading

9 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

pggPL added 2 commits October 30, 2025 16:11
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Fixed critical documentation build issues that prevented half of the PyTorch API from rendering in version 2.9. Resolved cyclic import in pytorch/__init__.py by moving torch_version() to utils.py, and enforced documentation quality by adding -W flag to Sphinx build.

  • Cyclic import resolution: Moved torch_version() from pytorch/__init__.py to pytorch/utils.py to break import cycle that prevented API documentation generation
  • Sphinx enforcement: Added -W flag to GitHub workflow to treat documentation warnings as build errors
  • RST formatting fixes: Corrected underline mismatches in debug documentation files (15+ instances across multiple files)
  • Docstring improvements: Enhanced formatting with proper RST syntax (backticks, bullet lists) in attention, cross_entropy, and other modules
  • API documentation reorganization: Restructured pytorch.rst with logical sections and moved deprecated functions to the end
  • Import corrections: Fixed several import issues in JAX and PyTorch modules (e.g., QuantizeLayout in JAX, torch_version references)
  • Warning suppression: Added custom logging filter for unavoidable duplicate C++ declaration warnings

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - it fixes critical documentation issues without changing runtime behavior
  • All changes are documentation-focused (RST formatting, docstrings, import reorganization). The cyclic import fix is a proper refactoring that maintains the same API surface. The workflow change enforces quality going forward. No logic changes to core functionality.
  • No files require special attention

Important Files Changed

File Analysis

Filename Score Overview
.github/workflows/docs.yml 5/5 Added -W flag to make Sphinx treat warnings as errors, enforcing documentation quality
docs/conf.py 5/5 Added custom logging filter to suppress duplicate C++ declaration warnings for transformer_engine namespace
transformer_engine/pytorch/init.py 5/5 Fixed cyclic import by removing torch_version() function from __init__.py (moved to utils.py)
transformer_engine/pytorch/utils.py 5/5 Added torch_version() function here to resolve cyclic import issues
docs/api/pytorch.rst 5/5 Reorganized API documentation with sections, moved deprecated functions to the end, added missing newline
transformer_engine/pytorch/cross_entropy.py 5/5 Added explicit function wrapper with full docstring for parallel_cross_entropy to improve API documentation

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub Workflow
    participant Sphinx as Sphinx Builder
    participant Init as pytorch/__init__.py
    participant Utils as pytorch/utils.py
    
    Dev->>GH: Push docs changes
    GH->>Sphinx: Run build with -W flag
    
    Note over Sphinx: Before: Warnings ignored
    Note over Sphinx: After: Warnings = Errors
    
    Sphinx->>Sphinx: Check RST formatting
    Sphinx->>Sphinx: Check docstring syntax
    
    Note over Init,Utils: Cyclic Import Fix
    Init->>Utils: torch_version() moved here
    Utils->>Utils: Function callable via import
    
    Sphinx->>Sphinx: Parse API docs successfully
    Sphinx->>GH: Build succeeds (no warnings)
    GH->>Dev: CI passes
Loading

35 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

pggPL and others added 2 commits November 4, 2025 15:45
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

Fixed critical documentation build warnings that prevented half of the PyTorch API from being rendered in version 2.9 docs. The PR resolves cyclic import issues and enforces documentation quality going forward.

Key Changes:

  • Cyclic Import Fix: Moved torch_version() function from pytorch/__init__.py to pytorch/utils.py to break circular dependency that prevented Sphinx from properly importing and documenting modules
  • Workflow Enforcement: Added -W flag to Sphinx build to error out on any documentation warnings, preventing future documentation regressions
  • RST Formatting: Fixed hundreds of reStructuredText formatting issues including:
    • Section underline length mismatches in debug documentation
    • Code block syntax (converted markdown triple backticks to RST :: syntax)
    • Improved docstring formatting for mathematical expressions and bullet lists
    • Proper use of inline code with double backticks
  • API Documentation Reorganization: Restructured docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions)
  • Function Wrappers: Added proper function wrappers with complete docstrings (e.g., parallel_cross_entropy) to replace bare Function.apply assignments that Sphinx couldn't document
  • Warning Suppression: Added custom logging filter in docs/conf.py to suppress unavoidable duplicate C++ declaration warnings for the transformer_engine namespace

Confidence Score: 5/5

  • This PR is safe to merge - it only fixes documentation and doesn't change runtime behavior
  • All changes are documentation-focused: RST formatting fixes, docstring improvements, reorganization of docs structure, and a critical cyclic import fix that only affects Sphinx's ability to introspect the codebase. The workflow change ensures future documentation quality. No runtime logic changes.
  • No files require special attention - all changes are straightforward documentation fixes

Important Files Changed

File Analysis

Filename Score Overview
.github/workflows/docs.yml 5/5 Added -W flag to error out on documentation warnings
transformer_engine/pytorch/init.py 5/5 Removed torch_version() function to break cyclic import; moved to utils.py
transformer_engine/pytorch/utils.py 5/5 Added torch_version() function moved from init.py to resolve cyclic imports
transformer_engine/pytorch/cross_entropy.py 5/5 Added proper wrapper function with full docstring for parallel_cross_entropy for documentation
docs/conf.py 5/5 Added custom logging filter to suppress duplicate C++ declaration warnings
docs/api/pytorch.rst 5/5 Reorganized API documentation sections and moved deprecated functions to dedicated section

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant Sphinx as Sphinx Build
    participant PyInit as pytorch/__init__.py
    participant Utils as pytorch/utils.py
    participant Docs as Documentation Output

    Note over Dev,Docs: Before: Cyclic Import Issue
    Sphinx->>PyInit: Import module
    PyInit->>PyInit: torch_version() function
    PyInit->>Utils: Import utils
    Utils->>PyInit: Import torch_version (CYCLIC!)
    Note over Sphinx: API not rendered properly

    Note over Dev,Docs: After: Fixed Import Chain
    Sphinx->>PyInit: Import module
    PyInit->>PyInit: torch_version as variable
    PyInit->>Utils: Import utils
    Utils->>Utils: torch_version() function
    Sphinx->>Docs: All APIs rendered correctly
    
    Note over Dev,Docs: Workflow Enforcement
    Dev->>Sphinx: Build docs with -W flag
    alt Warnings Found
        Sphinx-->>Dev: Build FAILS
    else No Warnings
        Sphinx-->>Docs: Build succeeds
    end
Loading

38 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

pggPL and others added 3 commits November 4, 2025 15:54
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This PR fixes critical documentation issues that prevented half of the PyTorch API from being rendered in Sphinx documentation and ensures future documentation quality through CI enforcement.

Key Changes

  • Cyclic import fix: Removed @functools.lru_cache decorator from torch_version() in transformer_engine/pytorch/__init__.py, converting it to a direct variable assignment to resolve Sphinx import issues
  • Missing API documentation: Converted parallel_cross_entropy from a simple function alias to a proper documented function with NumPy-style docstring, making it visible in generated docs
  • RST formatting fixes: Corrected underline length mismatches across all debug documentation files (15+ instances) and fixed code block syntax from markdown triple backticks to RST double-colon (::) format
  • API reorganization: Restructured docs/api/pytorch.rst with logical sections (Recipe availability, MoE functions, Communication-computation overlap, Deprecated functions) and moved deprecated functions to the bottom
  • CI enforcement: Added SPHINXOPTS="-W" to GitHub workflow to fail builds on any Sphinx warnings, preventing future documentation regressions
  • Warning suppression: Added custom Sphinx setup to filter unavoidable duplicate C++ namespace warnings

Impact

This addresses the urgent issue where v2.9 docs were missing significant portions of the PyTorch API that were present in v2.8. The changes are purely documentation-focused with backward compatibility maintained (e.g., _input parameter deprecation warning in parallel_cross_entropy).

Confidence Score: 5/5

  • This PR is safe to merge - it fixes critical documentation issues without changing runtime behavior
  • All changes are documentation-focused (RST formatting, docstrings, Sphinx configuration) or maintain backward compatibility (cross_entropy parameter renaming with deprecation warning). The cyclic import fix simplifies code without changing functionality. CI enforcement ensures no regressions.
  • No files require special attention - all changes are well-structured documentation fixes

Important Files Changed

File Analysis

Filename Score Overview
.github/workflows/docs.yml 5/5 Added -W flag to make Sphinx error on warnings, enforcing documentation quality
docs/conf.py 4/5 Added warning filter for duplicate C++ declarations, excluded sphinx_rtd_theme from patterns
docs/api/pytorch.rst 5/5 Reorganized API docs with sections, moved deprecated functions to bottom, added missing newline
transformer_engine/pytorch/init.py 5/5 Removed torch_version() function decorator to fix cyclic import issues with documentation
transformer_engine/pytorch/cross_entropy.py 5/5 Renamed _input to input, added proper function wrapper with NumPy-style docstring

Sequence Diagram

sequenceDiagram
    participant Dev as Developer
    participant CI as GitHub CI
    participant Sphinx as Sphinx Builder
    participant Docs as Documentation
    participant PyAPI as PyTorch API
    
    Note over Dev,PyAPI: Documentation Fix PR Flow
    
    Dev->>CI: Push PR with doc fixes
    CI->>Sphinx: Run with SPHINXOPTS="-W"
    
    Note over Sphinx: Critical Fix: Cyclic Import
    Sphinx->>PyAPI: Import PyTorch modules
    PyAPI-->>Sphinx: ✓ No cyclic imports (torch_version simplified)
    
    Note over Sphinx: Fix: RST Formatting
    Sphinx->>Docs: Parse RST files
    Docs-->>Sphinx: ✓ Underlines match titles
    Docs-->>Sphinx: ✓ Code blocks use :: syntax
    
    Note over Sphinx: Fix: API Documentation
    Sphinx->>PyAPI: Generate API docs
    PyAPI-->>Sphinx: ✓ parallel_cross_entropy has docstring
    PyAPI-->>Sphinx: ✓ All functions now rendered
    
    Note over Sphinx: Enhancement: Warning Filter
    Sphinx->>Sphinx: Filter C++ namespace duplicates
    
    Sphinx-->>CI: Build success (no warnings)
    CI-->>Dev: ✓ All checks pass
Loading

38 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@pggPL
Copy link
Collaborator Author

pggPL commented Nov 4, 2025

/te-ci

Comment on lines 17 to 18
torch_version = PkgVersion(str(torch.__version__)).release
assert torch_version >= (2, 1), f"Minimum torch version 2.1 required. Found {torch_version}."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change the instances at the call site as well. For e.g. linear.py still calls torch_version()

Comment on lines 22 to 25
@functools.lru_cache(maxsize=None)
def torch_version() -> tuple[int, ...]:
"""Get PyTorch version"""
return PkgVersion(str(torch.__version__)).release
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this redefined here because of cyclic import issues when import directly from __init__.py? We should have it defined only in one location.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the init one is causing issues with import, please remove it and change all instance to import from utils instead

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is the case. And I cannot import utils before dynamically loading c++ lib, because it imports some other torch classes. So I guess there are 2 ways of doing that - create some file torch_version.py with only that or do what I did. I do not want to add file for one line function, what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think creating a standalone torch_version.py file would be the most straightforward way to untangle these issues with import ordering.

An alternative would be changing __init__.py to manually import in the right order:

from transformer_engine.common import load_framework_extension
load_framework_extension("torch")

from transformer_engine.pytorch.utils import torch_version
assert torch_version() >= (2, 1), f"Minimum torch version 2.1 required. Found {torch_version()}."

However, this is not ideal because we don't check the PyTorch version before loading transformer_engine_torch, so we might get linker errors instead of a good error message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can create a torch_version.py even if it is a single function @pggPL

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

55 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@ksivaman
Copy link
Member

/te-ci

ksivaman
ksivaman previously approved these changes Nov 25, 2025
Copy link
Member

@ksivaman ksivaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, will check artifacts after CI

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

55 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
@ksivaman
Copy link
Member

/te-ci

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

55 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

ksivaman
ksivaman previously approved these changes Nov 25, 2025
Comment on lines 680 to 682
If provided, the offload stream is used for offloading and reloading.
Otherwise, a new stream is allocated internally. It can be other than None
only if manual_synchronization is True.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change in the indentation?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

states: Dict[str, torch.Tensor]
Parameters
----------
states : Dict[str, torch.Tensor]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are other cases where there is no space between the parameter and the ':' - we should use one consistently (preferably the right one :-) ).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zrzut ekranu 2025-11-26 o 15 06 14

official numpy docs guide have space, so i changed this to have space everywhere

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

55 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

- Standardize parameter documentation to use 'param : type' format (space before and after colon) per NumPy style guide
- Fix inconsistent indentation in cpu_offload.py docstring
- Modified 51 Python files across transformer_engine/pytorch

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

86 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

86 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

86 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@ksivaman
Copy link
Member

/te-ci

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

86 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
@pggPL
Copy link
Collaborator Author

pggPL commented Nov 26, 2025

/te-ci

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

86 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@ksivaman ksivaman merged commit df39a7c into NVIDIA:main Nov 26, 2025
11 of 13 checks passed
KshitijLakhani pushed a commit that referenced this pull request Nov 27, 2025
* init

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* lines lenght

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* fix

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* fix

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* fix

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* subtitle --- fix in many files:

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* cross entropy _input -> input rename

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* cross entropy _input -> input rename

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* a lot of small fixes

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* torch_version() change

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add missing module and fix warnings

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* fix

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* removed training whitespace:

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* Update docs/api/pytorch.rst

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Paweł Gadziński <62263673+pggPL@users.noreply.github.com>

* Fix import

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

* Fix more imports

Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>

* Fix NumPy docstring parameter spacing and indentation

- Standardize parameter documentation to use 'param : type' format (space before and after colon) per NumPy style guide
- Fix inconsistent indentation in cpu_offload.py docstring
- Modified 51 Python files across transformer_engine/pytorch

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>

---------

Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>
Signed-off-by: Paweł Gadziński <62263673+pggPL@users.noreply.github.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants