[gdn] support GDN CP #16

Merged
Jintao-Huang merged 5 commits into modelscope:main from Jintao-Huang:support_GDN_CP
Apr 6, 2026

Conversation

@Jintao-Huang
Collaborator

No description provided.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for context parallelism (CP) in the GatedDeltaNet module by adjusting sequence length calculations and implementing all-to-all communication between CP and head parallelism domains. The changes also include CP-aware parameter fetching for convolutions and gate calculations. Review feedback identified several critical issues, including a missing import for `tensor_a2a_hp2cp`, a `NameError` caused by a missing `self.` reference to `A_log`, and a potential configuration error in the `groups` parameter of the manual `F.conv1d` call.
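To make the `groups` concern concrete: in a depthwise convolution every channel is convolved only with its own filter, which in `F.conv1d` terms means `groups` has to equal the number of input channels actually passed in. Once a rank holds only a slice of the channels, that number is the local channel count. The following is a minimal pure-Python sketch of that property, not the code from this PR; the function name, the sizes, and the rank layout are all made up for illustration.

```python
def depthwise_causal_conv1d(x, weight):
    """Toy depthwise causal conv (hypothetical helper, not from the PR).

    x:      [channels][seq_len]
    weight: [channels][kernel_size]
    Each channel uses only its own filter (the groups == channels case).
    """
    out = []
    for chan, taps in zip(x, weight):
        k = len(taps)
        padded = [0.0] * (k - 1) + list(chan)  # left-pad so the output is causal
        out.append([
            sum(taps[j] * padded[t + j] for j in range(k))
            for t in range(len(chan))
        ])
    return out


# Assumed toy sizes: 4 channels split over 2 ranks, 2 local channels each.
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 0.0, 1.0]]
w = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 1.0]]

full = depthwise_causal_conv1d(x, w)

rank, world = 1, 2
local = len(x) // world
sl = slice(rank * local, (rank + 1) * local)

# Slicing input channels and filters together reproduces the matching slice
# of the full output; the "group count" of the local call is `local`.
local_out = depthwise_causal_conv1d(x[sl], w[sl])
assert local_out == full[sl]
```

The sketch only shows why the weight slice and the group count have to move together; how the PR actually fetches the sliced parameters is not visible in this conversation.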

@Jintao-Huang
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request implements Context Parallelism (CP) support within the GatedDeltaNet module, including adjustments for sequence length, parameter slicing for depthwise convolutions, and All-to-All communication for projections. The review feedback identifies critical issues with the sequence unpacking logic used during All-to-All operations, which is noted as being both inefficient and incorrect for packed sequences. Additionally, a bug was found in the `F.conv1d` call where the `groups` parameter could be incorrectly set to `None`, and it was suggested to remove the `_unpack_sequence` helper function to simplify the implementation.
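For readers unfamiliar with the All-to-All step, here is a single-process toy simulation of the layout change it performs: in the CP layout each rank holds a chunk of the sequence with all heads, while in the head-parallel layout each rank holds the full sequence with a subset of heads. This is a sketch under simplifying assumptions (contiguous sequence chunks, one unpacked sequence, head count divisible by the number of ranks); `toy_cp2hp` and `toy_hp2cp` are hypothetical names and are not the helpers used in the PR.

```python
def toy_cp2hp(cp_shards):
    """cp_shards[r][t][h]: rank r holds its sequence chunk with all heads.
    Returns hp[r][t][h_local]: full sequence, heads split across ranks."""
    world = len(cp_shards)
    per_rank = len(cp_shards[0][0]) // world
    hp = []
    for dst in range(world):
        rows = []
        for src in range(world):  # rank order restores sequence order
            for token in cp_shards[src]:
                rows.append(token[dst * per_rank:(dst + 1) * per_rank])
        hp.append(rows)
    return hp


def toy_hp2cp(hp_shards):
    """Inverse of toy_cp2hp: gather heads back, re-split the sequence."""
    world = len(hp_shards)
    chunk = len(hp_shards[0]) // world
    cp = []
    for dst in range(world):
        rows = []
        for t in range(dst * chunk, (dst + 1) * chunk):
            token = []
            for src in range(world):  # rank order restores head order
                token.extend(hp_shards[src][t])
            rows.append(token)
        cp.append(rows)
    return cp


# Assumed toy sizes: 2 ranks, 4 tokens, 4 heads; each entry is a label.
world, seq_len, heads = 2, 4, 4
chunk = seq_len // world
cp = [
    [[f"t{t}h{h}" for h in range(heads)] for t in range(r * chunk, (r + 1) * chunk)]
    for r in range(world)
]

hp = toy_cp2hp(cp)
assert hp[1][3] == ["t3h2", "t3h3"]  # rank 1: last token, heads 2 and 3
assert toy_hp2cp(hp) == cp           # the round trip is lossless
```

With packed sequences or a non-contiguous CP split, the token order seen after the exchange no longer matches this simple concatenation, which is the kind of case the review flags as needing care.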

@Jintao-Huang
Collaborator Author

NVIDIA/Megatron-LM#2644

@Jintao-Huang Jintao-Huang merged commit 63e9036 into modelscope:main Apr 6, 2026
1 check passed

2 participants