@Jintao-Huang (Collaborator) commented Oct 17, 2025

Features:

  1. Training can load weights directly from safetensors files and save checkpoints back in safetensors format, which eliminates the separate model conversion step.
  2. Multi-node weight conversion, so that extremely large models can be converted.
  3. GRPO support: the safetensors-format state_dict is generated from the Megatron model in a streaming fashion.
  4. LoRA weight export.
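
To illustrate what "streaming" a safetensors state_dict can mean in item 3, here is a minimal sketch that is not the code in this PR. It uses only the Python standard library and relies on the published safetensors file layout (an 8-byte little-endian header length, a JSON header, then the raw tensor bytes). The helper names `write_safetensors` and `stream_safetensors` are hypothetical, and tensors are handled as raw bytes rather than framework tensors.

```python
import json
import os
import struct
import tempfile
from typing import Dict, Iterator, List, Tuple

# name -> (dtype string, shape, raw little-endian bytes); hypothetical helper type
RawTensors = Dict[str, Tuple[str, List[int], bytes]]


def write_safetensors(path: str, tensors: RawTensors) -> None:
    """Write raw tensors using the safetensors layout (illustrative only)."""
    header, offset = {}, 0
    for name, (dtype, shape, data) in tensors.items():
        # Offsets are relative to the start of the data section.
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + len(data)]}
        offset += len(data)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)
        for _dtype, _shape, data in tensors.values():
            f.write(data)


def stream_safetensors(path: str) -> Iterator[Tuple[str, dict, bytes]]:
    """Yield (name, header entry, raw bytes) one tensor at a time."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        data_start = 8 + header_len
        for name, info in header.items():
            if name == "__metadata__":  # optional free-form metadata entry
                continue
            begin, end = info["data_offsets"]
            f.seek(data_start + begin)
            # Only this tensor's bytes are held in memory at this point.
            yield name, info, f.read(end - begin)


# Usage: write two small float32 tensors, then stream them back.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.safetensors")
    write_safetensors(demo_path, {
        "layer.weight": ("F32", [2, 2], struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)),
        "layer.bias": ("F32", [2], struct.pack("<2f", 0.5, -0.5)),
    })
    streamed = [(name, info["shape"], struct.unpack(f"<{len(data) // 4}f", data))
                for name, info, data in stream_safetensors(demo_path)]
```

Because the reader is a generator, a consumer can process or forward each tensor as it arrives instead of building the whole state_dict first, which is the general property a streaming export relies on for very large models.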

@gemini-code-assist (Contributor)

Note

Gemini is unable to generate a summary for this pull request due to the file types involved not being currently supported.

@arvyanh commented Oct 21, 2025

Is there a timeline for when this will be implemented? I have been running into disk space and efficiency issues with the 235B and 671B models lately.
I could try to implement it myself if no one is working on it right now.
