Pull requests: awslabs/awsome-distributed-training

Pull requests list

feat: Support multiple SSH key types in easy-ssh.sh with auto-detection
#1031 opened Mar 20, 2026 by aravneelaws
5 of 7 tasks
Add DeepSpeed CI regression tests for QLoRA and GPT-103B
#1029 opened Mar 20, 2026 by paragao
Bump filelock from 3.16.1 to 3.20.3 in /3.test_cases/pytorch/nvrx
Labels: dependencies, python
#1027 opened Mar 18, 2026 by dependabot bot
Bump transformers from 4.48.0 to 4.53.0 in /3.test_cases/pytorch/nvrx
Labels: dependencies, python
#1026 opened Mar 18, 2026 by dependabot bot
Add NeMo RL GRPO training on P5en with EFA RDMA
#1025 opened Mar 17, 2026 by dmvevents
5 of 7 tasks
fix: overhaul CI workflows for FSDP regression tests
#1024 opened Mar 17, 2026 by paragao
Add OSMO AMR Navigation test case
#1018 opened Mar 12, 2026 by KeitaW
1 of 3 tasks
Add NeMo RL GRPO training with fault tolerance (NVRx) on EKS
#1010 opened Mar 9, 2026 by dmvevents
6 tasks
Updating CF stack for GB200 local zone deployments
#968 opened Feb 17, 2026 by KeitaW