Conversation
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID:
📒 Files selected for processing (1)
💤 Files with no reviewable changes (1)
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 2 passed
Comment
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/workloads/megatron_bridge/test_command_gen_strategy_slurm.py`:
- Around line 380-410: The test test_gpus_per_node reuses the same output_subdir
("out_gpus") for all parametrized cases, which can cause cross-test interference;
update the test to supply a unique output_subdir per case by adding a distinct
value to the parameter set or computing one from the inputs (e.g., include
cmd_args_gpus_per_node and system_gpus_per_node in the subdir name) when calling
make_test_run so each invocation creates/uses its own directory; adjust the
parametrize tuple or the call site where make_test_run(...) is invoked to use
that unique subdirectory.
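The unique-subdirectory fix described above can be sketched as a small pure helper. The names `make_test_run`, `cmd_args_gpus_per_node`, and `system_gpus_per_node` come from the review comment; the helper itself and its naming scheme are hypothetical, not part of the repository:

```python
# Hypothetical helper sketching the review suggestion: derive a distinct
# output_subdir from each parametrized case's inputs, so invocations of
# test_gpus_per_node never share a directory.

def unique_output_subdir(cmd_args_gpus_per_node: int, system_gpus_per_node: int) -> str:
    """Build a per-case subdirectory name from the case's GPU counts."""
    return f"out_gpus_cmd{cmd_args_gpus_per_node}_sys{system_gpus_per_node}"

# Each parametrized case would then call something like:
#   make_test_run(output_subdir=unique_output_subdir(cmd_gpus, sys_gpus))
```

Because the name encodes both inputs, any two distinct parameter tuples map to distinct directories without adding an extra field to the parametrize tuple.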
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a9f4c7f8-41d4-49b2-a082-37db19e94f2b
📒 Files selected for processing (3)
src/cloudai/workloads/megatron_bridge/megatron_bridge.py
src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py
tests/workloads/megatron_bridge/test_command_gen_strategy_slurm.py
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/workloads/megatron_bridge/test_command_gen_strategy_slurm.py (2)
403-411: 🧹 Nitpick | 🔵 Trivial

Consider cleaning up the stale `cmd_args_overrides` parameter.

The test still passes `cmd_args_overrides={"gpus_per_node": 2}` (line 407), but with the PR changes, `gpus_per_node` in command args is no longer used; the system configuration value is used instead. This override has no effect on the test outcome. To improve test clarity, consider:
- Removing the stale override: `make_test_run(output_subdir="out_no_gpu_directives")`
- Explicitly setting `configured_slurm_system.gpus_per_node` to verify the directive is skipped regardless of the system value

♻️ Suggested clarification

```diff
 def test_gpus_per_node_skipped_when_gpu_directives_unsupported(
     self, configured_slurm_system: SlurmSystem, make_test_run: Callable[..., TestRun]
 ) -> None:
     configured_slurm_system.supports_gpu_directives_cache = False
-    tr = make_test_run(cmd_args_overrides={"gpus_per_node": 2}, output_subdir="out_no_gpu_directives")
+    configured_slurm_system.gpus_per_node = 4  # Set explicitly to verify directive is skipped
+    tr = make_test_run(output_subdir="out_no_gpu_directives")
     cmd_gen = MegatronBridgeSlurmCommandGenStrategy(configured_slurm_system, tr)
     wrapper_content = self._wrapper_content(cmd_gen)
-    assert "gpus-per-node=2" not in wrapper_content
-    assert "gres=gpu:2" not in wrapper_content
+    assert "gpus-per-node=" not in wrapper_content
+    assert "gres=gpu:" not in wrapper_content
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/workloads/megatron_bridge/test_command_gen_strategy_slurm.py` around lines 403-411: Update the test_gpus_per_node_skipped_when_gpu_directives_unsupported test to remove the stale cmd_args_overrides by calling make_test_run without the {"gpus_per_node": 2} override and instead explicitly set configured_slurm_system.gpus_per_node to a representative value (e.g., 2 or 4) before constructing MegatronBridgeSlurmCommandGenStrategy; keep configured_slurm_system.supports_gpu_directives_cache = False and assert that "gpus-per-node=..." and "gres=gpu:..." do not appear in the wrapper_content to verify the system-level GPU setting is ignored when GPU directives are unsupported.
77-77: ⚠️ Potential issue | 🟡 Minor

Remove unused `gpus_per_node` from test fixture and test overrides.

The `gpus_per_node` field at line 77 (and line 407 in `test_gpus_per_node_skipped_when_gpu_directives_unsupported`) is not a valid field in `MegatronBridgeCmdArgs`. While it is silently accepted because `CmdArgs` has `extra="allow"`, it is never used and creates confusion. Remove it from the fixture and test overrides to clarify that `gpus_per_node` is a system-level config, not a command argument.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/workloads/megatron_bridge/test_command_gen_strategy_slurm.py` at line 77: Remove the unused "gpus_per_node" key from the test fixture and any test overrides that build or pass arguments for MegatronBridgeCmdArgs (e.g., the fixture used in test_command_gen_strategy_slurm and the override in test_gpus_per_node_skipped_when_gpu_directives_unsupported); since MegatronBridgeCmdArgs does not define gpus_per_node (CmdArgs allows extras), delete those entries and any assertions expecting it so the tests only use valid MegatronBridgeCmdArgs fields.
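One way to make the cleanup above mechanical is to filter fixture arguments down to known fields before they reach the model, so extras allowed by `extra="allow"` cannot silently linger. This is only an illustrative sketch; the field set below is a placeholder, since `MegatronBridgeCmdArgs`'s actual schema is not shown in this review:

```python
# Illustrative sketch only: drop keys the command-args model does not define,
# so an extra like "gpus_per_node" (accepted by extra="allow" but never read)
# cannot survive in test fixtures. VALID_CMD_ARG_FIELDS is an assumed
# placeholder, not the real MegatronBridgeCmdArgs schema.
VALID_CMD_ARG_FIELDS = {"docker_image_url", "recipe_name"}

def strip_unknown_cmd_args(cmd_args: dict) -> dict:
    """Return only the entries whose keys the model actually defines."""
    return {k: v for k, v in cmd_args.items() if k in VALID_CMD_ARG_FIELDS}
```

A fixture could route its argument dict through such a filter, making any stale key a no-op instead of a source of confusion.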
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/ref_data/megatron-bridge.sbatch`:
- Line 22: Remove the unused gpus_per_node entry from the test configuration in
test_acceptance.py so the test no longer supplies an extra field that
MegatronBridgeCmdArgs ignores; keep num_gpus (or num_gpus=8) as the
authoritative GPU count used to generate the sbatch line, and update any related
test expectations/comments to reflect 8 GPUs rather than the removed
gpus_per_node value; do not change MegatronBridgeCmdArgs (which uses
ConfigDict(extra="allow")), only remove the redundant gpus_per_node key from the
test data.
---
Outside diff comments:
In `@tests/workloads/megatron_bridge/test_command_gen_strategy_slurm.py`:
- Around line 403-411: Update the
test_gpus_per_node_skipped_when_gpu_directives_unsupported test to remove the
stale cmd_args_overrides by calling make_test_run without the {"gpus_per_node":
2} override and instead explicitly set configured_slurm_system.gpus_per_node to
a representative value (e.g., 2 or 4) before constructing
MegatronBridgeSlurmCommandGenStrategy; keep
configured_slurm_system.supports_gpu_directives_cache = False and assert that
"gpus-per-node=..." and "gres=gpu:..." do not appear in the wrapper_content to
verify the system-level GPU setting is ignored when GPU directives are
unsupported.
- Line 77: Remove the unused "gpus_per_node" key from the test fixture and any
test overrides that build or pass arguments for MegatronBridgeCmdArgs (e.g., the
fixture used in test_command_gen_strategy_slurm and the override in
test_gpus_per_node_skipped_when_gpu_directives_unsupported); since
MegatronBridgeCmdArgs does not define gpus_per_node (CmdArgs allows extras),
delete those entries and any assertions expecting it so the tests only use valid
MegatronBridgeCmdArgs fields.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0d455467-ff08-4fb5-a671-96bdeabbee73
📒 Files selected for processing (4)
src/cloudai/workloads/megatron_bridge/megatron_bridge.py
src/cloudai/workloads/megatron_bridge/slurm_command_gen_strategy.py
tests/ref_data/megatron-bridge.sbatch
tests/workloads/megatron_bridge/test_command_gen_strategy_slurm.py
💤 Files with no reviewable changes (1)
- src/cloudai/workloads/megatron_bridge/megatron_bridge.py
Summary
gpus_per_node if available
Test Plan
Additional Notes