📝 Walkthrough
Adds DSE reporting: new DSE report generator module, DSEReporter and template, unified system-metadata loader, reporter changes to include node info and emit per-iteration best-config TOML, registration/tests updated, and a minor pyproject import-linter tweak.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
🚥 Pre-merge checks: ✅ Passed checks (2 passed)
Actionable comments posted: 13
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cloudai/registration.py`:
- Around line 296-299: The registration places "dse" after "status" but
Registry.ordered_scenario_reports() doesn't include "dse" in its ordering map so
it falls into the default bucket and runs earlier; update the ordering map used
by Registry.ordered_scenario_reports() to include the "dse" key in the correct
position (between "status" and "tarball") so the DSEReporter runs in the
intended slot, and add a regression test that asserts the order (e.g., verify
Registry.ordered_scenario_reports() yields ["per_test","status","dse","tarball"]
or that DSEReporter runs after StatusReporter and before TarballReporter) to
prevent regressions; locate changes around the
Registry.ordered_scenario_reports() implementation and the registration call
using Registry().add_scenario_report("dse", DSEReporter,
ReportConfig(enable=True)).
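The ordering rule this comment asks for can be sketched in isolation; the map contents and the default bucket value below are assumptions inferred from the comment, not the actual `Registry` code:

```python
# Hypothetical ordering map; the real Registry.ordered_scenario_reports() may differ.
REPORT_ORDER = {"per_test": 0, "status": 1, "dse": 2, "tarball": 3}

def ordered_scenario_reports(names: list[str]) -> list[str]:
    # Unknown keys fall into the default bucket (-1) and therefore run first,
    # which is why "dse" must appear in the map explicitly.
    return sorted(names, key=lambda name: REPORT_ORDER.get(name, -1))
```

With `"dse"` absent from the map it would sort with key -1 and run before every mapped reporter, which is the misplacement the comment describes.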
In `@src/cloudai/report_generator/dse_report.py`:
- Around line 188-198: The current winner selection in _build_reward_chart_data
picks the step with highest reward regardless of TrajectoryStep.is_successful,
which can mark a failed run as "best"; update _build_reward_chart_data to first
filter steps to only those with is_successful True (e.g., successful_steps = [s
for s in steps if s.is_successful]) and then compute best_index using max over
successful_steps (and map back to the original index if you need the index into
the full steps list); if there are no successful steps, return None or set
best_index to None. Apply the same filtering and selection rule to the analogous
selection logic referenced in reporter.py so both outputs remain aligned.
- Around line 147-154: Wrap TOML parsing and pydantic validation calls in
try/except and return None on failure so malformed artifacts are treated as
skipped; specifically modify _step_elapsed_time to catch exceptions around
toml.load(f) and SlurmJobMetadata.model_validate(...) and return None on any
exception, and similarly wrap the code that loads best_step_dump using
toml.load(f) and TestRunDetails.model_validate(...) to catch errors and set the
result to None (or skip) instead of propagating the exception.
In `@src/cloudai/report_generator/util.py`:
- Around line 198-205: The code currently picks node_files[0] after
metadata_path.glob("node-*.toml"), which is nondeterministic; make selection
deterministic by sorting the glob result (e.g., sorted(node_files) or sorting by
.name) before picking the first element so the file chosen for
SlurmSystemMetadata.model_validate is stable across runs; update the logic
around node_files and the with node_files[0].open() usage to use the
sorted/ordered list.
- Around line 191-207: Narrow the exception handling and make node file
selection deterministic: replace the broad `except Exception as exc:` that
follows `toml.load(f)` and `SlurmSystemMetadata.model_validate(...)` with a
specific except clause catching `toml.TomlDecodeError` and the pydantic
validation error you use elsewhere (e.g., `ValidationError`) so the error line
becomes `except (toml.TomlDecodeError, ValidationError) as exc:`; also ensure
`node_files` is deterministic by using `node_files =
sorted(metadata_path.glob("node-*.toml"))` before selecting `node_files[0]`.
In `@src/cloudai/reporter.py`:
- Line 164: Add a module-level logger by creating logger =
logging.getLogger(__name__) after imports, then replace root logging calls that
use f-strings with the module logger and %-style lazy formatting (e.g., change
logging.info(f"Generated scenario report at {report_path}") to
logger.info("Generated scenario report at %s", report_path)); apply the same
change to the other occurrences referenced (the logging calls around the report
generation and related messages).
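The logger pattern being requested looks like this in isolation (`announce_report` and the path are illustrative):

```python
import logging

logger = logging.getLogger(__name__)  # module-level logger instead of the root logger

def announce_report(report_path: str) -> None:
    # %-style arguments defer string formatting until a handler actually emits
    # the record (this is what satisfies Ruff G004/LOG015, unlike logging.info(f"...")).
    logger.info("Generated scenario report at %s", report_path)
```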
- Around line 156-157: The Jinja2 Environment is created without autoescaping
which can lead to XSS when rendering user-provided parameters; update the
jinja_env creation in reporter.py (the jinja_env = jinja2.Environment(...) line
that uses self.templates_dir and template =
jinja_env.get_template("dse-report.jinja2")) to enable autoescaping (e.g., pass
autoescape=jinja2.select_autoescape(['html','xml']) or autoescape=True) and
import jinja2.select_autoescape if needed so HTML/XML templates are escaped by
default before rendering.
- Around line 166-175: report_best_dse_config currently uses
tr.current_iteration (always 0) so only iteration 0 is processed; change the
loop to iterate all iterations for each DSE test run by enumerating iterations
instead of using tr.current_iteration: for each tr in
self.test_scenario.test_runs, iterate over either a known iteration count (e.g.,
for iteration in range(tr.num_iterations) or range(tr.total_iterations)) or
discover existing iteration directories under self.results_root / tr.name (e.g.,
list subfolders matching numeric iteration names) and use that iteration
variable when building tr_root and reading trajectory.csv; update references to
tr.current_iteration accordingly inside report_best_dse_config and ensure it
still skips non-dse jobs.
- Around line 177-181: The current block that reads trajectory.csv and loads the
best step details (using df, trajectory_file, best_step =
df.loc[df["reward"].idxmax()]["step"], best_step_details, tr_root and
CommandGenStrategy.TEST_RUN_DUMP_FILE_NAME then calling
TestRunDetails.model_validate) needs defensive error handling: check that df is
non-empty and that the "reward" column contains at least one non-NaN value
before calling idxmax(), and wrap the file access in a try/except or pre-check
best_step_details.exists() to handle missing files (FileNotFoundError) and log
an informative message via your logger before skipping or re-raising; ensure any
exception around TestRunDetails.model_validate is caught and logged with context
so failures in parsing the dump are visible.
In `@src/cloudai/util/dse-report.jinja2`:
- Around line 410-429: The copy button inside the <summary> is triggering the
parent <details> toggle because the click event bubbles; update the
".js-copy-config" click handler (the anonymous async function attached in
document.querySelectorAll(...).forEach) to accept the event parameter and call
event.preventDefault() and event.stopPropagation() at the top of the handler (or
alternatively move the button outside the <summary>), then keep the existing
clipboard logic so clicking copies without toggling the disclosure; the same
change should be applied to the other handler at the 642-647 region.
- Around line 432-440: Update the click handler for the tab widgets (the code
iterating over ".js-dse-tabs" and handling ".js-dse-tab-button" clicks) to
update ARIA states alongside CSS classes: set aria-selected="true" on the
clicked button and aria-selected="false" on all other buttons, ensure the tab
container has role="tablist" and each button has role="tab" (and a stable id),
and for panels (".js-dse-tab-panel") set role="tabpanel", aria-hidden="false"
for the active panel and aria-hidden="true" for inactive panels and ensure each
panel has aria-labelledby pointing to its tab button id (use dataset.tabTarget /
dataset.tabPanel to match). Also update tabindex on tabs (0 for active, -1 for
inactive) so keyboard focus flows correctly.
- Around line 557-665: Enable Jinja2 autoescaping when creating template
environments in DSEReporter.generate and
StatusReporter.generate_scenario_report: update the Environment construction to
pass autoescape=jinja2.select_autoescape(['html', 'xml']) (e.g.,
Environment(loader=..., autoescape=select_autoescape(['html','xml']))) so all
injected fields like summary.name, summary.gpu_label, summary.best_config_toml,
parameter row names/values, and any other user-controlled strings are
HTML-escaped by default; also remove/avoid any forced |safe usage in templates
(dse-report.jinja2 and general-report.jinja2) so escaped output is used.
In `@tests/test_reporter.py`:
- Around line 632-636: The test currently only checks that iteration "0"
best-config TOMLs exist after calling reporter.generate(); since dse_case_a has
iterations=2, add an assertion to also verify the iteration "1" best-config file
is generated (similar to the existing checks), i.e. assert that
(slurm_system.output_path / dse_case_a.name / "1" /
f"{dse_case_a.name}.toml").exists(); keep the same pattern used for dse_case_b
and the mixed report to ensure multi-iteration DSE cases are covered.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b12d97b5-a0dc-4aa7-a2ea-f11942cd5fbe
📒 Files selected for processing (11)
- pyproject.toml
- src/cloudai/registration.py
- src/cloudai/report_generator/dse_report.py
- src/cloudai/report_generator/util.py
- src/cloudai/reporter.py
- src/cloudai/util/base-report.jinja2
- src/cloudai/util/dse-report.jinja2
- src/cloudai/util/general-report.jinja2
- src/cloudai/util/general-slurm-report.jinja2
- tests/test_init.py
- tests/test_reporter.py
💤 Files with no reviewable changes (1)
- src/cloudai/util/general-slurm-report.jinja2
Actionable comments posted: 3
♻️ Duplicate comments (5)
src/cloudai/reporter.py (3)
170-179: ⚠️ Potential issue | 🟠 Major
Write best-config outputs for every DSE iteration.
`self.test_scenario.test_runs` holds the case definitions, so `tr.current_iteration` is still the default here. Multi-iteration DSE jobs will only inspect `<case>/0/trajectory.csv` and never emit configs for later iterations.
♻️ Minimal fix
```diff
 def report_best_dse_config(self):
     for tr in self.test_scenario.test_runs:
         if not tr.test.is_dse_job:
             continue
-        tr_root = self.results_root / tr.name / f"{tr.current_iteration}"
-        trajectory_file = tr_root / "trajectory.csv"
-        if not trajectory_file.is_file():
-            logging.warning(f"No trajectory file found for {tr.name} at {trajectory_file}")
-            continue
+        for iteration in range(tr.iterations):
+            tr_root = self.results_root / tr.name / str(iteration)
+            trajectory_file = tr_root / "trajectory.csv"
+            if not trajectory_file.is_file():
+                logging.warning(f"No trajectory file found for {tr.name} at {trajectory_file}")
+                continue

-        df = lazy.pd.read_csv(trajectory_file)
-        best_step = df.loc[df["reward"].idxmax()]["step"]
-        best_step_details = tr_root / f"{best_step}" / CommandGenStrategy.TEST_RUN_DUMP_FILE_NAME
-        with best_step_details.open() as f:
-            trd = TestRunDetails.model_validate(toml.load(f))
+            df = lazy.pd.read_csv(trajectory_file)
+            best_step = df.loc[df["reward"].idxmax()]["step"]
+            best_step_details = tr_root / f"{best_step}" / CommandGenStrategy.TEST_RUN_DUMP_FILE_NAME
+            with best_step_details.open() as f:
+                trd = TestRunDetails.model_validate(toml.load(f))

-        best_config_path = tr_root / f"{tr.name}.toml"
-        logging.info(f"Writing best config for {tr.name} to {best_config_path}")
-        with best_config_path.open("w") as f:
-            toml.dump(trd.test_definition.model_dump(), f)
+            best_config_path = tr_root / f"{tr.name}.toml"
+            logging.info(f"Writing best config for {tr.name} to {best_config_path}")
+            with best_config_path.open("w") as f:
+                toml.dump(trd.test_definition.model_dump(), f)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cloudai/reporter.py` around lines 170 - 179, report_best_dse_config only checks tr.current_iteration (default) so multi-iteration DSE jobs inspect only the 0 iteration; change the loop to iterate over all DSE iterations for each test run: for each tr in self.test_scenario.test_runs where tr.test.is_dse_job, enumerate the available iteration IDs (e.g. from tr.iterations if present or by listing directories under self.results_root / tr.name) and for each iteration set tr_root = self.results_root / tr.name / str(iteration) then check trajectory_file = tr_root / "trajectory.csv" and emit the best-config output; update references in report_best_dse_config from tr.current_iteration to the loop variable so configs are written for every iteration.
181-189: ⚠️ Potential issue | 🟠 Major
Skip incomplete trajectories instead of aborting the whole DSE reporter.
`idxmax()` on an empty/all-NaN reward column and `best_step_details.open()` on a missing dump both raise here. A single incomplete iteration will stop `generate()` before the HTML report is written.
🛡️ Minimal hardening
```diff
 df = lazy.pd.read_csv(trajectory_file)
-best_step = df.loc[df["reward"].idxmax()]["step"]
-best_step_details = tr_root / f"{best_step}" / CommandGenStrategy.TEST_RUN_DUMP_FILE_NAME
+if df.empty or not {"step", "reward"} <= set(df.columns) or df["reward"].isna().all():
+    logging.warning("Empty or invalid trajectory for %s at %s", tr.name, trajectory_file)
+    continue
+best_step = int(df.loc[df["reward"].idxmax(), "step"])
+best_step_details = tr_root / str(best_step) / CommandGenStrategy.TEST_RUN_DUMP_FILE_NAME
+if not best_step_details.is_file():
+    logging.warning("Best-step dump not found for %s at %s", tr.name, best_step_details)
+    continue
 with best_step_details.open() as f:
     trd = TestRunDetails.model_validate(toml.load(f))
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cloudai/reporter.py` around lines 181 - 189, The code assumes trajectory_file contains valid rewards and the best-step dump exists, causing generate() to abort on empty/NaN reward columns or missing dump; modify the block that reads df, computes best_step, and opens best_step_details so it first checks that df is non-empty and that df["reward"] has any finite values (skip and log if not), then compute best_step safely (avoid idxmax on empty/NaN), and before calling best_step_details.open() and TestRunDetails.model_validate(toml.load(f)) verify best_step_details.exists() (log and skip this tr iteration if missing) so the reporter continues and still writes the HTML; update related logs (e.g., the logging.info for best_config_path) to reflect skipped iterations.
160-161: ⚠️ Potential issue | 🟠 Major
Enable autoescaping for the DSE HTML environment.
`jinja2.Environment(...)` defaults to `autoescape=False`, so descriptions, parameter values, and TOML content are rendered raw into the page. Because the template filename ends in `.jinja2`, the fix needs `default=True` (or an explicit `.jinja2` enablement), not just `["html", "xml"]`.
🔒 Minimal fix
```diff
-jinja_env = jinja2.Environment(loader=jinja2.FileSystemLoader(self.templates_dir))
+jinja_env = jinja2.Environment(
+    loader=jinja2.FileSystemLoader(self.templates_dir),
+    autoescape=jinja2.select_autoescape(default_for_string=True, default=True),
+)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cloudai/reporter.py` around lines 160 - 161, The Jinja environment is created without autoescaping (jinja2.Environment(loader=self.templates_dir)), causing raw HTML/TOML to be injected; update the jinja2.Environment call that creates jinja_env (and/or its use of jinja2.select_autoescape) to enable autoescaping for .jinja2 templates—either pass autoescape=jinja2.select_autoescape(default=True) or include "jinja2" in the select_autoescape list so templates loaded via templates_dir and referenced by template = jinja_env.get_template("dse-report.jinja2") are autoescaped.
src/cloudai/report_generator/dse_report.py (2)
147-154: ⚠️ Potential issue | 🟠 Major
Treat malformed TOML artifacts as skipped data, not a hard failure.
Missing files are already handled as best-effort, but TOML parse/validation errors in these two loads still bubble up and abort the whole DSE report. Return `None` for the step/iteration instead so one bad artifact does not break every summary.
Also applies to: 282-283
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cloudai/report_generator/dse_report.py` around lines 147 - 154, The TOML parse/validation in _step_elapsed_time currently can raise and abort the report; change it to treat malformed TOML as skipped data by wrapping the toml.load(...) and SlurmJobMetadata.model_validate(...) in a try/except that catches parsing/validation errors (e.g., toml.TomlDecodeError, ValidationError, ValueError, or a broad Exception) and returns None on failure (optionally logging a debug/warn); apply the same defensive try/except-and-return-None pattern to the other similar TOML load/validate call referenced around lines 282-283 so bad artifacts don't crash the DSE report.
188-198: ⚠️ Potential issue | 🟠 Major
Do not mark failed steps as the best configuration.
`TrajectoryStep.is_successful` is computed but ignored in both winner selections, so a failed high-reward run can be highlighted in the chart and serialized as `best_config_toml`. Filter to successful steps first, and keep `DSEReporter.report_best_dse_config()` on the same rule so the HTML and emitted TOMLs stay aligned.
🐛 Minimal fix
```diff
 def _build_reward_chart_data(steps: list[TrajectoryStep]) -> dict[str, Any] | None:
     if not steps:
         return None
-    best_index = max(range(len(steps)), key=lambda idx: steps[idx].reward)
+    successful_indexes = [idx for idx, step in enumerate(steps) if step.is_successful]
+    if not successful_indexes:
+        return None
+    best_index = max(successful_indexes, key=lambda idx: steps[idx].reward)
     return {
         "labels": [step.step for step in steps],
         "rewards": [step.reward for step in steps],
         "observations": [step.observation_text for step in steps],
         "best_index": best_index,
@@
-    best_step = max(trajectory_steps, key=lambda step: step.reward)
+    successful_steps = [step for step in trajectory_steps if step.is_successful]
+    if not successful_steps:
+        return None
+    best_step = max(successful_steps, key=lambda step: step.reward)
```
Also applies to: 276-318
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cloudai/report_generator/dse_report.py` around lines 188 - 198, The reward chart builder currently picks the max reward regardless of success; change _build_reward_chart_data to first filter steps by TrajectoryStep.is_successful and operate on that filtered list (return None if no successful steps), and make the same change in DSEReporter.report_best_dse_config (use the same successful-only selection logic when choosing best_index and best_config_toml) so the HTML chart and emitted TOML remain aligned; reference the _build_reward_chart_data function and the DSEReporter.report_best_dse_config method when making the edits.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cloudai/report_generator/dse_report.py`:
- Line 21: Replace the direct logging.warning(f"...") calls with a module-level
logger using lazy %-style formatting: add a module logger (e.g., logger =
logging.getLogger(__name__)) near the top (after the existing import logging)
and update every logging.warning(f"...") occurrence in this file (the calls
around lines 21, ~227-232, and ~279) to use logger.warning("... %s ...", value)
or logger.warning("literal %d", int_val) so formatting is deferred and Ruff
G004/LOG015 is satisfied.
- Around line 137-144: The _normalize_gpu_family function incorrectly matches
short family names inside longer ones (e.g., "B200" in "GB200"), so change the
matching to prefer more specific (longer) family names first: iterate
GPU_HOURLY_COST_USD's keys sorted by descending length (or otherwise order by
specificity) and check if each family is in upper, returning the first match;
keep the early None handling and return None if nothing matches. Ensure you
reference _normalize_gpu_family and GPU_HOURLY_COST_USD when making the change.
In `@src/cloudai/report_generator/util.py`:
- Around line 188-207: The function load_system_metadata currently stops if
run_dir/"metadata" exists but has no node-*.toml or the first node file fails
validation; change the logic so when metadata_path (from run_dir) yields no
node_files or when SlurmSystemMetadata.model_validate raises, the function then
tries results_root/"metadata" (fallback_metadata_path) before returning None.
Concretely: when node_files is empty, set metadata_path to
fallback_metadata_path (if it exists) and re-evaluate node_files; and if
model_validate fails on the first node file from run_dir, catch the exception
and attempt to open/validate the first node-*.toml under fallback_metadata_path
(if present) before returning None. Ensure references are to
load_system_metadata, run_dir, results_root, metadata_path,
fallback_metadata_path, node_files, and SlurmSystemMetadata.model_validate.
---
Duplicate comments:
In `@src/cloudai/report_generator/dse_report.py`:
- Around line 147-154: The TOML parse/validation in _step_elapsed_time currently
can raise and abort the report; change it to treat malformed TOML as skipped
data by wrapping the toml.load(...) and SlurmJobMetadata.model_validate(...) in
a try/except that catches parsing/validation errors (e.g., toml.TomlDecodeError,
ValidationError, ValueError, or a broad Exception) and returns None on failure
(optionally logging a debug/warn); apply the same defensive
try/except-and-return-None pattern to the other similar TOML load/validate call
referenced around lines 282-283 so bad artifacts don't crash the DSE report.
- Around line 188-198: The reward chart builder currently picks the max reward
regardless of success; change _build_reward_chart_data to first filter steps by
TrajectoryStep.is_successful and operate on that filtered list (return None if
no successful steps), and make the same change in
DSEReporter.report_best_dse_config (use the same successful-only selection logic
when choosing best_index and best_config_toml) so the HTML chart and emitted
TOML remain aligned; reference the _build_reward_chart_data function and the
DSEReporter.report_best_dse_config method when making the edits.
In `@src/cloudai/reporter.py`:
- Around line 170-179: report_best_dse_config only checks tr.current_iteration
(default) so multi-iteration DSE jobs inspect only the 0 iteration; change the
loop to iterate over all DSE iterations for each test run: for each tr in
self.test_scenario.test_runs where tr.test.is_dse_job, enumerate the available
iteration IDs (e.g. from tr.iterations if present or by listing directories
under self.results_root / tr.name) and for each iteration set tr_root =
self.results_root / tr.name / str(iteration) then check trajectory_file =
tr_root / "trajectory.csv" and emit the best-config output; update references in
report_best_dse_config from tr.current_iteration to the loop variable so configs
are written for every iteration.
- Around line 181-189: The code assumes trajectory_file contains valid rewards
and the best-step dump exists, causing generate() to abort on empty/NaN reward
columns or missing dump; modify the block that reads df, computes best_step, and
opens best_step_details so it first checks that df is non-empty and that
df["reward"] has any finite values (skip and log if not), then compute best_step
safely (avoid idxmax on empty/NaN), and before calling best_step_details.open()
and TestRunDetails.model_validate(toml.load(f)) verify
best_step_details.exists() (log and skip this tr iteration if missing) so the
reporter continues and still writes the HTML; update related logs (e.g., the
logging.info for best_config_path) to reflect skipped iterations.
- Around line 160-161: The Jinja environment is created without autoescaping
(jinja2.Environment(loader=self.templates_dir)), causing raw HTML/TOML to be
injected; update the jinja2.Environment call that creates jinja_env (and/or its
use of jinja2.select_autoescape) to enable autoescaping for .jinja2
templates—either pass autoescape=jinja2.select_autoescape(default=True) or
include "jinja2" in the select_autoescape list so templates loaded via
templates_dir and referenced by template =
jinja_env.get_template("dse-report.jinja2") are autoescaped.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f2487e56-c4d4-4388-bb81-d67812611b63
📒 Files selected for processing (3)
- src/cloudai/report_generator/dse_report.py
- src/cloudai/report_generator/util.py
- src/cloudai/reporter.py
Actionable comments posted: 3
♻️ Duplicate comments (2)
src/cloudai/util/dse-report.jinja2 (1)
435-445: ⚠️ Potential issue | 🟡 Minor
Tab ARIA state not synchronized on click.
The click handler toggles CSS classes but doesn't update `aria-selected` attributes. While the initial HTML correctly sets `aria-selected="true"` on the first tab (line 553), switching tabs via JavaScript leaves screen readers with stale state.
♿ Proposed fix to sync ARIA state
```diff
 buttons.forEach((button) => {
   button.addEventListener("click", function () {
     const target = button.dataset.tabTarget;
-    buttons.forEach((candidate) => candidate.classList.toggle("is-active", candidate === button));
+    buttons.forEach((candidate) => {
+      const isActive = candidate === button;
+      candidate.classList.toggle("is-active", isActive);
+      candidate.setAttribute("aria-selected", isActive ? "true" : "false");
+    });
     panels.forEach((panel) => panel.classList.toggle("is-active", panel.dataset.tabPanel === target));
   });
 });
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cloudai/util/dse-report.jinja2` around lines 435 - 445, The click handler for .js-dse-tab-button currently only toggles CSS classes and leaves ARIA state stale; update the event listener inside the document.querySelectorAll(".js-dse-tabs") loop so that when a button is clicked it also sets aria-selected="true" on the active button and aria-selected="false" on all other buttons (use the buttons NodeList) and likewise update panels' accessibility state (e.g., set aria-hidden appropriately on .js-dse-tab-panel elements based on panel.dataset.tabPanel === target) so screen readers reflect the active tab.
src/cloudai/report_generator/dse_report.py (1)
302-303: ⚠️ Potential issue | 🟠 Major
Missing error handling for TOML parsing in `_build_iteration_summary`.
While `_step_elapsed_time` now has error handling (lines 155-159), the `best_step_dump` loading at lines 302-303 can still raise exceptions from `toml.load()` or `TestRunDetails.model_validate()`, aborting the entire summary build.
🛡️ Proposed fix
```diff
 with best_step_dump.open() as f:
-    test_run_details = TestRunDetails.model_validate(toml.load(f))
+    try:
+        test_run_details = TestRunDetails.model_validate(toml.load(f))
+    except Exception as exc:
+        logging.warning(f"Failed to load test run details for {test_case.name} at {best_step_dump}: {exc}")
+        return None
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cloudai/report_generator/dse_report.py` around lines 302 - 303, The call that loads and validates the best step TOML in _build_iteration_summary can raise exceptions; wrap the block that opens best_step_dump, toml.load(f) and TestRunDetails.model_validate(...) in a try/except, catch broad parsing/validation errors (e.g., Exception), log the error with context (including best_step_dump and the exception) and skip or return a safe default summary for that iteration instead of letting the exception propagate; reference the best_step_dump variable and TestRunDetails.model_validate to find the exact lines to modify.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cloudai/report_generator/dse_report.py`:
- Around line 210-215: The returned "best_index" currently uses best_step.step
(the trajectory step number) which is treated as an array index by the JS chart;
change it to the positional index within the steps list. Replace best_step.step
with the index of best_step in steps (e.g., find the first i where steps[i] is
best_step or where steps[i].step == best_step.step) so "best_index" refers to
the array position used by the chart.
In `@src/cloudai/reporter.py`:
- Around line 184-192: The loop prematurely returns when best_step_details is
missing or TestRunDetails.model_validate fails, which stops processing remaining
test runs; change the early exits to continue so the loop moves on to the next
tr: replace the two return statements in the block that checks
best_step_details.is_file() and the except handling around
TestRunDetails.model_validate(toml.load(f)) with continue, keeping the
logging.warning calls (refer to best_step_details, tr,
TestRunDetails.model_validate and the existing logging.warning lines) so other
DSE test runs are still processed.
In `@src/cloudai/util/dse-report.jinja2`:
- Around line 461-469: The template compares array index (idx) to
chartData.best_index but _build_reward_chart_data in dse_report.py currently
sets best_index to best_step.step (a step number), causing mismatches for
non-contiguous steps; update _build_reward_chart_data so chartData.best_index
holds the array index of the best step (e.g., compute the index where steps[i]
== best_step.step or set best_index to the loop index when you identify the
best) or alternatively include best_step.step as best_step and change the
template comparison to compare step values—modify the function that builds
reward chart data (_build_reward_chart_data / any variables assembling
chartData) so chartData.best_index and the template use the same coordinate
system (index vs step) consistently.
---
Duplicate comments:
In `@src/cloudai/report_generator/dse_report.py`:
- Around line 302-303: The call that loads and validates the best step TOML in
_build_iteration_summary can raise exceptions; wrap the block that opens
best_step_dump, toml.load(f) and TestRunDetails.model_validate(...) in a
try/except, catch broad parsing/validation errors (e.g., Exception), log the
error with context (including best_step_dump and the exception) and skip or
return a safe default summary for that iteration instead of letting the
exception propagate; reference the best_step_dump variable and
TestRunDetails.model_validate to find the exact lines to modify.
In `@src/cloudai/util/dse-report.jinja2`:
- Around line 435-445: The click handler for .js-dse-tab-button currently only
toggles CSS classes and leaves ARIA state stale; update the event listener
inside the document.querySelectorAll(".js-dse-tabs") loop so that when a button
is clicked it also sets aria-selected="true" on the active button and
aria-selected="false" on all other buttons (use the buttons NodeList) and
likewise update panels' accessibility state (e.g., set aria-hidden appropriately
on .js-dse-tab-panel elements based on panel.dataset.tabPanel === target) so
screen readers reflect the active tab.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e2d81fb0-0478-4397-84c8-41bf7010f53d
📒 Files selected for processing (5)
- src/cloudai/_core/registry.py
- src/cloudai/report_generator/dse_report.py
- src/cloudai/reporter.py
- src/cloudai/util/dse-report.jinja2
- tests/test_reporter.py
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_reporter.py`:
- Around line 385-388: Remove the dead helper function _create_non_dse_iteration
from tests/test_reporter.py: locate the standalone function definition
_create_non_dse_iteration(case: TestRun, iteration: int, results_root: Path) and
delete it (and any immediately adjacent blank lines) to keep the test
scaffolding minimal. No replacement code is required unless the helper is
referenced elsewhere; if it is, update those call sites instead.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4ca1d988-6f4e-45a0-bf45-175a9d6465bb
📒 Files selected for processing (2)
src/cloudai/util/dse-report.jinja2
tests/test_reporter.py
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/cloudai/report_generator/dse_report.py`:
- Around line 210-215: The current best_index calculation ("best_index":
best_step.step - 1) is wrong for 0-indexed arrays; replace it with an index
derived from the steps list itself (e.g., use steps.index(best_step) if
best_step is the same object in the list, or use next(i for i, s in
enumerate(steps) if s.step == best_step.step)) so best_index correctly points to
the array position; update the return dict construction that builds
"labels"/"rewards"/"observations" to use this corrected best_index value.
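The fix can be sketched as a small helper; `steps` and `best_step` mirror the TrajectoryStep objects from dse_report.py, and the namedtuple stand-in below is purely illustrative:

```python
from collections import namedtuple


def best_step_index(steps, best_step):
    """Return the array position of the winning step.

    Deriving the index from the list itself keeps it correct regardless of
    whether the step numbering is 0- or 1-based.
    """
    return next(i for i, s in enumerate(steps) if s.step == best_step.step)


# illustrative stand-in for TrajectoryStep
Step = namedtuple("Step", "step reward")
steps = [Step(1, -10.0), Step(2, -5.0), Step(3, -7.0)]
best = max(steps, key=lambda s: s.reward)
best_step_index(steps, best)  # 1: the array slot of the -5.0 step
```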
- Line 336: Guard against test_run_details.test_definition being None before
calling model_dump: in the assignment that builds best_config_toml (currently
calling toml.dumps(test_run_details.test_definition.model_dump())), check
whether test_run_details.test_definition is not None and only call model_dump()
when present; otherwise set best_config_toml to a safe default (e.g., an empty
string or None). Update the code path that constructs best_config_toml in
dse_report.py to reference test_run_details.test_definition and use the
conditional to avoid AttributeError from model_dump().
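A hedged sketch of the guard, with `dumps` standing in for `toml.dumps` and the attribute names taken from the review comment; `dump_best_config` itself is an illustrative name:

```python
def dump_best_config(test_run_details, dumps):
    """Serialize the winning config, tolerating a missing test_definition."""
    definition = getattr(test_run_details, "test_definition", None)
    if definition is None:
        return ""  # safe default instead of AttributeError on None.model_dump()
    return dumps(definition.model_dump())
```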
In `@tests/test_reporter.py`:
- Around line 519-524: The test and implementation disagree on best_index
because the code in src/cloudai/report_generator/dse_report.py computes
best_index as best_step.step - 1 (off-by-one); update the implementation to set
best_index to best_step.step (no subtraction) when building the
reward_chart_data and also update the test in tests/test_reporter.py to expect
"best_index": 1 for the sample rewards [ -10.0, -5.0, -7.0 ] so the chart
highlights step 1 as the best step.
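The expected value is easy to verify independently of the reporter code; with rewards [-10.0, -5.0, -7.0], the highest reward sits at array index 1:

```python
rewards = [-10.0, -5.0, -7.0]
# index of the maximum reward: -5.0 lives at position 1
best_index = max(range(len(rewards)), key=rewards.__getitem__)
assert best_index == 1  # the step the chart should highlight
```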
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: d8e86ed9-968b-4819-b4b9-607bd03deeaf
📒 Files selected for processing (4)
src/cloudai/report_generator/dse_report.py
src/cloudai/reporter.py
src/cloudai/util/dse-report.jinja2
tests/test_reporter.py
♻️ Duplicate comments (1)
src/cloudai/reporter.py (1)
176-176: 🧹 Nitpick | 🔵 Trivial — Use %-style formatting for consistency.
Line 110 uses `logging.info("Generated scenario report at %s", report_path)` with %-style lazy formatting, but line 176 uses an f-string. For consistency within the same file and for lazy-evaluation benefits, consider using %-style here as well.
♻️ Suggested fix
- logging.info(f"Generated scenario report at {report_path}")
+ logging.info("Generated scenario report at %s", report_path)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/cloudai/reporter.py` at line 176, replace the f-string logging call with %-style lazy formatting for consistency and performance: locate the logging.info call that prints "Generated scenario report at {report_path}" and change it to logging.info("Generated scenario report at %s", report_path).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/cloudai/reporter.py`:
- Line 176: Replace the f-string logging call so it uses %-style lazy formatting
for consistency and performance; locate the logging.info call that prints
"Generated scenario report at {report_path}" in src/cloudai/reporter.py and
change it to use logging.info("Generated scenario report at %s", report_path)
instead of the f-string.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 23f83393-1e47-481e-85a8-baa1b31e9754
📒 Files selected for processing (3)
README.md
src/cloudai/reporter.py
tests/test_reporter.py
Summary
Make DSE reporting rich
Test Plan
Additional Notes
nixl_bench_ucx_dse_h100-dse-report.html