feat(gradio): surface live generation progress across LM, diffusion, and decode #954
1larity wants to merge 12 commits into ace-step:main
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 Walkthrough
Threaded a step/chunk-level progress callback through the generation stack.
Changes
Sequence Diagram(s): mermaid diagram (collapsed)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
acestep/models/sft/modeling_acestep_v15_base.py (1)
1831-1867: 🛠️ Refactor suggestion | 🟠 Major
Add a docstring for the modified `generate_audio` public API. Line 1857 introduces a new callback contract, but the function still lacks a docstring that documents key args/behavior.
As per coding guidelines, "Docstrings are mandatory for all modules, classes, and public functions."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/models/sft/modeling_acestep_v15_base.py` around lines 1831 - 1867, add a comprehensive docstring to the public method generate_audio documenting its purpose, key parameters and types (e.g., text_hidden_states, lyric_hidden_states, src_latents, seed, infer_method, infer_steps, diffusion_guidance_sale, audio_cover_strength, precomputed_lm_hints_25Hz, audio_codes, use_progress_bar, and the new progress_callback callable), return value(s), side effects, default behaviors (cfg_interval_start/end, use_cache, use_adg, shift), and the new callback contract signature Callable[[int, int, str], None], including when it is invoked and what the three args represent; place the docstring immediately under the def generate_audio(...) line and follow the project's docstring style (short summary, parameter list with types and defaults, examples/notes if needed).
acestep/models/base/modeling_acestep_v15_base.py (1)
1831-1867: 🛠️ Refactor suggestion | 🟠 Major
Document the updated `generate_audio` API contract. Line 1857 adds `progress_callback`, but the modified public method still lacks a docstring for parameters and behavior. As per coding guidelines, "Docstrings are mandatory for all modules, classes, and public functions."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/models/base/modeling_acestep_v15_base.py` around lines 1831 - 1867, The public method generate_audio was extended with new parameters (notably progress_callback) but lacks a docstring; add a clear docstring to generate_audio describing purpose, all parameters (types, default values, optionality) including progress_callback: Callable[[int,int,str],None] and use_progress_bar/use_adg/seed/infer_method/infer_steps and returned value(s), side effects (e.g., uses seed, may modify src_latents), expected tensor shapes/dtypes for text_hidden_states/lyric_hidden_states/src_latents/etc., error cases/exceptions, and any threading/process-safety notes; ensure the docstring follows the project's style (numpy or Google) and is placed immediately under the generate_audio def so callers and auto-doc tools pick up the updated API contract.
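For reference, a minimal sketch of what such a docstring could look like on a stub; the parameter subset, defaults, and body here are illustrative assumptions, not the real `generate_audio` signature:

```python
from typing import Callable, List, Optional


def generate_audio(
    infer_steps: int = 8,
    use_progress_bar: bool = True,
    progress_callback: Optional[Callable[[int, int, str], None]] = None,
) -> List[str]:
    """Generate audio latents via the diffusion sampler (illustrative stub).

    Args:
        infer_steps: Number of diffusion sampling steps.
        use_progress_bar: Whether to render a local tqdm-style progress bar.
        progress_callback: Optional callable invoked once per sampling step as
            ``progress_callback(current_step, total_steps, description)``.
            ``current_step`` is 1-based and reaches ``total_steps`` on the
            final step; exceptions raised by the callback propagate.

    Returns:
        A list of per-step descriptions (stands in for the real latents).
    """
    seen = []
    for step in range(1, infer_steps + 1):
        desc = f"diffusion step {step}/{infer_steps}"
        if progress_callback is not None:
            progress_callback(step, infer_steps, desc)
        seen.append(desc)
    return seen
```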
🧹 Nitpick comments (4)
acestep/core/generation/handler/diffusion.py (1)
37-37: Type the new `progress_callback` parameter explicitly. Line 37 adds a new API seam; annotating it helps static checking and call-site clarity.
As per coding guidelines, "Type hints: Add type annotations for new/modified functions when practical."
♻️ Proposed change
```diff
-from typing import Any, Dict, Optional
+from typing import Any, Callable, Dict, Optional
 ...
-    progress_callback=None,
+    progress_callback: Optional[Callable[[int, int, str], None]] = None,
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/core/generation/handler/diffusion.py` at line 37, the new parameter progress_callback should be explicitly typed; update the function signature that declares progress_callback (e.g., def ...(..., progress_callback=None)) to use a typing annotation such as progress_callback: Optional[Callable[..., Any]] = None (or a more specific Callable signature if you know the callback args), and add the necessary imports (Optional, Callable, Any) from typing at the top of the file; ensure the annotation matches how progress_callback is invoked within the function.
acestep/models/turbo/modeling_acestep_v15_turbo.py (1)
1847-1848: Docstring should document the new parameters. The `use_progress_bar` and `progress_callback` parameters were added to `generate_audio` but the docstring (not shown in the changes) may not document them. This is a minor documentation gap for public API clarity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/models/turbo/modeling_acestep_v15_turbo.py` around lines 1847 - 1848, update the generate_audio docstring to document the two new parameters: describe use_progress_bar: bool (controls whether a progress bar is shown) and progress_callback: Optional[Callable[[int, int, str], None]] (called with current_step, total_steps, and a status message for each progress update). Add default values and expected behavior (e.g., progress_callback overrides or complements the progress bar) and include any thread-safety or concurrency notes if relevant so callers of generate_audio understand how to supply and use these parameters.
acestep/core/generation/handler/vae_decode.py (1)
22-22: Consider adding a type annotation for consistency. The `progress_callback` parameter lacks a type annotation, unlike other files in this PR (e.g., `service_generate.py`, `dit_generate.py`). Adding the annotation would improve IDE support and maintain consistency.
♻️ Suggested type annotation
```diff
+from typing import Callable, Optional
+
 class VaeDecodeMixin:
     ...
     def tiled_decode(
         self,
         latents,
         chunk_size: Optional[int] = None,
         overlap: int = 64,
         offload_wav_to_cpu: Optional[bool] = None,
-        progress_callback=None,
+        progress_callback: Optional[Callable[[int, int, str], None]] = None,
     ):
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/core/generation/handler/vae_decode.py` at line 22, the progress_callback parameter in the vae_decode function is missing a type annotation; add a consistent annotation such as progress_callback: Optional[Callable[[float], None]] (and import Optional, Callable from typing at the top if not present) to match other modules like service_generate.py and dit_generate.py; update the function signature in vae_decode.py (the symbol progress_callback) and ensure any references/calls still accept an Optional[Callable[[float], None]].
acestep/models/mlx/dit_generate_test.py (1)
8-8: Consider adding a skip decorator for non-MLX environments. The direct import of `mlx_generate_diffusion` at line 8 will fail in environments where `mlx` is not installed (e.g., non-Apple Silicon machines). Consider wrapping with a conditional skip.
♻️ Suggested skip decorator
```diff
 import sys
 import types
 import unittest
 from unittest.mock import patch

 import numpy as np

-from acestep.models.mlx.dit_generate import mlx_generate_diffusion
+try:
+    from acestep.models.mlx.dit_generate import mlx_generate_diffusion
+    _MLX_AVAILABLE = True
+except ImportError:
+    _MLX_AVAILABLE = False
+    mlx_generate_diffusion = None

+@unittest.skipUnless(_MLX_AVAILABLE, "mlx not available")
 class MlxGenerateDiffusionProgressTests(unittest.TestCase):
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/models/mlx/dit_generate_test.py` at line 8, The test imports mlx_generate_diffusion directly which will fail when the MLX package is absent; update the test to skip in non-MLX environments by using pytest.importorskip or a pytest.mark.skip/skipif decorator: either call pytest.importorskip("mlx") before importing or wrap the import/use of mlx_generate_diffusion (symbol: mlx_generate_diffusion in acestep.models.mlx.dit_generate) in a conditional skip so tests are skipped rather than erroring when mlx is not installed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acestep/core/generation/handler/generate_music_execute.py`:
- Around line 77-86: The progress UI callback can be invoked out-of-order
because the code releases progress_lock before calling progress(); modify
_emit_progress so the progress(clamped, desc=...) call happens while holding
progress_lock (i.e., serialize the UI callback inside the same lock that updates
progress_state["value"]) and still catch/log exceptions from the callback; keep
the clamping logic (max(progress_state["value"], value)), update
progress_state["value"] before calling progress, and ensure the lock is released
only after the callback/exception handling completes to preserve monotonic UI
updates.
In `@acestep/core/generation/handler/service_generate_execute_test.py`:
- Line 82: Replace the lambda assigned to callback with a named function
definition to satisfy Ruff E731: create a function (e.g., progress_callback or
callback) that accepts (current, total, desc) and returns (current, total,
desc), then assign callback to that function (or use it directly) so the inline
lambda on the variable named callback is removed and a proper def is used.
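The fix is mechanical; a named def of the same shape keeps the test behavior while satisfying E731 (names taken from the review text):

```python
def progress_callback(current, total, desc):
    """Named replacement for the flagged lambda; returns its arguments unchanged."""
    return (current, total, desc)


# Assign the named function where the lambda used to be.
callback = progress_callback
```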
In `@acestep/llm_inference.py`:
- Around line 1413-1418: The LM-phase progress bands created via
progress_callback=_make_phase_progress_callback(...) can remain unclosed if
PT/MLX loops exit early; after each successful metadata-generation and
code-generation phase (the calls using _make_phase_progress_callback at the
shown sites), explicitly invoke the phase progress callback with the terminal
fraction to close the band (e.g., call progress(0.3, "LLM metadata generation
complete") after the metadata phase and progress(0.5, "LLM code generation
complete") after the code generation phase). Locate the places that build
progress_callback via _make_phase_progress_callback and, immediately after the
corresponding PT/MLX/ML loop completes successfully, call the progress function
with the final fraction and a short completion message to guarantee the band
closes.
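The band idea can be sketched as follows; `_make_phase_progress_callback` here is a hypothetical reconstruction from the review text, mapping per-step counts into a fixed fraction band, with an explicit terminal call so the band closes even when the loop exits early on EOS:

```python
def _make_phase_progress_callback(progress, start, end):
    """Map (current, total, desc) step progress into the [start, end] fraction band."""
    def _cb(current, total, desc):
        if total > 0:
            fraction = start + (end - start) * (current / total)
            progress(min(fraction, end), desc)
    return _cb


def run_metadata_phase(progress, generate):
    """Run one LM phase, then explicitly close its progress band (illustrative)."""
    callback = _make_phase_progress_callback(progress, 0.1, 0.3)
    result = generate(callback)  # may stop early on EOS, leaving the band open
    progress(0.3, "LLM metadata generation complete")  # guarantee the band closes
    return result
```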
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3fd82acf-ef83-4b87-8176-c13b518b9440
📒 Files selected for processing (22)
acestep/core/generation/handler/diffusion.py
acestep/core/generation/handler/generate_music_decode.py
acestep/core/generation/handler/generate_music_decode_test.py
acestep/core/generation/handler/generate_music_execute.py
acestep/core/generation/handler/generate_music_execute_test.py
acestep/core/generation/handler/mlx_vae_decode_native.py
acestep/core/generation/handler/service_generate.py
acestep/core/generation/handler/service_generate_execute.py
acestep/core/generation/handler/service_generate_execute_test.py
acestep/core/generation/handler/service_generate_test.py
acestep/core/generation/handler/vae_decode.py
acestep/core/generation/handler/vae_decode_chunks.py
acestep/core/generation/handler/vae_decode_mixin_test.py
acestep/core/generation/handler/vae_decode_test_helpers.py
acestep/llm_inference.py
acestep/llm_inference_cfg_fixes_test.py
acestep/models/base/modeling_acestep_v15_base.py
acestep/models/mlx/dit_generate.py
acestep/models/mlx/dit_generate_test.py
acestep/models/sft/modeling_acestep_v15_base.py
acestep/models/turbo/modeling_acestep_v15_turbo.py
acestep/models/turbo/modeling_acestep_v15_turbo_progress_test.py
acestep/core/generation/handler/service_generate_execute_test.py (outdated review comment, resolved)
@CodeRabbit I left the outside-diff docstring comments alone: they are not of significant enough severity to justify expanding the PR needlessly. modeling_acestep_v15_base.py
Tip: For best results, initiate chat on the files or code changes.
(^ᴗ^ )ฅ ✏️ Learnings added
🧠 Learnings used
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
acestep/llm_inference.py (2)
1438-1445: ⚠️ Potential issue | 🟠 Major
Handle the skipped-CoT branch without assuming `user_metadata` exists. This branch also runs when `use_cot_metas=False`. With the default `user_metadata=None`, line 1444 raises before generation starts, and if this path is taken for `infer_type == "dit"` the caller returns with progress still stuck at 0.1.
🐛 Proposed fix
```diff
         else:
             # Use user-provided metadata
             if is_batch:
                 logger.info("Batch Phase 1: Using user-provided metadata (skipping generation)")
             else:
                 logger.info("Phase 1: Using user-provided metadata (skipping generation)")
-            metadata = {k: v for k, v in user_metadata.items() if v is not None}
+            metadata = {k: v for k, v in (user_metadata or {}).items() if v is not None}
+            progress(0.3, "LLM metadata generation complete")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/llm_inference.py` around lines 1438 - 1445, The skipped-CoT branch assumes user_metadata exists which can be None when use_cot_metas is False; update the else branch that logs "Using user-provided metadata (skipping generation)" to safely handle a None user_metadata by treating it as an empty dict (e.g., set metadata = {} when user_metadata is None) so metadata = {k: v for k, v in (user_metadata or {}).items() if v is not None}; ensure this change is applied where the variables is_batch, use_cot_metas, and user_metadata are used so downstream callers (including infer_type == "dit") don't error or leave progress stuck.
1179-1212: ⚠️ Potential issue | 🟠 Major
Close each sequential batch item's progress before starting the next one. `_run_pt_single()` and `_run_mlx_single()` can stop on EOS long before `current == total`. When that happens, the first callback from the next item jumps straight to `index * total + 1`, so the batch bar can skip a large chunk of the previous item and look much farther along than it is.
🧭 Suggested pattern
```diff
-        with self._load_model_context():
+        with self._load_model_context():
+            batch_total = len(formatted_prompt_list)
             for i, formatted_prompt in enumerate(formatted_prompt_list):
                 # Set seed for this item if provided
                 ...
-                item_progress_callback = None
+                item_progress_callback = None
+                item_progress_state = {"current": 0, "total": 0, "desc": ""}
                 if progress_callback is not None:
-                    def _item_progress(current, total, desc, index=i, batch_total=len(formatted_prompt_list)):
+                    def _item_progress(current, total, desc, index=i):
+                        item_progress_state.update(
+                            current=current,
+                            total=total,
+                            desc=desc,
+                        )
                         progress_callback(index * total + current, batch_total * total, desc)
                     item_progress_callback = _item_progress
                 output_text = self._run_pt_single(
                     ...
                     progress_callback=item_progress_callback,
                 )
                 output_texts.append(output_text)
+                if (
+                    progress_callback is not None
+                    and item_progress_state["total"] > 0
+                    and item_progress_state["current"] < item_progress_state["total"]
+                ):
+                    progress_callback(
+                        (i + 1) * item_progress_state["total"],
+                        batch_total * item_progress_state["total"],
+                        item_progress_state["desc"],
+                    )
```
_run_mlx()'s sequential fallback.Also applies to: 4070-4100
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/llm_inference.py` around lines 1179 - 1212, The per-item progress callback can leave the batch bar mid-item if a previous _run_pt_single/_run_mlx_single ended early; before starting each new sequential item call progress_callback(index * total, batch_total * total, desc) to "close" the previous item's progress and then pass a stateful wrapper as item_progress_callback that increments from that base (use the same pattern used elsewhere for _run_mlx()). Update the code around formatted_prompt_list loop where item_progress_callback is created (referencing progress_callback, index=i, batch_total=len(formatted_prompt_list), _run_pt_single, and _run_mlx_single) to first advance the global progress to the start-of-item, then wrap the incoming per-item callbacks so they add the base offset when invoked; apply the same change to the other sequential fallback block (the region referenced around lines 4070-4100).acestep/core/generation/handler/generate_music_execute_test.py (1)
67-255: 🛠️ Refactor suggestion | 🟠 Major
Extract the repeated service invocation into a shared test helper. Lines 67-255 repeat the same long `service_inputs` payload and `_run_generate_music_service_with_progress(...)` call across four cases. The file is already over the 200-LOC cap, so every signature tweak now has several copies to keep in sync. Pull the common setup into a helper/fixture, or move the async estimator case into a sibling test module. As per coding guidelines, "Target module size: optimal <= 150 LOC, hard cap 200 LOC." and "Keep functions focused and short; extract helpers instead of nesting complexity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/core/generation/handler/generate_music_execute_test.py` around lines 67 - 255, Extract the repeated service_inputs dict and the repeated _run_generate_music_service_with_progress(...) invocation into a shared test helper (e.g., create a factory function make_service_inputs() that returns the dict and a helper run_generate_with_defaults(progress, host_overrides=None, infer_method="ode") that calls host._run_generate_music_service_with_progress with the common parameters), update the four tests to call these helpers, and move the _AsyncEstimatorHost-based test to a separate test module if needed to keep the file under the 200-LOC cap; reference _run_generate_music_service_with_progress and _start_diffusion_progress_estimator/_AsyncEstimatorHost when locating code to change.
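In outline, the helper pair could look like this; the payload fields and default values below are placeholders, and only the entrypoint name comes from the review:

```python
def make_service_inputs(**overrides):
    """Build the shared service payload; per-test tweaks go in overrides.

    The keys below are placeholders standing in for the real payload fields.
    """
    inputs = {"infer_method": "ode", "batch_size": 1}
    inputs.update(overrides)
    return inputs


def run_generate_with_defaults(host, progress, **overrides):
    """Invoke the service entrypoint once with the common defaults applied."""
    return host._run_generate_music_service_with_progress(
        make_service_inputs(**overrides), progress
    )
```

Each test then shrinks to a one-line call plus its specific overrides, so a signature change touches one place.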
🧹 Nitpick comments (1)
acestep/llm_inference.py (1)
1277-1323: Document the two progress callback contracts. `generate_with_stop_condition()` now takes a Gradio-style `progress(fraction, desc)` callback, while `generate_from_formatted_prompt()` forwards a `(current, total, desc)` callback. Those are part of the callable contract now, but the modified docstrings still omit both parameters.
📝 Minimal docstring/type-hint update
```diff
 def generate_with_stop_condition(
     self,
     ...
-    progress=None,
+    progress: Optional[Callable[[float, str], None]] = None,
 ) -> Dict[str, Any]:
     """Two-phase LM generation: CoT generation followed by audio codes generation.
     ...
         seeds: Optional list of seeds for batch generation (for reproducibility).
             Only used when batch_size > 1. TODO: not used yet
+        progress: Optional callback invoked as `progress(fraction, description)`.

 def generate_from_formatted_prompt(
     self,
     ...
     stop_at_reasoning: bool = False,
     progress_callback: Optional[Callable[[int, int, str], None]] = None,
 ) -> Tuple[str, str]:
     """
     Generate raw LM text output from a pre-built formatted prompt.
     ...
         constrained_decoding_debug: Whether to enable debug logging for constrained decoding
         stop_at_reasoning: If True, stop generation immediately after </think> tag (no audio codes)
+        progress_callback: Optional callback invoked as
+            `(current_step, total_steps, description)` during generation.
```
As per coding guidelines, "Docstrings are mandatory for all new/modified Python modules, classes, and functions. Docstrings must be concise and include purpose plus key inputs/outputs and raised exceptions when relevant."
Also applies to: 2364-2395
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/llm_inference.py` around lines 1277 - 1323, Update the docstrings for generate_with_stop_condition and generate_from_formatted_prompt to document the two different progress callback contracts: (1) the Gradio-style progress(fraction: float, desc: str) used by generate_with_stop_condition and (2) the (current: int, total: int, desc: str) progress callback forwarded by generate_from_formatted_prompt; include types, meaning of parameters, when each is called, and how callers should implement them (e.g., fraction in [0,1] vs current/total integers), and add these parameters to the function signature docs (inputs) and returned/side-effect behavior in the Returns section for both functions.
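The two contracts compose via a small adapter; a sketch that derives the (current, total, desc) shape from a Gradio-style fraction callback:

```python
def steps_to_fraction(progress):
    """Adapt a `progress(fraction, desc)` callback to the (current, total, desc) shape.

    The fraction is clamped to [0, 1] so a miscounted step cannot overrun the bar.
    """
    def callback(current, total, desc):
        fraction = current / total if total else 0.0
        progress(min(max(fraction, 0.0), 1.0), desc)
    return callback
```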
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acestep/core/generation/handler/generate_music_execute.py`:
- Around line 163-169: The code clears progress_thread immediately after a timed
join, losing the handle if the thread didn’t actually stop; modify the blocks
that set progress_thread = None (the branch guarded by
drained_runtime_progress/saw_runtime_progress and the later analogous branch) to
only set progress_thread to None after confirming the thread has terminated
(e.g., call progress_thread.join(timeout=1.0) then check if not
progress_thread.is_alive() before clearing it, and if it is still alive keep the
reference so final cleanup can join again), ensure stop_event is still set as
before and preserve the progress_thread variable until you can guarantee the
thread is finished.
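A minimal sketch of the join-then-check pattern this comment asks for; the function name and return convention are illustrative:

```python
import threading


def _stop_progress_thread(progress_thread, stop_event):
    """Signal shutdown and clear the handle only once the thread has exited.

    Returns None when the thread is confirmed dead; otherwise returns the
    still-live thread so final cleanup can join it again later.
    """
    stop_event.set()
    if progress_thread is None:
        return None
    progress_thread.join(timeout=1.0)
    if progress_thread.is_alive():
        # Keep the reference; do not lose a thread that has not stopped yet.
        return progress_thread
    return None
```

Callers assign the return value back to `progress_thread`, so the handle is only dropped after termination is confirmed.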
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0fe5cfb8-7bd4-4759-ac6c-809202c128f6
📒 Files selected for processing (7)
acestep/core/generation/handler/generate_music_execute.py
acestep/core/generation/handler/generate_music_execute_test.py
acestep/core/generation/handler/service_generate_execute_test.py
acestep/core/generation/handler/vae_decode_chunks.py
acestep/core/generation/handler/vae_decode_chunks_test.py
acestep/llm_inference.py
acestep/llm_inference_cfg_fixes_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
- acestep/llm_inference_cfg_fixes_test.py
- acestep/core/generation/handler/vae_decode_chunks.py
Addressed the sequential batch-item progress concern in commit 014e99e. _run_pt() and the MLX sequential fallback now explicitly close an item's progress band when generation stops early on EOS, so the next item no longer inherits a partially open bar. I added regression coverage in llm_inference_batch_progress_test.py for both paths.
Actionable comments posted: 1
🧹 Nitpick comments (3)
acestep/core/generation/handler/generate_music_execute.py (1)
42-215: Module length is over the repository hard cap; split or document a follow-up split plan. This file now extends past 200 lines. Please either extract one responsibility slice (for example, runtime progress relay helpers) or add a short PR-note justification with a concrete follow-up split plan.
As per coding guidelines, "PR Readiness: Module LOC policy is met (<=150 target, <=200 hard cap or justified exception)." Based on learnings: "only raise module-size concerns when a file exceeds 200 lines of code (LOC)."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/core/generation/handler/generate_music_execute.py` around lines 42 - 215, the file exceeds the repository hard cap; either split out the runtime progress relay logic or add a PR note with a concrete split plan. Extract the progress-estimation/relay responsibilities (the helper functions and state used by _run_generate_music_service_with_progress) into a new module/class, e.g. move the logic around _start_diffusion_progress_estimator, _drain_runtime_progress_events, the progress_events queue, and _stop_progress_estimator_if_finished into a new helper (ProgressEstimator or progress_relay.py) and have _run_generate_music_service_with_progress call that API; or, if you prefer not to split now, add a short note to this PR's description documenting which slices will be extracted (progress estimator, service wrapper, and timeout handling) and when, to justify the exception for this file size.
acestep/llm_inference.py (2)
1183-1185: Bind the per-item state into these loop callbacks.

Ruff B023 is right to flag both closures: they capture the loop-local `item_progress_state` by reference. The callbacks are synchronous today, but binding `state=item_progress_state` makes each callback self-contained and avoids the late-binding footgun.

♻️ Suggested change

```diff
-def _item_progress(current, total, desc, index=i):
-    item_progress_state.update(current=current, total=total, desc=desc)
+def _item_progress(current, total, desc, index=i, state=item_progress_state):
+    state.update(current=current, total=total, desc=desc)
     progress_callback(index * total + current, batch_total * total, desc)
```

```diff
-def _item_progress(current, total, desc, index=i, batch_total=batch_size):
-    item_progress_state.update(current=current, total=total, desc=desc)
+def _item_progress(
+    current, total, desc, index=i, batch_total=batch_size, state=item_progress_state
+):
+    state.update(current=current, total=total, desc=desc)
     progress_callback(index * total + current, batch_total * total, desc)
```

Also applies to: 4086-4088
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/llm_inference.py` around lines 1183 - 1185, The closure _item_progress captures the loop-local item_progress_state by reference causing late-binding; update the callback signatures to bind the state into the closure (e.g., def _item_progress(current, total, desc, index=i, state=item_progress_state):) and replace references to item_progress_state inside with state, and do the same binding change for the other callbacks noted (lines ~4086-4088) so each callback closes over its own state rather than the loop variable.
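The late-binding footgun behind Ruff B023 is easy to demonstrate in isolation. This toy sketch (generic names, not project code) shows why binding through a default argument pins each closure's value:

```python
def make_callbacks_late():
    """Closures capture the loop variable by reference (late binding)."""
    callbacks = []
    for i in range(3):
        callbacks.append(lambda: i)  # every closure sees the final value of i
    return [cb() for cb in callbacks]


def make_callbacks_bound():
    """Default arguments are evaluated at definition time, pinning the value."""
    callbacks = []
    for i in range(3):
        callbacks.append(lambda index=i: index)  # each closure keeps its own i
    return [cb() for cb in callbacks]
```

The same mechanism is why binding `state=item_progress_state` in the callback signature is safer than relying on the enclosing loop's variable.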
2377-2408: Document the new `progress_callback` contract on the public entrypoint.

`LLMHandler.generate_from_formatted_prompt()` now exposes a new callback, but its docstring still describes the old argument set. Please add the callback shape and how callers should interpret `current`/`total` across backends and batch modes. As per coding guidelines, "Docstrings are mandatory for all new or modified Python modules, classes, and functions. Docstrings must be concise and include purpose plus key inputs/outputs and raised exceptions when relevant."
Verify each finding against the current code and only fix it if needed. In `@acestep/llm_inference.py` around lines 2377 - 2408, The docstring for generate_from_formatted_prompt is missing documentation for the new progress_callback parameter; update the function docstring to describe the callback signature (Callable[[int, int, str], None]), explain what each argument means (current: number of items processed so far, total: total items to process or -1 if unknown, message: short status string), and clarify semantic guarantees across backends/batch modes (callbacks are called with per-request progress where total is the batch size for batched/backends that report totals, use -1 when total is not available, and current increments monotonically from 0 to total), also note when it will be invoked (e.g., at start/end and on each token or item depending on backend) and update the Args section to include this new contract under progress_callback for callers to interpret current/total consistently.
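A minimal sketch of how such a contract might be documented and invoked. The `generate_items` stand-in is hypothetical, not the actual `generate_from_formatted_prompt` signature:

```python
from typing import Callable, List, Optional

# Callback contract: (current, total, desc) -> None
ProgressCallback = Callable[[int, int, str], None]


def generate_items(n_items: int,
                   progress_callback: Optional[ProgressCallback] = None) -> List[str]:
    """Generate n_items placeholder results, reporting per-item progress.

    Args:
        n_items: Number of items to produce.
        progress_callback: Optional ``(current, total, desc)`` hook.
            ``current`` counts completed items and increases monotonically,
            ``total`` is the batch size, and ``desc`` is a short status string.
    """
    results = []
    for i in range(n_items):
        results.append(f"item-{i}")
        if progress_callback is not None:
            progress_callback(i + 1, n_items, "generating")
    return results
```

Documenting the tuple semantics in the Args section is the cheap part; the contract only stays useful if every backend reports `current`/`total` with the same meaning.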
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acestep/llm_inference.py`:
- Around line 1447-1450: The Phase 1 progress band isn't closed when metadata
generation is skipped (e.g., when has_all_metas is true or use_cot_metas is
false), so ensure progress(0.3, "LLM metadata generation complete") is called
for both branches; update the block around the logger.info calls (the else
branch that runs when metadata is skipped) to also call progress(0.3, ...) or
move the progress call outside the if/else so
LLMHandler.generate_with_stop_condition() always emits the Phase 1 completion
update regardless of has_all_metas/use_cot_metas.
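The fix amounts to emitting the phase-completion update on every path. A stand-alone sketch, with a hypothetical `run_metadata_phase` helper and a simplified `progress(fraction, message)` hook:

```python
from typing import Callable


def run_metadata_phase(has_all_metas: bool, use_cot_metas: bool,
                       progress: Callable[[float, str], None]) -> str:
    """Run (or skip) metadata generation, always closing the Phase 1 band."""
    if not has_all_metas and use_cot_metas:
        status = "generated"  # the expensive LM metadata pass would run here
    else:
        status = "skipped"    # metadata already present or disabled
    # Hoisted out of the if/else so the band closes even when the phase
    # is skipped; 0.3 matches the Phase 1 boundary named in the review note.
    progress(0.3, "LLM metadata generation complete")
    return status
```

Placing the emit after the branch, rather than inside one arm, is what keeps the downstream bands aligned.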
---
Nitpick comments:
In `@acestep/core/generation/handler/generate_music_execute.py`:
- Around line 42-215: The file exceeds the repository hard cap; either split out
the runtime progress relay logic or add a PR note with a concrete split plan.
Extract the progress-estimation/relay responsibilities (the helper functions and
state used by _run_generate_music_service_with_progress) into a new module/class
— e.g. move logic around _start_diffusion_progress_estimator,
_drain_runtime_progress_events, progress_events queue and
_stop_progress_estimator_if_finished into a new helper (ProgressEstimator or
progress_relay.py) and have _run_generate_music_service_with_progress call that
API; or, if you prefer not to split now, add a short PR description in this PR
that documents which slices will be extracted (progress estimator, service
wrapper, and timeout handling) and when, and justify the exception for this file
size.
In `@acestep/llm_inference.py`:
- Around line 1183-1185: The closure _item_progress captures the loop-local
item_progress_state by reference causing late-binding; update the callback
signatures to bind the state into the closure (e.g., def _item_progress(current,
total, desc, index=i, state=item_progress_state):) and replace references to
item_progress_state inside with state, and do the same binding change for the
other callbacks noted (lines ~4086-4088) so each callback closes over its own
state rather than the loop variable.
- Around line 2377-2408: The docstring for generate_from_formatted_prompt is
missing documentation for the new progress_callback parameter; update the
function docstring to describe the callback signature (Callable[[int, int, str],
None]), explain what each argument means (current: number of items processed so
far, total: total items to process or -1 if unknown, message: short status
string), and clarify semantic guarantees across backends/batch modes (callbacks
are called with per-request progress where total is the batch size for
batched/backends that report totals, use -1 when total is not available, and
current increments monotonically from 0 to total), also note when it will be
invoked (e.g., at start/end and on each token or item depending on backend) and
update the Args section to include this new contract under progress_callback for
callers to interpret current/total consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d0e67b11-76b5-4fa6-b586-c875f044b508
📒 Files selected for processing (4)
- acestep/core/generation/handler/generate_music_execute.py
- acestep/core/generation/handler/generate_music_execute_thread_test.py
- acestep/llm_inference.py
- acestep/llm_inference_batch_progress_test.py
Actionable comments posted: 1
🧹 Nitpick comments (1)
acestep/core/generation/handler/generate_music_execute.py (1)
20-221: This module is now past the 200 LOC cap.

The added queue/drain/emit orchestration pushes this file to 221 lines. Please either extract the runtime-progress relay into a small helper module or add a concrete follow-up split plan in the PR notes.
As per coding guidelines, "If a module would exceed 200 LOC, split by responsibility before merging, or add a short justification in PR notes with a concrete follow-up split plan."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@acestep/core/generation/handler/generate_music_execute.py` around lines 20 - 221, The file exceeds the 200 LOC cap because the runtime progress relay logic was inlined; extract that responsibility into a small helper (or add a follow-up split plan in PR notes). Move the queue/drain/emit orchestration (the _drain_runtime_progress_events function, the progress_events queue, and the progress emitting/locking logic used by _run_generate_music_service_with_progress) into a new helper module/class (e.g., RuntimeProgressRelay with methods drain(), emit_progress(), start_estimator_stop()/stop()). Update _run_generate_music_service_with_progress to instantiate and use this helper (replace direct uses of _drain_runtime_progress_events, progress_events, and _emit_progress with the helper's API) so the handler file stays under 200 LOC; alternatively, if you choose not to split now, add a concise PR note describing the exact refactor steps and target module/class names for a follow-up split.
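A rough shape for the suggested helper, assuming a `queue.Queue` hand-off between the generation thread and the polling loop. The class and method names are illustrative, not the project's actual API:

```python
import queue
from typing import Callable, Optional, Tuple

ProgressEvent = Tuple[int, int, str]  # (current, total, desc)


class RuntimeProgressRelay:
    """Buffer runtime progress events and replay them on the polling side.

    The generation thread calls put(); the UI polling loop calls drain().
    """

    def __init__(self, emit: Callable[[int, int, str], None]) -> None:
        self._events: "queue.Queue[Optional[ProgressEvent]]" = queue.Queue()
        self._emit = emit
        self._closed = False

    def put(self, current: int, total: int, desc: str) -> None:
        """Enqueue an event unless the relay has been shut down."""
        if not self._closed:
            self._events.put((current, total, desc))

    def drain(self) -> int:
        """Forward all queued events; returns how many were emitted."""
        count = 0
        while True:
            try:
                event = self._events.get_nowait()
            except queue.Empty:
                return count
            if event is None:  # sentinel: relay shut down
                self._closed = True
                return count
            self._emit(*event)
            count += 1

    def close(self) -> None:
        """Stop accepting events and signal the drain loop to finish."""
        self._closed = True
        self._events.put(None)
```

With this extracted, the handler only instantiates the relay and calls `put`/`drain`/`close`, which is the kind of slice that would bring the module back under the cap.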
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@acestep/core/generation/handler/generate_music_execute.py`:
- Around line 76-77: The wrapper can exit while gen_thread is still running and
continues enqueuing into progress_events, so in the wrapper's cleanup/finally
you must disable the runtime relay and prevent further enqueues: when exiting
the function that creates progress_events and spawns gen_thread (references:
progress_events, progress_state, gen_thread, service_generate()), call the
runtime relay shutdown/disable API (or set the relay flag) and push a
sentinel/close signal to progress_events, then refrain from draining further
items (or join/stop gen_thread if possible) so the background generation cannot
keep writing into an orphaned queue; ensure this disable/cleanup is executed in
every exit path (including the timeout branch).
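One way to guarantee the relay is disabled on every exit path is a `finally` that clears an enable flag before the wrapper returns. A self-contained sketch with illustrative names (not the project's wrapper):

```python
import queue
import threading
from typing import Callable


def run_with_progress(work: Callable, emit: Callable[[int, int, str], None],
                      timeout_s: float = 5.0) -> bool:
    """Run work(report) on a thread, relaying progress until it finishes.

    Returns True if the worker completed within the timeout.
    """
    events: "queue.Queue" = queue.Queue()
    enabled = threading.Event()
    enabled.set()

    def report(current: int, total: int, desc: str) -> None:
        if enabled.is_set():  # becomes a no-op once the wrapper has exited
            events.put((current, total, desc))

    worker = threading.Thread(target=work, args=(report,), daemon=True)
    worker.start()
    try:
        worker.join(timeout=timeout_s)
        while True:  # drain whatever arrived before exit
            try:
                emit(*events.get_nowait())
            except queue.Empty:
                break
        return not worker.is_alive()
    finally:
        enabled.clear()  # runs on success, timeout, and errors alike
```

Because the flag is cleared in `finally`, an orphaned worker that outlives the timeout can no longer fill the queue after the wrapper has returned.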
---
Nitpick comments:
In `@acestep/core/generation/handler/generate_music_execute.py`:
- Around line 20-221: The file exceeds the 200 LOC cap because the runtime
progress relay logic was inlined; extract that responsibility into a small
helper (or add a follow-up split plan in PR notes). Move the queue/drain/emit
orchestration (the _drain_runtime_progress_events function, the progress_events
queue, and the progress emitting/locking logic used by
_run_generate_music_service_with_progress) into a new helper module/class (e.g.,
RuntimeProgressRelay with methods drain(), emit_progress(),
start_estimator_stop()/stop()). Update _run_generate_music_service_with_progress
to instantiate and use this helper (replace direct uses of
_drain_runtime_progress_events, progress_events, and _emit_progress with the
helper's API) so the handler file stays under 200 LOC; alternatively, if you
choose not to split now, add a concise PR note describing the exact refactor
steps and target module/class names for a follow-up split.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 278fd4d2-d73b-4c5d-8bdc-323b60b2a664
📒 Files selected for processing (5)
- acestep/core/generation/handler/diffusion.py
- acestep/core/generation/handler/generate_music_execute.py
- acestep/core/generation/handler/service_generate.py
- acestep/core/generation/handler/service_generate_execute.py
- acestep/models/mlx/dit_generate.py
🚧 Files skipped from review as they are similar to previous changes (4)
- acestep/models/mlx/dit_generate.py
- acestep/core/generation/handler/service_generate.py
- acestep/core/generation/handler/diffusion.py
- acestep/core/generation/handler/service_generate_execute.py
Summary
This PR extracts the standalone generation-progress-visibility slice from the original
#808 umbrella work. It provides finer-grained progress tracking of generation: long-running tasks no longer stall the progress bar until they complete, and steps such as token generation and diffusion advance the bar as they run. I needed this to integrate external LM calls into progress reporting for an upcoming step (planning generation during music generation), and it turned out to be a nice quality-of-life improvement for users.

It improves live progress across active non-`vllm` 5Hz LM generation, service-side DiT diffusion, and decode-phase reporting, while preserving the coarse estimator only as a fallback when runtime progress is unavailable.

This PR is intentionally independent of the external-LM work. It is somewhat wider than ideal because useful generation progress crosses the runtime, service, diffusion, and decode layers. To keep it reviewable, broader profile/progress follow-up work from the original PR #808 stream is explicitly deferred to later PRs.
What Changed
- acestep/core/generation/handler/service_generate.py
- acestep/core/generation/handler/service_generate_execute.py
- acestep/core/generation/handler/diffusion.py
- acestep/core/generation/handler/generate_music_execute.py

Non-`vllm` LM paths:

- acestep/llm_inference.py
- acestep/core/generation/handler/generate_music_decode.py
- acestep/core/generation/handler/vae_decode.py
- acestep/core/generation/handler/vae_decode_chunks.py
- acestep/core/generation/handler/mlx_vae_decode_native.py
- acestep/models/base/modeling_acestep_v15_base.py
- acestep/models/sft/modeling_acestep_v15_base.py
- acestep/models/turbo/modeling_acestep_v15_turbo.py
- acestep/models/mlx/dit_generate.py
- acestep/core/generation/handler/service_generate_test.py
- acestep/core/generation/handler/service_generate_execute_test.py
- acestep/core/generation/handler/generate_music_execute_test.py
- acestep/core/generation/handler/generate_music_decode_test.py
- acestep/core/generation/handler/vae_decode_mixin_test.py
- acestep/llm_inference_cfg_fixes_test.py
- acestep/models/mlx/dit_generate_test.py
- acestep/models/turbo/modeling_acestep_v15_turbo_progress_test.py

Behavioral Parity
Validation
Executed:
- acestep.core.generation.handler.service_generate_test
- acestep.core.generation.handler.service_generate_execute_test
- acestep.core.generation.handler.generate_music_execute_test
- acestep.core.generation.handler.generate_music_decode_test
- acestep.core.generation.handler.vae_decode_mixin_test
- acestep.llm_inference_cfg_fixes_test
- acestep.models.mlx.dit_generate_test
- acestep.models.turbo.modeling_acestep_v15_turbo_progress_test

Result: `Ran 31 tests ... OK`

Also ran `python3 -m py_compile` on touched progress/test modules.

Scope / Out of Scope
In scope:
- Non-`vllm` LM generation paths

Out of scope:
- `vllm` token-level progress
- `model.generate()` token progress

Reviewer Focus
`generate_music_execute.py`.

CodeRabbit Scope Guard
Please treat findings as in-scope only when introduced by this progress-visibility patch set.
Summary by CodeRabbit
New Features
Bug Fixes
Tests