
feat(async): track current running coroutine for graph export #81

Open
16bit-ykiko wants to merge 2 commits into main from feat/current-node-tracking

@16bit-ykiko 16bit-ykiko commented Mar 29, 2026

Summary

  • Add a thread_local async_node* that tracks which coroutine is currently executing on each thread
  • Expose async_node::current() and async_node::dump_current_dot() public API so user code inside any coroutine can export the async graph on demand
  • Set the thread-local via initial_tracking_suspend (first entry), save/restore in resume() / resume_and_drain() (entry points), and direct assignment in handle_subtask_result() / deliver_deferred() / propagate_fail() (symmetric transfer)
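
The save/restore part of this scheme can be sketched in isolation. This is a minimal model, not the Eventide implementation: `node`, its `body` callback, and `nested_resume_tracks_correctly()` are hypothetical stand-ins, and a plain function call stands in for resuming a coroutine handle.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for async_node; the real type carries far more state.
struct node {
    const char* name;
    std::function<void()> body;  // stands in for the coroutine body
};

// One pointer per thread: the node whose body is currently executing.
thread_local node* current_running_node = nullptr;

node* current_node() noexcept { return current_running_node; }
void set_current_node(node* n) noexcept { current_running_node = n; }

// Entry-point resume: remember the outer context, run the body, restore it.
void resume(node& n) {
    assert(n.body && "node has no body to run");
    node* prev = current_node();
    set_current_node(&n);
    n.body();
    set_current_node(prev);
}

// Resumes `inner` from inside `outer` and reports whether every level saw
// itself as the current node, with nothing left set once the chain unwinds.
bool nested_resume_tracks_correctly() {
    bool inner_ok = false, outer_before = false, outer_after = false;
    node inner{"inner", {}};
    node outer{"outer", {}};
    inner.body = [&] { inner_ok = (current_node() == &inner); };
    outer.body = [&] {
        outer_before = (current_node() == &outer);
        resume(inner);                             // nested entry point
        outer_after = (current_node() == &outer);  // restored by resume()
    };
    resume(outer);
    return inner_ok && outer_before && outer_after && current_node() == nullptr;
}
```

Because the pointer is `thread_local`, each thread that drives an event loop gets its own view of "the current node" without any locking.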

Test plan

  • All 673 existing unit tests pass locally
  • CI passes on all platforms

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Runtime introspection: read which async task/node is currently executing and export coroutine graphs in DOT format for debugging and visualization.
  • Bug Fixes

    • Improved tracking during task startup, cancellation and exception propagation so debugger/diagnostics report the correct active task context.

…ph export

Add a thread-local async_node* that always points to the coroutine
currently executing on this thread. This allows calling
async_node::current() or async_node::dump_current_dot() from anywhere
inside a coroutine body to export the async graph on demand.

Tracking points:
- initial_tracking_suspend sets the node on first coroutine entry
- async_node::resume() and resume_and_drain() save/restore around resumes
- handle_subtask_result(), deliver_deferred(), propagate_fail() set the
  node before returning a handle for symmetric transfer

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai bot commented Mar 29, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 644ac73e-8f85-432c-88a6-6115e4ebda03

📥 Commits

Reviewing files that changed from the base of the PR and between bc046c6 and 7a64f24.

📒 Files selected for processing (3)
  • include/eventide/async/runtime/frame.h
  • src/async/runtime/frame.cpp
  • src/async/runtime/sync.cpp

📝 Walkthrough

Walkthrough

Added thread-local tracking of the currently executing async_node and plumbing to set/restore it across coroutine resume/propagation. detail::resume_and_drain now accepts a restore target, task initial suspend sets the node on first resume, and APIs allow querying/dumping the current node.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Frame header: include/eventide/async/runtime/frame.h | Changed detail::resume_and_drain signature to accept async_node* restore_to. Added async_node::current() noexcept, async_node::dump_current_dot(), and detail::set_current_node() / detail::current_node() declarations. |
| Task header: include/eventide/async/runtime/task.h | Added initial_tracking_suspend awaitable and changed task_promise_object::initial_suspend to use it; propagate_fail now calls detail::set_current_node(parent_node) before returning parent handle. |
| Frame implementation: src/async/runtime/frame.cpp | Introduced thread-local current_running_node, implemented detail::set_current_node/detail::current_node, async_node::current, and async_node::dump_current_dot. Updated detail::resume_and_drain to accept restore_to and integrated save/restore of the running node around resume/propagation sites. |
| Sync primitive: src/async/runtime/sync.cpp | Capture current node before resuming waiters and call detail::resume_and_drain(prev, next) to restore node context after resume. |

Sequence Diagram(s)

sequenceDiagram
    participant Thread as Thread-Local
    participant Detail as detail API
    participant Node as async_node
    participant Coro as Coroutine

    rect rgba(120,180,240,0.5)
    Note over Thread,Detail: Resume entry with node tracking
    Thread->>Node: save prev = current_node()
    Node->>Detail: set_current_node(this)
    Detail->>Thread: update thread-local
    Node->>Coro: resume()
    Coro->>Detail: query current_node()
    Detail->>Coro: return this
    Coro-->>Node: returns/finishes
    Node->>Thread: detail::set_current_node(prev)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐇 I hopped into threads where coros run,

I mark each node beneath the sun,
Save, resume, then set it back,
A tidy trail upon the track,
Hooray — the runtime knows who's on!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 34.78% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'feat(async): track current running coroutine for graph export' directly and clearly describes the main change: adding thread-local tracking of the currently executing coroutine to enable graph export functionality. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/async/runtime/frame.cpp (1)

66-75: ⚠️ Potential issue | 🔴 Critical

resume_and_drain() now captures the wrong restore point.

This helper assumes current_running_node still holds the outer context on entry, but the new pre-transfer writes in handle_subtask_result() / deliver_deferred() overwrite it before callback paths such as system_op::complete() reach Line 68. The thread can then leave the callback still pointing at the resumed task instead of the real outer context, and if that task finishes in the same chain, drain_pending_destroys() can leave TLS dangling.

One way to make the restore point explicit
-void detail::resume_and_drain(std::coroutine_handle<> handle) {
+void detail::resume_and_drain(async_node* restore_to, std::coroutine_handle<> handle) {
     if(handle) {
-        auto* prev = current_running_node;
         handle.resume();
-        current_running_node = prev;
+        current_running_node = restore_to;
     }
 #if ETD_WORKAROUND_MSVC_COROUTINE_ASAN_UAF
     drain_pending_destroys();
 #endif
 }

The callback entry points then need to capture detail::current_node() before they call handle_subtask_result() or deliver_deferred(), and pass that saved value through.
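
The ordering problem can be reproduced with a small model. This is a sketch under assumptions: `node`, `pick_next_and_mark()`, and the two `complete_*` functions are hypothetical stand-ins, and the actual `handle.resume()` call is elided because only the reads and writes of the thread-local matter here.

```cpp
#include <cassert>

struct node { const char* name; };  // hypothetical stand-in for async_node

thread_local const node* current_running_node = nullptr;

const node* current_node() noexcept { return current_running_node; }
void set_current_node(const node* n) noexcept { current_running_node = n; }

// Stand-in for handle_subtask_result() / deliver_deferred(): chooses the node
// to resume next and writes the thread-local before the handle is resumed.
const node* pick_next_and_mark(const node* next) {
    assert(next != nullptr);
    set_current_node(next);
    return next;
}

// Old shape: the restore point is read inside the helper, after the
// pre-transfer write has already replaced the outer context.
void resume_and_drain_implicit(const node* /*handle*/) {
    const node* prev = current_node();  // already the resumed node
    /* handle.resume() would run here */
    set_current_node(prev);
}

// New shape: the caller passes in the restore point it captured up front.
void resume_and_drain_explicit(const node* restore_to, const node* /*handle*/) {
    /* handle.resume() would run here */
    set_current_node(restore_to);
}

// Callback entry point, for example an I/O completion fired by the event loop.
const node* complete_implicit(const node* waiter) {
    resume_and_drain_implicit(pick_next_and_mark(waiter));
    return current_node();  // what the thread is left pointing at
}

const node* complete_explicit(const node* waiter) {
    const node* prev = current_node();  // capture before any TLS write
    resume_and_drain_explicit(prev, pick_next_and_mark(waiter));
    return current_node();
}
```

Starting from a thread with no current node, `complete_implicit` leaves the thread-local pointing at the waiter, while `complete_explicit` returns it to the outer (null) context.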

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/async/runtime/frame.cpp` around lines 66 - 75, resume_and_drain()
currently saves the restore point from the global TLS current_running_node too
late and can be clobbered by pre-transfer writes in
handle_subtask_result()/deliver_deferred(), causing TLS to remain pointing at
the resumed task; change resume_and_drain to accept an explicit previous node
parameter (e.g., resume_and_drain(detail::node* prev, std::coroutine_handle<>
handle) or similar) and restore that value instead of reading
current_running_node inside the function, and update all call sites (notably
places that invoke resume_and_drain() via system_op::complete and other callback
paths) so they capture detail::current_node() before calling
handle_subtask_result()/deliver_deferred() and pass that saved node through to
resume_and_drain; keep drain_pending_destroys() behavior unchanged.
🧹 Nitpick comments (1)
include/eventide/async/runtime/frame.h (1)

125-131: Make current() read-only.

The new API is for inspection, but async_node* lets callers mutate state, call cancel(), or resume() on the live runtime node. Returning const async_node* keeps the graph-export/debugging use case without widening the mutation surface.

Safer public signature
-    static async_node* current() noexcept;
+    static const async_node* current() noexcept;

dump_dot() is already const, so the export path still works unchanged.
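
A small sketch of what the const-qualified return type buys. The `node` class below is a hypothetical miniature, not the real async_node: the `std::string` return type of `dump_dot()`, the `set_current()` helper, and `export_current()` are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical miniature of async_node: one const inspection method and one
// mutating method. The real class's members and signatures may differ.
class node {
public:
    explicit node(std::string name) : name_(std::move(name)) {}

    std::string dump_dot() const {  // read-only export
        return "digraph { \"" + name_ + "\"; }";
    }

    void cancel() noexcept { cancelled_ = true; }  // mutates runtime state
    bool cancelled() const noexcept { return cancelled_; }

    static const node* current() noexcept { return current_; }
    static void set_current(node* n) noexcept { current_ = n; }

private:
    std::string name_;
    bool cancelled_ = false;
    static thread_local node* current_;
};

thread_local node* node::current_ = nullptr;

// The accessor hands out a pointer-to-const.
static_assert(std::is_same<decltype(node::current()), const node*>::value,
              "current() must be read-only");

// Exports the graph of whichever node is current, or an empty string if none.
std::string export_current() {
    const node* n = node::current();
    assert(n == nullptr || !n->cancelled());
    // n->cancel();  // would not compile: cancel() is a non-const member
    return n ? n->dump_dot() : std::string();
}
```

A caller can still defeat this with `const_cast`, so the const return type documents intent and blocks accidental mutation rather than enforcing it.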

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/eventide/async/runtime/frame.h` around lines 125 - 131, current()
currently returns a mutable async_node* allowing callers to mutate runtime
state; change its signature to return const async_node* noexcept so callers only
get a read-only view (update the declaration of static async_node* current()
noexcept to static const async_node* current() noexcept and ensure any callers
that relied on a mutable pointer are adjusted), leaving dump_current_dot() and
async_node::dump_dot() unchanged since dump_dot() is const.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c1c8aa54-d09a-4fc9-b10a-ab8b2eb18bc3

📥 Commits

Reviewing files that changed from the base of the PR and between 37dc1ed and bc046c6.

📒 Files selected for processing (3)
  • include/eventide/async/runtime/frame.h
  • include/eventide/async/runtime/task.h
  • src/async/runtime/frame.cpp

Comment on lines +68 to +84
// ============================================================================
// initial_tracking_suspend - sets current node when coroutine body first runs
// ============================================================================

struct initial_tracking_suspend {
    async_node* node;

    bool await_ready() const noexcept {
        return false;
    }

    void await_suspend(std::coroutine_handle<>) const noexcept {}

    void await_resume() const noexcept {
        detail::set_current_node(node);
    }
};

⚠️ Potential issue | 🟠 Major

Tracking is lost after co_awaiting pass-through awaitables.

initial_tracking_suspend only seeds TLS on the first resume. Since task_promise_object::await_transform() still forwards arbitrary awaitables unchanged at Lines 367-371, any awaitable that later resumes this coroutine via a raw handle bypasses the new bookkeeping and async_node::current() becomes null/stale in the continued body. Either narrow the contract to Eventide-managed awaitables or route external awaits through a tracked adapter.

- resume_and_drain now takes an explicit restore_to parameter so the
  correct previous node is restored even after handle_subtask_result /
  deliver_deferred overwrite the thread-local
- async_node::current() returns const async_node* to prevent mutation
  of the live runtime node through the diagnostic API

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@16bit-ykiko (Member, Author)

Re: pass-through awaitables losing tracking

This is an intentional limitation. All of Eventide's own awaitables (tasks, sync primitives, IO ops, aggregates) go through the tracked resume paths (handle_subtask_result, deliver_deferred, async_node::resume, resume_and_drain), so async_node::current() is correct for all framework-managed operations.

Third-party awaitables that resume a coroutine via a raw coroutine_handle<> bypass the bookkeeping — this is a known trade-off. Wrapping arbitrary awaitables in await_transform was attempted but conflicts with move-only types like when_all/when_any. Will add a note to the current() doc comment.
