Long-Running Tool FunctionResponse Routing Bypasses Custom Agent Orchestration

## Summary

When a sub-agent within a custom orchestrator agent calls a `LongRunningFunctionTool`, the `FunctionResponse` is routed directly to the sub-agent, bypassing the parent custom agent's `_run_async_impl` method entirely. This prevents custom agents from implementing proper workflow orchestration and continuation logic after long-running tool pauses.

## Our Architecture

We have a custom orchestrator agent (`DomainAgent`) that manages multi-step workflows:

```
DomainAgent (custom orchestrator)
├── Router Agent (LlmAgent)
├── Planner Agent (LlmAgent)
└── Step Executor Agent (LlmAgent) ← calls long-running tool
```

**Flow:**
1. `DomainAgent._run_async_impl()` orchestrates the workflow
2. Calls router → planner → executes multiple steps sequentially
3. Step Executor agent calls `ask_clarifications` (LongRunningFunctionTool)
4. Frontend collects user input and sends `FunctionResponse`
5. **Problem**: Runner routes directly to Step Executor, never re-enters `DomainAgent._run_async_impl()`

## The Issue

In `runners.py`, the `_find_agent_to_run()` method (line 1003-1055) determines which agent should handle the next message:

```python
def _find_agent_to_run(self, session: Session, root_agent: BaseAgent) -> BaseAgent:
    # If the last event is a function response, send to agent that made the function call
    event = find_matching_function_call(session.events)
    if event and event.author:
      return root_agent.find_agent(event.author)  # ← Returns sub-agent directly
    # ... rest of logic
```

**What happens:**
1. Step Executor calls long-running tool → event.author = "step_executor"
2. Frontend sends FunctionResponse via `run_async(invocation_id=..., new_message=FunctionResponse)`
3. `_find_agent_to_run()` finds the function call event with author="step_executor"
4. Returns Step Executor agent directly
5. Runner calls `step_executor.run_async()` 
6. **DomainAgent's orchestration logic is completely bypassed**

## Expected Behavior

The `DomainAgent` should be able to:
1. Detect that a FunctionResponse is being processed
2. Update workflow state
3. Continue orchestration (e.g., move to next workflow step)
4. Properly manage the entire workflow lifecycle


## Questions

1. **Is this the intended behavior?** Should long-running tools always route directly to the calling agent?
2. **Is there a supported way** for custom orchestrator agents to intercept FunctionResponses for sub-agents?
3. **Could ADK provide a hook** (e.g., `can_handle_sub_agent_function_response()`) to let parent agents opt-in to handling their sub-agents' function responses?


## Environment

- ADK Version: [Latest from main]
- Python Version: 3.12
- Agent Type: Custom BaseAgent subclass with LlmAgent sub-agents

## Additional Context

This limitation makes it difficult to implement:
- Multi-step workflows with human-in-the-loop
- Orchestrator agents that manage sub-agent lifecycles
- Complex state machines where parent agents coordinate sub-agent interactions
- Resume logic that requires parent-level context



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Long-Running Tool FunctionResponse Routing Bypasses Custom Agent Orchestration #3986

Summary

Our Architecture

The Issue

Expected Behavior

Questions

Environment

Additional Context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Long-Running Tool FunctionResponse Routing Bypasses Custom Agent Orchestration #3986

Description

Summary

Our Architecture

The Issue

Expected Behavior

Questions

Environment

Additional Context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions