
Add Long-Running Responses API Agent Template #146

Open

david-tempelmann wants to merge 5 commits into databricks:main from david-tempelmann:long-running-agent

Conversation


@david-tempelmann david-tempelmann commented Mar 3, 2026

  • Adds agent-openai-agents-sdk-long-running-agent template for long-running agent queries (minutes instead of seconds).
  • Background mode: Two flows: (1) Background + Poll – POST with background: true returns immediately; client polls GET until completion. (2) Background + Stream – POST with stream: true, background: true returns an SSE stream; if the connection drops, client resumes via GET /responses/{id}?stream=true&starting_after=N to receive remaining events from sequence N+1.
  • Persistence: Lakebase (PostgreSQL) stores stream events so clients can resume or poll results.
  • LongRunningAgentServer: Extends MLflow AgentServer with background mode and retrieve endpoints.
  • Compatible with the Responses API's background mode (except for cancelling a background response).
  • demo_long_running_agent.py script demonstrating how to interact with the agent using the OpenAI Agents SDK. The script uses a short and a long dummy query for demo purposes; the long query runs beyond the 120-second timeout to demonstrate stream resumption.
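The two client flows above can be sketched as follows. This is a minimal illustration, not the template's actual client code: the endpoint paths follow the description above, while the helper names, polling interval, and attempt cap are assumptions.

```python
import time
from typing import Any, Callable


def poll_until_complete(
    fetch: Callable[[], dict[str, Any]],
    interval_s: float = 2.0,
    max_attempts: int = 100,
) -> dict[str, Any]:
    """Flow 1 (Background + Poll): after POST /responses with background: true,
    call GET /responses/{id} until the response leaves the in-progress states."""
    for _ in range(max_attempts):
        response = fetch()  # e.g. GET {base_url}/responses/{response_id}
        if response.get("status") not in ("queued", "in_progress"):
            return response
        time.sleep(interval_s)
    raise TimeoutError("background response did not complete in time")


def resume_stream_url(base_url: str, response_id: str, last_seq: int) -> str:
    """Flow 2 (Background + Stream): if the SSE connection drops, resume with
    the starting_after cursor to receive events from sequence last_seq + 1."""
    return f"{base_url}/responses/{response_id}?stream=true&starting_after={last_seq}"
```

The `fetch` callable stands in for an HTTP GET so the polling flow can be exercised without a live server.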

@david-tempelmann (Author)

@bbqiu This is my PR. The current e2e-chatbot-app-next won't work with this agent and would require some changes. The corresponding client contract is defined in the README.md

@bbqiu bbqiu self-requested a review March 3, 2026 09:46

@bbqiu bbqiu left a comment


this looks great! i'll go over this again tmrw to fix some small things after comments are addressed!



@invoke()
async def invoke(request: ResponsesAgentRequest) -> ResponsesAgentResponse:
Contributor:

small nit to rename to invoke_handler / stream_handler

Contributor:

should be able to just steal this file from the openai agents SDK from main btw

@stream()
async def stream(request: dict) -> AsyncGenerator[ResponsesAgentStreamEvent, None]:
Contributor:

small nit to fix this type hint


def _sse_event(event_type: str, data: dict[str, Any] | str) -> str:
"""Format an SSE event per Open Responses spec: event must match type in body."""
payload = data if isinstance(data, str) else json.dumps(data)
return f"event: {event_type}\ndata: {payload}\n\n"
Contributor:

ooc, how did the frontend client handle this?

Author:

I did not change anything beyond what I initially implemented to make background mode work. It still worked, but I would need to check in detail how the frontend handles these events.

last_output_index: int = -1


def _normalize_stream_event(
Contributor:

ah were these the restrictions we had to get around to make it work with the .stream from the responses client? if so, we can maybe drop these requirements for now, as this seems a tad brittle

needing to remap output_index etc. is quite unfortunate, and it's a bit confusing that the openai-agents sdk doesn't produce output that is compatible w/ the client itself

"""
super()._setup_routes()

# TODO: check because I don't think we need pghost ... just the LAKEBASE_INSTANCE_NAME
Contributor:

as an FYI the frontend template requires pghost for the stateful chats

Author:

Ack. But that requirement should probably not be handled/checked by the agent server? I just simplified the warning message and removed the TODO.

f41fa1d

}

if is_streaming:
asyncio.create_task(
Contributor:

nit: should we have a default timeout that's configurable of 30 min? just so stuff doesn't run forever
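One way to realize this suggestion, sketched with `asyncio.wait_for`; the environment-variable name and the 30-minute default are illustrative, not part of the PR:

```python
import asyncio
import os

# Illustrative knob: default 30 minutes, overridable via an env var.
DEFAULT_TIMEOUT_S = float(os.environ.get("BACKGROUND_TASK_TIMEOUT_S", 30 * 60))


async def run_with_timeout(coro, timeout_s: float = DEFAULT_TIMEOUT_S):
    """Cancel a background task that exceeds the configured timeout."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        # In the server this would be the place to mark the persisted
        # response as failed instead of letting the task run forever.
        raise
```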


#### Implementing with the OpenAI SDK

```mermaid
Contributor:

i think there's a syntax error with this mermaid diagram

