@jgieringer (Collaborator)

Description

Conversation ID support and naming

  • Naming: Renamed the run-level ordinal from conversation_id to conversation_index in the runner and tests. It remains the 1-based index of a conversation within a run; the result dict still uses the key "id" for compatibility.
  • Conversation ID (per-conversation): Added support for a unique, per-conversation identifier used by LLM clients (e.g. for API thread/session tracking):
    • LLMInterface: conversation_id (initially None), create_conversation_id(), and ensure_conversation_id(). Clients set conversation_id from response metadata or a generated id; they call ensure_conversation_id() at the end of generate_response() and use self.conversation_id when calling their API.
    • Runner: The shared agent’s conversation_id is reset to None at the start of each run_single_conversation() so every conversation gets a new provider id.
    • Simulator: No conversation_id handling; it only calls generate_response(conversation_history).
  • Response metadata: Replaced the get_last_response_metadata() API with a last_response_metadata property (backed by _last_response_metadata) so reading it always returns a copy and callers don’t need to remember .copy(). Implementations that mutate metadata in place use _last_response_metadata; the setter still accepts full dicts via self.last_response_metadata = {...}.
  • Docs: Updated docs/evaluating.md with a “Conversation flow and history” section and clarified how to use last_response_metadata, conversation_id, and ensure_conversation_id() when implementing a custom LLM client. Confirmed that generate_structured_response (judge path) does not use or require conversation_id.
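Taken together, the interface pieces described above can be sketched roughly as follows. This is a minimal illustration of the documented behavior, not the actual `LLMInterface` source:

```python
import uuid
from typing import Any, Dict, Optional


class LLMInterface:
    """Sketch of the conversation_id and metadata behavior described above."""

    def __init__(self) -> None:
        # Per-conversation provider/session id; None until first response.
        self.conversation_id: Optional[str] = None
        self._last_response_metadata: Dict[str, Any] = {}

    def create_conversation_id(self) -> str:
        # Fallback for providers that don't return a thread/session id.
        return str(uuid.uuid4())

    def ensure_conversation_id(self) -> str:
        # Keep a provider-assigned id if one was set; otherwise generate one.
        if self.conversation_id is None:
            self.conversation_id = self.create_conversation_id()
        return self.conversation_id

    @property
    def last_response_metadata(self) -> Dict[str, Any]:
        """Metadata from the last generate_response call. Returns a copy."""
        return self._last_response_metadata.copy()

    @last_response_metadata.setter
    def last_response_metadata(self, value: Dict[str, Any]) -> None:
        # Setter still accepts full dicts, per the description above.
        self._last_response_metadata = dict(value)
```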

Copilot AI (Contributor) left a comment


Pull request overview

Adds first-class support for a per-conversation unique identifier (conversation_id) on LLMInterface, while also clarifying/renaming the runner’s run-level ordinal to conversation_index and standardizing response metadata access via a last_response_metadata property.

Changes:

  • Introduces LLMInterface.conversation_id, plus create_conversation_id() / ensure_conversation_id(), and converts response metadata access from get_last_response_metadata() to last_response_metadata.
  • Updates built-in LLM clients + tests to use the new metadata property and to call ensure_conversation_id() after responses/errors.
  • Renames runner/test parameter conversation_id → conversation_index while keeping result key "id" for compatibility; updates docs accordingly.
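The per-client pattern in the second bullet can be sketched like this. `SketchClient`, `_call_api`, and the response shape are hypothetical stand-ins, not the actual OpenAI/Azure/Claude client code:

```python
import uuid
from typing import Any, Dict, List


class SketchClient:
    def __init__(self) -> None:
        self.conversation_id = None
        self._last_response_metadata: Dict[str, Any] = {}

    def ensure_conversation_id(self) -> str:
        if self.conversation_id is None:
            self.conversation_id = str(uuid.uuid4())
        return self.conversation_id

    def _call_api(self, history: List[Dict[str, str]]) -> Dict[str, Any]:
        # Stand-in for the real provider call.
        return {"text": "ok", "id": "prov-123", "usage": {"total_tokens": 7}}

    def generate_response(self, conversation_history: List[Dict[str, str]]) -> str:
        try:
            response = self._call_api(conversation_history)
            # Prefer a provider-assigned id from response metadata.
            self.conversation_id = self.conversation_id or response.get("id")
            self._last_response_metadata = {"usage": response["usage"]}
            return response["text"]
        finally:
            # Runs on success and error paths alike, so an id always exists.
            self.ensure_conversation_id()
```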

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| llm_clients/llm_interface.py | Adds conversation_id + metadata property and helper methods. |
| llm_clients/openai_llm.py | Switches to _last_response_metadata for in-place updates; calls ensure_conversation_id(). |
| llm_clients/azure_llm.py | Same as above for the Azure client, including error paths. |
| llm_clients/claude_llm.py | Same as above for the Claude client. |
| llm_clients/gemini_llm.py | Same as above for the Gemini client. |
| llm_clients/ollama_llm.py | Calls ensure_conversation_id() and removes the old metadata getter. |
| generate_conversations/runner.py | Renames conversation_id → conversation_index; resets agent.conversation_id per conversation. |
| generate_conversations/conversation_simulator.py | Uses the last_response_metadata property for logging metadata. |
| tests/unit/llm_clients/test_* | Updates tests to use last_response_metadata and renames copy-behavior tests. |
| tests/unit/llm_clients/test_llm_interface.py | Adds unit coverage for conversation_id helpers. |
| tests/mocks/mock_llm.py | Removes the old metadata getter; calls ensure_conversation_id(). |
| tests/integration/test_conversation_runner.py | Renames conversation_id → conversation_index in integration tests. |
| docs/evaluating.md | Documents the new metadata property and conversation_id flow. |


Comment on lines +66 to +67

```python
# Reset shared agent's conversation_id so any server-side conversations clear
agent.conversation_id = None
```

Copilot AI Feb 10, 2026


agent is shared across all conversations in run_conversations() and conversations are executed concurrently via asyncio.gather. Resetting agent.conversation_id here introduces a race where parallel conversations will clobber each other’s conversation_id (and any other per-conversation state), which defeats the goal of per-conversation IDs and can break any client that uses conversation_id for server-side threads/sessions. Consider instantiating a separate agent per conversation (move LLMFactory.create_llm(...) into run_single_conversation or clone the agent), or otherwise make conversation_id and metadata state per-simulator/per-task rather than stored on a shared agent instance.

Suggested change

```diff
- # Reset shared agent's conversation_id so any server-side conversations clear
- agent.conversation_id = None
```
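The per-conversation-agent fix the comment suggests can be sketched as below; `create_agent` is a hypothetical factory standing in for `LLMFactory.create_llm(...)`, and the sleep is a placeholder for the simulated conversation:

```python
import asyncio


class Agent:
    # Minimal stand-in for an LLM client carrying per-conversation state.
    def __init__(self) -> None:
        self.conversation_id = None


def create_agent() -> Agent:
    # Hypothetical factory standing in for LLMFactory.create_llm(...).
    return Agent()


async def run_single_conversation(index: int) -> Agent:
    # Each task owns its agent, so concurrent conversations cannot
    # clobber each other's conversation_id.
    agent = create_agent()
    await asyncio.sleep(0)  # placeholder for running the conversation
    agent.conversation_id = f"conv-{index}"
    return agent


async def run_conversations(n: int) -> list:
    # gather preserves task order, matching the 1-based conversation_index.
    return await asyncio.gather(
        *(run_single_conversation(i) for i in range(1, n + 1))
    )
```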

Comment on lines +38 to +41

```python
@property
def last_response_metadata(self) -> Dict[str, Any]:
    """Metadata from the last generate_response call. Returns a copy."""
    return self._last_response_metadata.copy()
```

Copilot AI Feb 10, 2026


last_response_metadata returns self._last_response_metadata.copy(), which is only a shallow copy. Nested values (e.g., the inner usage dict) are still shared, so callers can inadvertently mutate internal state via metadata["usage"][...] = ... despite the docs implying reads are safe. If you want to guarantee callers can’t mutate any part of stored metadata, return a deep copy (e.g., copy.deepcopy) or otherwise freeze/nest-copy known mutable subfields.
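A deep-copy variant along the lines the comment suggests might look like this. It is a sketch of the proposed alternative, not the merged code:

```python
import copy
from typing import Any, Dict


class DeepCopyMetadata:
    def __init__(self) -> None:
        self._last_response_metadata: Dict[str, Any] = {}

    @property
    def last_response_metadata(self) -> Dict[str, Any]:
        # deepcopy also protects nested values such as metadata["usage"],
        # which a shallow .copy() would leave shared with callers.
        return copy.deepcopy(self._last_response_metadata)
```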
