Feat/generate multiple answers #62
base: main
Changes from all commits
87a552a
1ce205b
0f6d5ed
e8f4a31
9efa26a
7dfa213
b5dbef9
4e5af81
932750c
@@ -1,9 +1,29 @@
-from typing import Any, Dict, List, Optional, Set
+from typing import Any, Dict, List, Optional, Set, Tuple

from pydantic import BaseModel

from llm_clients import LLMInterface
from utils.conversation_utils import save_conversation_to_file


+class ScoredResponse(BaseModel):
+    """A single response with its probability score."""
+
+    text: str
+    probability: float
+
+
+class ResponseWithScores(BaseModel):
Collaborator
I'd probably have called this
| """Model for multiple responses with confidence scores. | ||||||||||
|
|
||||||||||
| Note: Uses nested Pydantic model instead of List[Tuple[str, float]] | ||||||||||
| because OpenAI's structured output API doesn't support tuple types in | ||||||||||
| JSON schema. Tuples must be converted to objects with named fields. | ||||||||||
| """ | ||||||||||
|
|
||||||||||
| responses: List[ScoredResponse] | ||||||||||
|
|
||||||||||
|
|
||||||||||
| class ConversationSimulator: | ||||||||||
| """Simulates a conversation between two LLM instances.""" | ||||||||||
|
|
||||||||||
|
|
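The docstring note above is the key constraint behind the two nested models: OpenAI's structured-output JSON schema rejects tuple types, so scored responses are modeled as objects with named fields. A minimal sketch of flattening the parsed model back into (text, probability) tuples, assuming the same two Pydantic models; the `to_tuples` helper and example payload are illustrative, not part of this diff:

```python
from typing import List, Tuple

from pydantic import BaseModel


class ScoredResponse(BaseModel):
    """A single candidate response with its probability score."""

    text: str
    probability: float


class ResponseWithScores(BaseModel):
    """Multiple candidate responses, each with a confidence score."""

    responses: List[ScoredResponse]


def to_tuples(parsed: ResponseWithScores) -> List[Tuple[str, float]]:
    """Flatten the schema-friendly objects back into (text, probability) tuples."""
    return [(r.text, r.probability) for r in parsed.responses]


# Example: what a parsed structured-output payload might look like.
parsed = ResponseWithScores(
    responses=[
        ScoredResponse(text="Sure, tell me more.", probability=0.35),
        ScoredResponse(text="Thanks, goodbye!", probability=0.65),
    ]
)
print(to_tuples(parsed))  # [('Sure, tell me more.', 0.35), ('Thanks, goodbye!', 0.65)]
```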
@@ -15,16 +35,6 @@ def __init__(self, persona: LLMInterface, agent: LLMInterface):
        # Define termination signals that indicate persona wants to end the conversation
        self.termination_signals: Set[str] = set()

-        #     "goodbye", "bye", "farewell", "talk to you later",
-        #     "ttyl",
-        #     "end conversation", "conversation over", "that's all",
-        #     "nothing more to discuss",
-        #     "i'm done", "let's end here",
-        #     "conversation complete", "wrapping up", "final thoughts",
-        #     "concluding", "to conclude",
-        #     "in conclusion"
-        # }

    def _should_terminate_conversation(
        self, response: str, speaker: LLMInterface
    ) -> bool:
@@ -44,13 +54,7 @@ def _should_terminate_conversation(
            return True

        # Check for common ending patterns
-        ending_patterns = [
-            # "it was nice",
-            # "pleasure talking",
-            # "great conversation",
-            # "good chat",
-            # "until next time"
-        ]
+        ending_patterns = []

        for pattern in ending_patterns:
            if pattern in response_lower:
@@ -63,6 +67,7 @@ async def start_conversation(
        max_turns: int,
        initial_message: Optional[str] = None,
        max_total_words: Optional[int] = None,
+        multiple_responses: bool = False,
    ) -> List[Dict[str, Any]]:
        """
        Start a conversation between the two LLMs with early stopping support.
@@ -72,7 +77,8 @@ async def start_conversation(
            initial_message: Optional initial message (for the first speaker)
                to start the conversation. By default, first speaker is persona.
            max_total_words: Optional maximum total words across all responses
+            multiple_responses: If True, generate multiple responses with scores
+                and select the highest-scored one. Requires JudgeLLM support.

        Returns:
            List of conversation turns with speaker and message
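The new flag's behavior (generate several candidates, keep the highest-scored one) comes down to an argmax over (text, probability) pairs. A small sketch under that assumption; the candidate data and the `pick_best` helper are illustrative only, not code from this PR:

```python
from typing import List, Tuple


def pick_best(candidates: List[Tuple[str, float]]) -> Tuple[str, float]:
    """Return the (text, probability) pair with the highest score."""
    return max(candidates, key=lambda pair: pair[1])


candidates = [
    ("Sure, let's keep talking.", 0.42),
    ("Thanks for the details, goodbye!", 0.81),
]
best_text, best_score = pick_best(candidates)
print(best_text, best_score)  # "Thanks for the details, goodbye!" 0.81
```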
@@ -90,20 +96,63 @@
            # Record start time for this turn

            # Generate response
-            response = await current_speaker.generate_response(current_message)
+            response: str
+            score: Optional[float]
+            all_responses: Optional[List[Tuple[str, float]]]
Comment on lines +100 to +101

Suggested change:
-            score: Optional[float]
-            all_responses: Optional[List[Tuple[str, float]]]
+            score: Optional[float] = None
+            all_responses: Optional[List[Tuple[str, float]]] = None
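Presumably the point of the suggested `= None` defaults is that bare annotations only declare the names without binding them, so any path that skips assignment (for example, when `multiple_responses` is False) would later read an unbound name. A minimal sketch of that rationale; the branch structure is an assumption about the surrounding code, not the PR's actual flow:

```python
from typing import List, Optional, Tuple

# Bare annotations like "score: Optional[float]" do not bind the name;
# defaulting to None keeps both names defined on every code path.
score: Optional[float] = None
all_responses: Optional[List[Tuple[str, float]]] = None

multiple_responses = False
if multiple_responses:
    all_responses = [("Hello there.", 0.7), ("Goodbye.", 0.3)]
    score = max(prob for _, prob in all_responses)

print(score, all_responses)  # None None when multiple_responses is False
```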
If you forked this to a whole different persona prompt for multiple_responses == True, you could incorporate this in the rest of the prompt file?
I'm confused... are we... not giving the member prompt the conversation history? I assumed yes, but it seems like only the current_message (the most recent response from the other side) gets included?
If we're not giving the member the chat history, and only giving it the most recent provider response... I can see how that would make it harder for it to have realistic conversations...
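If that reading is right and each turn only forwards current_message, one common fix is to accumulate the transcript and render it into the prompt every turn. A hypothetical sketch of that idea; the turn format and the `next_prompt` helper are assumptions, not code from this PR:

```python
from typing import Dict, List

# Accumulated turns, each recorded as {"speaker": ..., "message": ...}.
conversation: List[Dict[str, str]] = []


def next_prompt(latest_message: str) -> str:
    """Render the full history plus the newest message for the member prompt."""
    lines = [f"{turn['speaker']}: {turn['message']}" for turn in conversation]
    lines.append(f"provider: {latest_message}")
    return "\n".join(lines)


conversation.append({"speaker": "member", "message": "Hi, I have a billing question."})
print(next_prompt("Sure, what plan are you on?"))
```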
Copilot AI
Jan 5, 2026
Comment is unclear and imprecise. The phrase 'mostly a text string' is ambiguous. Clarify the intent or remove if the comment doesn't add value.
Suggested change:
-            # response is mostly a text string
+            # Count the number of words in the LLM response
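For what that comment is guarding, here is a rough sketch of the word-count bookkeeping implied by the max_total_words parameter; the variable names and the stop condition are assumptions for illustration, not the PR's actual code:

```python
from typing import Optional

max_total_words: Optional[int] = 200
total_words = 0

response = "Sure, happy to keep talking about that."
total_words += len(response.split())  # Count the words in the LLM response.

if max_total_words is not None and total_words >= max_total_words:
    # Word budget exhausted: stop the conversation early.
    print("Stopping conversation: max_total_words reached.")
```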