(EAI-1237): NL2AtlasSearch prompt optimization + claude code optimization #3

Merged
mongodben merged 51 commits into main from EAI-1237
Oct 10, 2025

Collaborator

@mongodben commented Oct 7, 2025

Jira: https://jira.mongodb.org/browse/EAI-1237

Changes

  • NL2AtlasSearch prompt optimization
    • Added 'maximalist' prompt with thorough guidance
    • Added 'optimized' prompt with guidance for maximum benchmark performance
  • Claude Code repo set up (you can reasonably ignore this in the review)
  • Bug fixes to benchmark CLI

Notes

[
  // maximal prompt
  {
    "name": "nl_to_atlas_search?experimentType=agentic_prompt_maximal_recommendation&model=gpt-5&datasets=simple_english_wikipedia",
    "eXNeON": 0.8574313418984933,
    "NDCG@10": 0.7086369322198238,
    "NonEmptyArrayOutput": 0.891156462585034,
    "SearchOperatorUsed": 0.9319727891156463,
    "SuccessfulExecution": 0.8979591836734694,
    "num_examples": "153",
    "error_rate": 0,
    "num_errors": "5",
    "duration": 250.30080919830422,
    "llm_duration": 25.0676878828614,
    "prompt_tokens": 56558,
    "completion_tokens": 16582,
    "total_tokens": 73140,
    "last_updated": 1760040236675,
    "metadata": {
      "task": "agentic_prompt_maximal_recommendation",
      "model": "gpt-5",
      "dataset": "simple_english_wikipedia"
    },
    "id": "39cc52a9-a5c6-492e-add0-e7aa2a08bcbb",
    "Model": "gpt-5",
    "Task": "agentic_prompt_maximal_recommendation"
  },
  // simple prompt
  {
    "name": "nl_to_atlas_search?experimentType=agentic&model=gpt-5&datasets=simple_english_wikipedia-5c9789a3",
    "eXNeON": 0.8434604329994794,
    "NDCG@10": 0.7275832285965571,
    "NonEmptyArrayOutput": 0.8503401360544217,
    "SearchOperatorUsed": 0.891156462585034,
    "SuccessfulExecution": 0.9047619047619048,
    "null": null,
    "num_examples": "153",
    "error_rate": 0,
    "num_errors": "5",
    "duration": 187.33616447919295,
    "llm_duration": 21.070697993086004,
    "prompt_tokens": 39163,
    "completion_tokens": 11531,
    "total_tokens": 50694,
    "last_updated": 1760039408254,
    "metadata": {
      "task": "agentic",
      "model": "gpt-5",
      "dataset": "simple_english_wikipedia"
    },
    "description": null,
    "id": "fd5350f4-cd3f-4ffc-9958-1bedd804cc77",
    "dataset": null,
    "tags": null,
    "Model": "gpt-5",
    "Task": "agentic"
  },
  // optimized prompt
  {
    "name": "nl_to_atlas_search?experimentType=agentic_prompt_optimized_recommendation&model=gpt-5&datasets=simple_english_wikipedia-a79dff15",
    "eXNeON": 0.8805962495947711,
    "NDCG@10": 0.7475505612929915,
    "NonEmptyArrayOutput": 0.9139072847682119,
    "SearchOperatorUsed": 0.9403973509933775,
    "SuccessfulExecution": 0.9205298013245033,
    "null": null,
    "num_examples": "153",
    "error_rate": 0,
    "num_errors": "1",
    "duration": 169.67061183170267,
    "llm_duration": 18.546299081233357,
    "prompt_tokens": 42477,
    "completion_tokens": 13022,
    "total_tokens": 55499,
    "last_updated": 1760037818891,
    "metadata": {
      "task": "agentic_prompt_optimized_recommendation",
      "model": "gpt-5",
      "dataset": "simple_english_wikipedia"
    },
    "description": null,
    "id": "f26e22c5-bee0-40bf-927f-5b800f7cf665",
    "dataset": null,
    "tags": null,
    "Model": "gpt-5",
    "Task": "agentic_prompt_optimized_recommendation"
  }
]
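
To make the three runs above easier to compare, here is a minimal sketch that computes each recommendation prompt's delta against the simple prompt. The metric values are copied from the results above; the `RunSummary` type and the `deltaVsBaseline` helper are hypothetical names for illustration and are not part of the benchmark CLI.

```typescript
// Hypothetical helper for reading the results above; not part of the benchmark CLI.
interface RunSummary {
  label: string;
  ndcgAt10: number;
  successfulExecution: number;
  totalTokens: number;
}

// Values copied from the three experiment summaries above.
const simple: RunSummary = {
  label: "simple",
  ndcgAt10: 0.7275832285965571,
  successfulExecution: 0.9047619047619048,
  totalTokens: 50694,
};
const maximal: RunSummary = {
  label: "maximal",
  ndcgAt10: 0.7086369322198238,
  successfulExecution: 0.8979591836734694,
  totalTokens: 73140,
};
const optimized: RunSummary = {
  label: "optimized",
  ndcgAt10: 0.7475505612929915,
  successfulExecution: 0.9205298013245033,
  totalTokens: 55499,
};

// Subtract the baseline from a run, metric by metric.
function deltaVsBaseline(run: RunSummary, baseline: RunSummary): RunSummary {
  return {
    label: `${run.label} vs ${baseline.label}`,
    ndcgAt10: run.ndcgAt10 - baseline.ndcgAt10,
    successfulExecution: run.successfulExecution - baseline.successfulExecution,
    totalTokens: run.totalTokens - baseline.totalTokens,
  };
}

const deltas = [maximal, optimized].map((run) => deltaVsBaseline(run, simple));
for (const d of deltas) {
  console.log(
    `${d.label}: NDCG@10 ${d.ndcgAt10.toFixed(4)}, ` +
      `SuccessfulExecution ${d.successfulExecution.toFixed(4)}, ` +
      `total_tokens ${d.totalTokens}`
  );
}
```

Read this way, the optimized prompt gains roughly 0.020 NDCG@10 over the simple prompt for 4,805 more total tokens, while the maximal prompt loses roughly 0.019 NDCG@10 and uses 22,446 more total tokens.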

@mongodben changed the title from "Eai 1237" to "(EAI-1237): NL2AtlasSearch prompt optimization" Oct 7, 2025
@mongodben changed the title from "(EAI-1237): NL2AtlasSearch prompt optimization" to "(EAI-1237): NL2AtlasSearch prompt optimization + claude code optimization" Oct 9, 2025
@mongodben marked this pull request as ready for review October 9, 2025 18:47
Collaborator

@hschawe left a comment

i've got a couple questions before approving

You may use the available tools to help you explore the database, generate the query, think about the problem, and submit the final solution.
const tools = `<tools>

<tool name="${thinkToolName}">
Collaborator

have you noticed any performance loss in tool calling for non-gpt models after removing the tool descriptions here?

Collaborator Author

i haven't, no, but i also haven't specifically looked into that

Collaborator

since the "tool instructions in tool description" guidance came from openai, i'm concerned that this change could cause performance losses in non-gpt models that are unrelated to the "mongodb knowledge" of those models. something to keep in mind when running these benchmarks in the future

Collaborator Author

yea valid point. worth measuring (in the future)

Collaborator Author

like we could be stacking the deck in openai's favor w/ this approach
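
One way the concern in this thread could be measured later is to make the placement of tool instructions a benchmark parameter, so each model is run with both variants. The following is only a sketch under stated assumptions: `InstructionPlacement`, `ToolSpec`, `AgentPromptConfig`, and `buildPromptConfig` are hypothetical names, the closing `</tool>` and `</tools>` tags are assumed, and nothing here reflects the actual agent code in this PR beyond the `<tools>` / `<tool name="...">` markup shown in the diff.

```typescript
// Hypothetical sketch; these types and this function do not exist in the PR.
type InstructionPlacement = "systemPrompt" | "toolDescription";

interface ToolSpec {
  name: string;
  instructions: string;
}

interface AgentPromptConfig {
  // Text to append to the system prompt ("" when instructions live on the tools).
  systemPromptToolsSection: string;
  // Per-tool description passed with the tool definition.
  toolDescriptions: Record<string, string>;
}

function buildPromptConfig(
  tools: ToolSpec[],
  placement: InstructionPlacement
): AgentPromptConfig {
  const toolDescriptions: Record<string, string> = {};
  const sections: string[] = [];

  for (const tool of tools) {
    if (placement === "systemPrompt") {
      // Instructions go in the prompt; the tool itself carries no description.
      sections.push(`<tool name="${tool.name}">\n${tool.instructions}\n</tool>`);
      toolDescriptions[tool.name] = "";
    } else {
      // Instructions travel with the tool definition instead.
      toolDescriptions[tool.name] = tool.instructions;
    }
  }

  const systemPromptToolsSection =
    sections.length > 0 ? `<tools>\n\n${sections.join("\n\n")}\n\n</tools>` : "";

  return { systemPromptToolsSection, toolDescriptions };
}

// Example with a placeholder tool; the instruction text is illustrative only.
const exampleTools: ToolSpec[] = [
  { name: "think", instructions: "Write out intermediate reasoning." },
];
const inPrompt = buildPromptConfig(exampleTools, "systemPrompt");
const inDescription = buildPromptConfig(exampleTools, "toolDescription");
```

Running the same dataset and model with both configurations would separate differences caused by instruction placement from differences in a model's MongoDB knowledge.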

@@ -131,7 +131,6 @@ export async function makeMongoDbMcpAgent({
mcpToolSet[thinkToolName] = thinkTool;
Collaborator

do you think we should exclude this when using reasoning models?

Collaborator Author

no, i think it should be kept (1) for consistency and (2) b/c the reasoning model might still derive value from explicitly writing thoughts out (this was mentioned as useful in a blog post i read)

systemPrompt: atlasSearchAgentPromptWithOptimizedRecommendation,
maxSteps: ATLAS_SEARCH_AGENT_MAX_STEPS,
mongoClient,
mongoDbMcpClient: mcpClient,
Collaborator

nitpick-y but consider renaming mcpClient as mongoDbMcpClient for simplicity

@mongodben merged commit a6bf10a into main on Oct 10, 2025
1 check passed
2 participants