Conversation

@github-actions github-actions bot commented Dec 5, 2025

This PR adds benchmark results for the mistralai/ministral-14b-2512 model.

The following files have been updated:

  • src/benchmark/results.json - Raw benchmark results
  • src/benchmark/validation-results.json - Validation results against human baseline

This PR was automatically generated by the benchmark workflow.

Note: If you don't want to merge this PR, close it and the model will be added to the untested list to prevent re-processing.

@alrocar


Note

Adds benchmark results and validation comparisons for mistralai/ministral-14b-2512, and includes the model in the benchmark config.

  • Benchmark Config:
    • Add mistralai model ministral-14b-2512 to src/benchmark-config.json.
  • Benchmarks:
    • Append extensive raw runs for mistralai/ministral-14b-2512 in src/benchmark/results.json across many SQL tasks (queries, attempts, timings, tokens, errors).
  • Validation:
    • Update src/benchmark/validation-results.json with per-task comparisons for mistralai/ministral-14b-2512, including SQL used, row counts, match metrics, and aggregate stats (model shows no matches).
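The per-task validation described above can be sketched roughly as follows. The actual schema of validation-results.json is not shown in this PR, so the function names and fields here (compare_task, match, match_rate) are assumptions for illustration, not the workflow's real code:

```python
# Hypothetical sketch of the per-task validation: compare a model's SQL
# result rows against the human baseline and aggregate match stats.
# Field names are assumptions; the real validation-results.json schema
# is not shown in the PR.

def compare_task(model_rows, baseline_rows):
    """Compare one task's model result against the human baseline.

    Rows are compared as unordered lists of tuples, so row order
    does not affect the match verdict.
    """
    def normalize(rows):
        return sorted(tuple(r) for r in rows)
    return {
        "model_row_count": len(model_rows),
        "baseline_row_count": len(baseline_rows),
        "match": normalize(model_rows) == normalize(baseline_rows),
    }

def aggregate(per_task):
    """Roll per-task comparisons up into aggregate stats."""
    matched = sum(1 for t in per_task if t["match"])
    return {
        "tasks": len(per_task),
        "matched": matched,
        "match_rate": matched / len(per_task) if per_task else 0.0,
    }

if __name__ == "__main__":
    tasks = [
        compare_task([(1, "a"), (2, "b")], [(2, "b"), (1, "a")]),  # order-insensitive match
        compare_task([(1, "a")], [(1, "a"), (2, "b")]),            # row-count mismatch
    ]
    print(aggregate(tasks))  # → {'tasks': 2, 'matched': 1, 'match_rate': 0.5}
```

A "model shows no matches" outcome, as reported for this model, would correspond to `match_rate == 0.0` across all tasks in this sketch.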

Written by Cursor Bugbot for commit eb7a7df. This will update automatically on new commits.


vercel bot commented Dec 5, 2025

The latest updates on your projects.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| llm-benchmark | Ready | Preview | Comment | Dec 5, 2025 0:20am |
