AgReasoning Benchmark

The "AgReasoning Benchmark" paper introduces a large-scale question-answering (QA) benchmark tailored to the agricultural domain.

📄 Paper Summary

Goal: Benchmark LLMs and reasoning models on domain-specific agronomic QA tasks.
Dataset: 55K expert-in-the-loop QA pairs covering diverse agricultural question categories.
Key Contributions:
- A multi-stage flowchart-driven pipeline for dataset curation.
- An evaluation framework using LLM-as-a-Judge.
- A distilled model that matches larger models in performance with higher efficiency.
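
The LLM-as-a-Judge idea listed above can be illustrated with a minimal sketch. This is not the paper's evaluation code: the prompt wording, the `QAPair` type, the `judged_accuracy` helper, and the CORRECT/INCORRECT verdict format are all assumptions made for illustration, and the judge here is a stub that does a substring check so the example runs without a model. In practice `judge_fn` would call a real LLM. The two QA pairs are toy data, not taken from the benchmark.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QAPair:
    question: str
    reference: str  # expert-approved reference answer


def build_judge_prompt(pair: QAPair, candidate: str) -> str:
    # Hypothetical prompt layout; the paper's actual judge prompt is not shown here.
    return (
        "You are grading an agronomy answer.\n"
        f"Question: {pair.question}\n"
        f"Reference: {pair.reference}\n"
        f"Candidate: {candidate}\n"
        "Reply with CORRECT or INCORRECT."
    )


def judged_accuracy(
    pairs: List[QAPair],
    answer_fn: Callable[[str], str],
    judge_fn: Callable[[str], str],
) -> float:
    """Fraction of candidate answers that the judge marks as CORRECT."""
    if not pairs:
        return 0.0
    correct = 0
    for pair in pairs:
        candidate = answer_fn(pair.question)
        verdict = judge_fn(build_judge_prompt(pair, candidate))
        if verdict.strip().upper().startswith("CORRECT"):
            correct += 1
    return correct / len(pairs)


def stub_judge(prompt: str) -> str:
    # Stand-in for a judge model: parses the prompt fields and checks whether
    # the reference string appears in the candidate answer.
    fields = dict(
        line.split(": ", 1) for line in prompt.splitlines() if ": " in line
    )
    ok = fields["Reference"].lower() in fields["Candidate"].lower()
    return "CORRECT" if ok else "INCORRECT"


# Toy data for illustration only (not from the benchmark).
pairs = [
    QAPair("Which nutrient deficiency commonly yellows older leaves first?", "nitrogen"),
    QAPair("What soil pH range suits most field crops?", "6.0 to 7.0"),
]
toy_answers = {
    pairs[0].question: "Nitrogen deficiency is the usual cause.",
    pairs[1].question: "Around 8.5.",
}

accuracy = judged_accuracy(pairs, lambda q: toy_answers[q], stub_judge)
print(f"judged accuracy: {accuracy:.2f}")
```

Passing the model under test and the judge as plain callables keeps the scoring loop independent of any particular model API, so the same loop could score a large model and a distilled one against the same judge.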
