Open
Description
Question from @MarinaMancoridis
I was wondering whether you have access to graded or annotated model responses for the dataset (i.e., per-question correctness for specific models such as GPT-4/5). In particular, I'm curious whether question-level performance labels across models are available or were collected during your experiments.
Yes, they are here: https://huggingface.co/datasets/bigcode/evaluation :)