This is the repository for AAAI 2025 Paper: Mind the Uncertainty in Human Disagreement: Evaluating Discrepancies Between Model Predictions and Human Responses in VQA
- Clone the repository:
git clone https://github.com/mainlp/vqa-hud.git cd vqa-hud - Prepare the dataset and base models: Download the dataset VQA 2.0 Follow the LXMERT and BEIT3, and fine-tune the provided checkpoints.
- [βοΈ] script for HUD scores
- [βοΈ] script for Evaluation
- Prepare the dataset and base models: Run:
python HUD_score.py
python split_hud.py --ascending
to get the hud scores and set splits.You can find all the evaluation functions in evaluation.py to implement any customized data evaluations.
@article{Lan_Frassinelli_Plank_2025, title={Mind the Uncertainty in Human Disagreement: Evaluating Discrepancies Between Model Predictions and Human Responses in VQA}, volume={39}, url={https://ojs.aaai.org/index.php/AAAI/article/view/32468}, DOI={10.1609/aaai.v39i4.32468}, number={4}, journal={Proceedings of the AAAI Conference on Artificial Intelligence}, author={Lan, Jian and Frassinelli, Diego and Plank, Barbara}, year={2025}, month={Apr.}, pages={4446-4454} }