This is the GitHub repository for "AdvisorQA: Towards Helpful and Harmless Advice-seeking Question Answering with Collective Intelligence", accepted at NAACL 2025. The paper discusses the hurdles to progress in subjective QA, mainly in post-processing (alignment).
The AdvisorQA dataset is available at "[data link]". If you download it as JSON files, move them to the 'data' directory before running post-training (SFT, DPO, and PPO).
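Before starting a post-training run, it can help to confirm that the downloaded files actually sit in the 'data' directory and parse as JSON. The following is a minimal sketch, not part of this repository: the helper name `load_json_files` is hypothetical, and it assumes each file is a single JSON document (not JSON Lines). It makes no assumption about the fields inside the dataset.

```python
import json
from pathlib import Path


def load_json_files(data_dir="data"):
    """Return {file name: parsed content} for every *.json file in data_dir.

    Hypothetical helper for a quick sanity check; the directory name 'data'
    follows the README, everything else here is an assumption.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"expected a directory at {root.resolve()}")

    loaded = {}
    # Sort so the files are read in a stable, reproducible order.
    for path in sorted(root.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            loaded[path.name] = json.load(f)  # raises ValueError on malformed JSON
    return loaded
```

Calling `load_json_files()` from the repository root returns a dictionary keyed by file name; an empty dictionary means no `.json` files were found in 'data'.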
Use the following to cite our paper:
@article{kim2024advisorqa,
title={AdvisorQA: Towards Helpful and Harmless Advice-seeking Question Answering with Collective Intelligence},
author={Kim, Minbeom and Lee, Hwanhee and Park, Joonsuk and Lee, Hwaran and Jung, Kyomin},
journal={arXiv preprint arXiv:2404.11826},
year={2024}
}
