PRAISE: Preference-based Learning with Retrieval Augmented Generation for Conversational Question Answering
This repository contains the code and data for our WWW'25 short paper. Our method, PRAISE (Preference-based Learning with Retrieval Augmented Iterative SEquence generation for ConvQA), is a pipeline architecture consisting of three subtasks: question understanding (QU), evidence retrieval and filtering (ERF), and answer generation (AG). We train an LLM for each of the three subtasks. Since labeled training data for the individual subtasks is unavailable in practice, PRAISE learns from its own generations, using the final answering performance as feedback. More precisely, PRAISE samples generations from an initial model and learns by pairing successful and unsuccessful generations using Direct Preference Optimization (DPO).
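The pairing step described above can be sketched as follows. This is a minimal illustration, not the repository's actual code; the data format, field names, and the simple "correct final answer" success criterion are assumptions made for the example:

```python
# Sketch: constructing DPO preference pairs from sampled generations,
# using final answering performance as feedback (illustrative only).

def build_preference_pairs(samples):
    """Pair each successful generation with each unsuccessful one
    for the same input, yielding (prompt, chosen, rejected) triples."""
    pairs = []
    for prompt, generations in samples.items():
        # split generations by whether they led to a correct final answer
        good = [g for g, correct in generations if correct]
        bad = [g for g, correct in generations if not correct]
        for chosen in good:
            for rejected in bad:
                pairs.append((prompt, chosen, rejected))
    return pairs

# toy example for the QU subtask: two sampled rewrites of a
# conversational question, only one of which leads to a correct answer
samples = {
    "Q: Who directed it?": [
        ("Who directed the film Inception?", True),
        ("Who directed it?", False),
    ]
}
print(build_preference_pairs(samples))
```

Each resulting (prompt, chosen, rejected) triple is the standard input format for DPO training.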
Overview and illustration of PRAISE (preferred/correct outputs are in blue, incorrect/uninformative outputs in red).
For more details see our paper: Preference-based Learning with Retrieval Augmented Generation for Conversational Question Answering.
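For intuition, the per-pair DPO loss rewards the policy for assigning a larger margin to the preferred generation than the reference model does. Below is a toy computation of that loss; the beta value and log-probabilities are made-up numbers, not values from our experiments:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss:
    -log sigmoid(beta * ((logp_c - ref_c) - (logp_r - ref_r)))"""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# toy numbers: the policy separates chosen from rejected more strongly
# than the reference model does, so the loss drops below log 2
print(dpo_loss(-5.0, -9.0, -6.0, -8.0))
```

When policy and reference agree exactly, the margin is zero and the loss equals log 2; it decreases as the policy learns to prefer the successful generation.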
If you use this code, please cite:
@inproceedings{kaiser2025preference,
title={Preference-based Learning with Retrieval Augmented Generation for Conversational Question Answering},
author={Kaiser, Magdalena and Weikum, Gerhard},
booktitle={WWW},
year={2025}
}

We conduct experiments on ConvMix, a ConvQA benchmark over heterogeneous sources.
All code was tested on Linux with Python 3.12.
To install the required libraries, it is recommended to create a virtual environment:
# using pip
python3 -m venv PRAISE_ENV
source PRAISE_ENV/bin/activate
pip install -r requirements.txt
# using conda
conda create --name PRAISE_ENV --file requirements.txt
conda activate PRAISE_ENV
To initialize the repo (download data, benchmark, models), run:
bash initialize.sh
For the retrieval, we make use of EXPLAIGNN and CLOCQ:
- Create a separate virtual environment for EXPLAIGNN:
cd EXPLAIGNN/
conda env create --file conda-explaignn.yml
conda activate explaignn
pip install -e .
- Integrate CLOCQ via the publicly available API, using the client from the repo; the client can be installed via:
make install_clocq
- Initialize EXPLAIGNN by running (inside the EXPLAIGNN directory):
bash scripts/initialize.sh
Run the trained PRAISE pipeline in inference mode:
bash scripts/run_pipeline.sh --inference configs/pipeline_config.yml GPU GPU_NUM
where GPU indicates that PRAISE is run on GPU, and GPU_NUM specifies the type and number of GPUs to use (e.g., A100:2).
For training, run:
bash scripts/run_COMPONENT.sh configs/COMPONENT_train_config.yml GPU GPU_NUM
where COMPONENT can be qu (Question Understanding), erf (Evidence Retrieval and Filtering), or ag (Answer Generation).
