This repository contains the code to reproduce the results of the experiments presented in the paper:
Active Task Disambiguation with LLMs, published at ICLR 2025.
The instructions below cover the code generation experiments presented in the paper.
run_human_eval.sh and run_apps.sh contain the commands that should be run to reproduce the experimental results on the HumanEval and APPS benchmarks.
We also provide the generated programs, queries, and their evaluation results obtained with GPT-3.5-turbo and GPT-4o-mini in the results folder. These can be analysed with analyze_code_results.ipynb.
To run code generation with OpenAI models, first insert your OpenAI API key in src/utils.py.
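For example (a minimal sketch; the exact variable name used in src/utils.py may differ):

    # in src/utils.py -- illustrative placeholder; the real variable name may differ
    import os
    OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "sk-...")  # or paste the key directly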
To run code generation with open-source models from the Hugging Face hub, we recommend setting up a local OpenAI-compatible endpoint via vLLM, e.g.
python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3-8B-Instruct --tensor-parallel-size=1 --enforce-eager --disable-custom-all-reduce --host localhost --port 8000
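The server exposes an OpenAI-compatible API, so generations can then be requested with the standard openai Python client (a minimal sketch; the experiment scripts may wire this up differently):

    from openai import OpenAI

    # Point the standard OpenAI client at the local vLLM server started above.
    # vLLM ignores the API key by default; "EMPTY" is a conventional placeholder.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    response = client.chat.completions.create(
        model="meta-llama/Meta-Llama-3-8B-Instruct",
        messages=[{"role": "user", "content": "Write a function that reverses a string."}],
    )
    print(response.choices[0].message.content)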
src/code-generation/active_code_generation.py is the main script for running the experiments.
src/code-generation/reasoners.py implements the two classes of problem-solving agents, Base and Active, which generate binary or open-ended queries.
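For intuition, here is a condensed sketch of the two agent types; the class names, prompts, and selection criterion below are illustrative and are not the repository's actual implementation:

    import math
    from typing import Callable, List

    LLM = Callable[[str], str]  # any prompt -> completion function

    class BaseAgent:
        """Asks the LLM for a single clarifying question, with no query selection."""
        def __init__(self, llm: LLM):
            self.llm = llm

        def generate_query(self, problem: str) -> str:
            return self.llm(f"Ask one yes/no question that clarifies this task:\n{problem}")

    class ActiveAgent(BaseAgent):
        """Samples several candidate questions and keeps the one whose answer
        splits the current set of candidate solutions most evenly
        (a proxy for expected information gain)."""
        def generate_query(self, problem: str, solutions: List[str], n_candidates: int = 5) -> str:
            candidates = [BaseAgent.generate_query(self, problem) for _ in range(n_candidates)]

            def split_entropy(query: str) -> float:
                yes = sum(
                    self.llm(f"Solution:\n{s}\nIs the answer to '{query}' yes for this solution? (yes/no)")
                    .strip().lower().startswith("y")
                    for s in solutions
                )
                p = yes / len(solutions)
                if p in (0.0, 1.0):
                    return 0.0
                return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

            return max(candidates, key=split_entropy)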
src/code-generation/_execution.py contains the code for executing LLM-generated programs in a sandboxed environment.
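As a rough illustration of the idea only (the repository's actual sandbox handles more failure modes), running a generated program in a separate process with a hard timeout can look like:

    import os
    import subprocess
    import sys
    import tempfile

    def run_program(code: str, timeout: float = 5.0) -> str:
        """Run generated code in a child process, killing it after `timeout` seconds."""
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        try:
            result = subprocess.run(
                [sys.executable, path],
                capture_output=True, text=True, timeout=timeout,
            )
            return "passed" if result.returncode == 0 else f"failed: {result.stderr.strip()}"
        except subprocess.TimeoutExpired:
            return "timed out"
        finally:
            os.unlink(path)

A subprocess with a timeout is only the simplest version of this idea; a proper sandbox additionally restricts filesystem access, imports, and memory.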
The code to reproduce the results of the 20 questions game experiments is also available in this repository.
If you find this code useful, please cite our paper:

@inproceedings{kobalczyk2025active,
title={Active Task Disambiguation with {LLM}s},
author={Kasia Kobalczyk and Nicol{\'a}s Astorga and Tennison Liu and Mihaela van der Schaar},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=JAMxRSXLFz}
}