All notable changes to this project will be documented in this file.
- The InterCode webpage has been redesigned in a leaderboard style.
- If you evaluate on InterCode and would like to put your results on the leaderboard, please create an issue or email John directly 📧.
- We wrote a standalone report describing the operational InterCode-CTF 🚩 environment, a dataset of 100 task instances, and our initial experiments.
- 🚨 New Environment! The recently released SWE-bench benchmark introduces software engineering as a task. To support agent-based approaches, we have released the IC-SWE-bench environment, which presents the SWE-bench task in an interactive setting!
✍🏻 John
I am pleased to announce that, since its initial release, InterCode has been extended to support a number of new languages and datasets. The changes are summarized as follows:
- New Supported Datasets:
- Python Support:
  - Interpreter-Style Environment + Dockerfile
  - Single Turn + Try Again results on Python Environment + MBPP will be uploaded soon to the `data/results` folder
  - Try it out with `python run_demo.py python`
- CTF Environment:
  - `ctf_env.py` has been rewritten to:
    - Depend on a single Dockerfile for multiple task instances
    - Use the `InterCodeEnv` abstraction such that it is implemented in just 30 lines
  - The CTF environment has been integrated into the `run_demo.py` script. Try it out with `python run_demo.py ctf`
  - The CTF dataset will continue to grow as we source and create more problems.
✍🏻 John
Introducing the initial release of InterCode, a lightweight, flexible, and easy-to-use framework for designing interactive code environments. Please view the README.md and wiki pages for information on how to build and use InterCode.