Changelog

All notable changes to this project will be documented in this file.

[1.0.2] - 10/19/2023

  • The InterCode webpage has been redesigned in a leaderboard style 🏆.
  • If you evaluate on InterCode and would like to put your results on the leaderboard, please create an issue or email John directly 📧.
  • We wrote a standalone report describing the operational InterCode-CTF 🚩 environment, a dataset of 100 task instances, and our initial experiments.
  • 🚨 New Environment! The recently released SWE-bench benchmark introduces software engineering as a task. To support agent-based approaches, we have released the IC-SWE-bench environment, which presents the SWE-bench task in an interactive setting!

✍🏻 John

[1.0.1] - 8/15/2023

I am pleased to announce that, since its initial release, InterCode has been extended to support a number of new languages and datasets. The additions are summarized as follows:

  • New Supported Datasets:
  • Python Support:
    • Interpreter-Style Environment + Dockerfile
    • Single Turn + Try Again results on the Python Environment + MBPP will soon be uploaded to the data/results folder
    • Try it out with python run_demo.py python
  • CTF Environment:
    • ctf_env.py has been rewritten to:
      • Depend on a single Dockerfile for multiple task instances
      • Use the InterCodeEnv abstraction, so that it is implemented in just 30 lines
    • CTF environment has been integrated into the run_demo.py script. Try it out with python run_demo.py ctf
    • The CTF dataset will continue to grow as we source and create more problems.

✍🏻 John

[1.0.0] - 7/11/2023

Introducing the initial release of InterCode, a lightweight, flexible, and easy-to-use framework for designing interactive code environments. Please view the README.md and wiki pages for information on how to build and use InterCode.