Multi-Agent MT is a pipeline-based AI agent framework that progressively refines translations via Translate → Post-edit → Proofread.
- Dec 27, 2025 — Version 1.0 officially released
- Nov 8, 2025 — Our work was accepted and presented at the Conference on Machine Translation (WMT) 2025
- Support for both single-task and multi-task execution modes
- Integration of Rubric-MQM as an automatic post-editing (APE) component
- Fully asynchronous OpenAI API integration
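The asynchronous batching pattern behind the API integration can be illustrated with a minimal sketch. This is an illustrative pattern only, not the project's actual engine code; `call_model`, the echo response, and the concurrency limit are assumptions for the example.

```python
import asyncio

async def call_model(prompt: str, sem: asyncio.Semaphore) -> str:
    # Placeholder for an async OpenAI API call; here we just echo the prompt.
    async with sem:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return f"translation of: {prompt}"

async def run_batch(prompts: list[str], concurrency: int = 8) -> list[str]:
    # Bound concurrency with a semaphore and dispatch all requests at once.
    sem = asyncio.Semaphore(concurrency)
    return await asyncio.gather(*(call_model(p, sem) for p in prompts))

results = asyncio.run(run_batch(["Hello", "World"]))
print(results)
```

A real engine would swap the placeholder for an `AsyncOpenAI` client call; the semaphore keeps the number of in-flight requests bounded regardless of batch size.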
```bash
export OPENAI_API_KEY=sk-xxxx
# or
export OPENAI_API_KEYS=sk-key1,sk-key2
```

This project uses Rubric-MQM (version 2.0) as an automatic post-editing (APE) component. Clone the v2.0 release under the name `rubric_mqm` and add it to `PYTHONPATH`:
```bash
git clone --branch v2.0 https://github.com/trotacodigos/Rubric-MQM.git rubric_mqm
export PYTHONPATH=$PYTHONPATH:/your/full/path/to/rubric_mqm
```

You can run the system in either single-task mode (one agent) or multi-task mode (the full pipeline: translate → postedit → proofread). Model selection and decoding parameters are fully configurable via YAML.
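To confirm the `rubric_mqm` clone is actually visible to Python, a quick stdlib-only check (this helper is not part of the framework):

```python
import importlib.util

def module_available(name: str) -> bool:
    """Return True if `name` can be imported from the current sys.path."""
    return importlib.util.find_spec(name) is not None

print(module_available("rubric_mqm"))  # True once the clone is on PYTHONPATH
```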
🤖 Single-tasker Example (config file ↗)

```yaml
task: proofread
model:
  name: gpt-5
  temperature: 0.7
  max_tokens: 1024
```

🤖🤖🤖 Multi-tasker Example (config file ↗)
```yaml
model:
  translate:
    name: gpt-4.1
    temperature: 0.7
    max_tokens: 1024
  postedit:
    name: gpt-4o
    temperature: 0.7
    max_tokens: 1024
  proofread:
    name: gpt-5
    temperature: 0.7
    max_tokens: 1024
```

- The `target` field is ignored by the Translate agent but required by the Post-edit and Proofread agents, where it serves as the initial hypothesis.
- If you already have a translation, you can skip the Translate agent by setting `skip_translate_if_provided: true` in your multi-task configuration.
  - Skips the translate step and proceeds directly to postedit → proofread
  - Only available in multi-task mode
- During multi-task execution, agents iteratively update the `target` field.
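The skip behavior described above can be sketched as follows. This is an illustrative sketch only: the agent functions are stand-ins, and only the `skip_translate_if_provided` flag and the `target` field come from the documentation above.

```python
def run_pipeline(row: dict, config: dict) -> dict:
    """Run translate → postedit → proofread, updating `target` at each step."""
    # Stand-in agents: each reads the row and returns an updated `target`.
    def translate(r): return {**r, "target": f"T({r['src_text']})"}
    def postedit(r):  return {**r, "target": f"P({r['target']})"}
    def proofread(r): return {**r, "target": f"F({r['target']})"}

    # Skip translation only if the flag is set AND a hypothesis was provided.
    if not (config.get("skip_translate_if_provided") and row.get("target")):
        row = translate(row)
    row = postedit(row)
    return proofread(row)

# With a provided hypothesis and the flag set, translation is skipped:
out = run_pipeline({"src_text": "你好", "target": "hyp"},
                   {"skip_translate_if_provided": True})
print(out["target"])  # F(P(hyp))
```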
Input data must be provided as a CSV file. The required columns for all modes are:
- src_lang
- tgt_lang
- src_text
Example CSV format
| src_lang | tgt_lang | src_text | target | ref_text | domain |
| ... | ... | ... | ... | ... | ... |
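Input files can be checked for the required columns with a short stdlib-only validation (a sketch; the framework may perform this check differently):

```python
import csv
import io

REQUIRED = {"src_lang", "tgt_lang", "src_text"}

def missing_columns(csv_text: str) -> set[str]:
    """Return the set of required columns absent from a CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return REQUIRED - set(reader.fieldnames or [])

sample = "src_lang,tgt_lang,src_text,target\nzh,en,你好,\n"
print(missing_columns(sample))  # set() → all required columns present
```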
🧑🏫 Source: 你永远主动联系不上这个专员,也不知道她的工号,也没有直线联系电话,就是你联系不上她,只有她联系你。
🧑🏫 Reference: Since you don't know the commissioner's job number and there isn't a direct phone number to call, you'll never make the effort to get in touch with her, She is the only one who can reach you, You can't.
🤖 Translate: You never actively contact the commissioner, you never know her job number, you never have a direct telephone line, you never contact her, she only contacts you.
🤖 Postedit: You can never proactively reach this specialist. You don't know her employee ID, nor do you have a direct phone number. It's always that you cannot contact her; only she can contact you.
🤖 Proofread: You can never proactively contact the commissioner, you never know her employee ID, you never have a direct telephone line, you cannot reach her, she only contacts you.
🤖🤖🤖 Multi-agent Translation[1]: You can never proactively reach this commissioner, as you don’t know her employee ID or have a direct phone number; only she contacts you, and you cannot get in touch with her.
[1] This translation differs from those produced by single-agent systems, which generate outputs based on a previously provided translation. In contrast, the multi-agent approach performs the translation process collaboratively from scratch.
```
Multi-AgentMT/
├─ agents/
│  ├─ run.py              # CLI entry point
│  ├─ core/
│  │  ├─ engine.py        # Async batch execution engine
│  │  └─ call_api.py      # Async OpenAI API wrapper
│  ├─ modules/
│  │  ├─ singletasker.py  # Single-task execution
│  │  ├─ multitasker.py   # Multi-task pipeline
│  │  └─ dispatcher/      # Prompt & parameter dispatch
│  ├─ parser/             # Output parsing
│  └─ prompt/             # Prompt templates
│
├─ rubric_mqm/            # (submodule) MQM evaluation toolkit
│  └─ metric/
│
├─ data/
│  └─ sample.csv
│
└─ agents/config/
   ├─ single.yaml
   └─ multi.yaml
```
If you use this framework in your research or projects, please cite it as follows:
```bibtex
@inproceedings{kim-2025-multi,
    title = "Multi-agent{MT}: Deploying {AI} Agent in the {WMT}25 Shared Task",
    author = "Kim, Ahrii",
    editor = "Haddow, Barry and Kocmi, Tom and Koehn, Philipp and Monz, Christof",
    booktitle = "Proceedings of the Tenth Conference on Machine Translation",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.wmt-1.53/",
    doi = "10.18653/v1/2025.wmt-1.53",
    pages = "769--777",
    ISBN = "979-8-89176-341-8",
    abstract = "We present Multi-agentMT, our system for the WMT25 General Shared Task. The model adopts Prompt Chaining, a multi-agent workflow combined with Rubric-MQM, an automatic MQM-based error annotation metric. Our primary submission follows a Translate{--}Postedit{--}Proofread pipeline, in which error positions are explicitly marked and iteratively refined. Results suggest that a semi-autonomous agent scheme for machine translation is feasible with a smaller, earlier-generation model in low-resource settings, achieving comparable quality at roughly half the cost of larger systems."
}
```
```bibtex
@inproceedings{kim-2025-preliminary,
    title = "A Preliminary Study of {AI} Agent Model in Machine Translation",
    author = "Kim, Ahrii",
    editor = "Haddow, Barry and Kocmi, Tom and Koehn, Philipp and Monz, Christof",
    booktitle = "Proceedings of the Tenth Conference on Machine Translation",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.wmt-1.32/",
    doi = "10.18653/v1/2025.wmt-1.32",
    pages = "583--586",
    ISBN = "979-8-89176-341-8",
    abstract = "We present IR{\_}Multi-agentMT, our submission to the WMT25 General Shared Task. The system adopts an AI-agent paradigm implemented through a multi-agent workflow, Prompt Chaining, in combination with RUBRIC-MQM, an automatic MQM-based error annotation metric. Our primary configuration follows the Translate{--}Postedit{--}Proofread paradigm, where each stage progressively enhances translation quality. We conduct a preliminary study to investigate (i) the impact of initial translation quality and (ii) the effect of enforcing explicit responses from the Postedit Agent. Our findings highlight the importance of both factors in shaping the overall performance of multi-agent translation systems."
}
```
