We present UI-Voyager, a novel two-stage self-evolving mobile GUI agent. Our 4B model achieves an 81.0% success rate on the AndroidWorld benchmark, outperforming numerous recent baselines and exceeding human-level performance.
Overview of UI-Voyager performance on AndroidWorld
UI-Voyager consists of two iterative stages:
- Rejection Fine-Tuning (RFT), where a base policy generates multiple trajectories that are filtered by a rule-based verifier to collect high-quality samples for supervised fine-tuning.
- Group Relative Self-Distillation (GRSD), which identifies fork points between successful and failed trajectory groups using SSIM matching and corrects erroneous actions to further refine the policy $\pi_m$ through mixed-data training.
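The fork-point idea behind GRSD can be illustrated with a small sketch. This is not the released implementation. It assumes a trajectory is a list of (screenshot, action) steps, that screenshots are flattened grayscale pixel lists of equal length, and it uses a simplified single-window SSIM rather than the usual sliding-window variant. The names `Step`, `global_ssim`, `find_fork_point`, and the 0.95 threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Step:
    """One trajectory step (hypothetical structure, for illustration only)."""
    screenshot: Sequence[float]  # flattened grayscale pixels in [0, 255]
    action: str                  # serialized action, e.g. "click(a)"


def global_ssim(x: Sequence[float], y: Sequence[float],
                dynamic_range: float = 255.0) -> float:
    """Single-window SSIM over two equally sized images (a simplification)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Standard SSIM stabilizing constants.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    num = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    den = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return num / den


def find_fork_point(success: List[Step], failure: List[Step],
                    threshold: float = 0.95) -> Optional[Tuple[int, int]]:
    """Return (failure_index, success_index) of the first fork point.

    A fork point is taken here to be the first failed step whose best-matching
    successful screenshot is similar enough (SSIM >= threshold) while the
    actions differ. The successful action is then the candidate correction.
    """
    if not success:
        return None
    for i, failed_step in enumerate(failure):
        scores = [global_ssim(failed_step.screenshot, s.screenshot) for s in success]
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold and failed_step.action != success[j].action:
            return i, j
    return None


# Toy 4-pixel "screenshots": both runs agree at step 0 and diverge at step 1.
success = [Step([0, 0, 255, 255], "click(a)"), Step([255, 255, 0, 0], "click(b)")]
failure = [Step([0, 0, 255, 255], "click(a)"), Step([255, 255, 0, 0], "swipe(up)")]
fork = find_fork_point(success, failure)  # (1, 1)
```

In this toy case the corrected training sample would pair the failed run's screenshot at step 1 with the successful action `click(b)`. How the released code groups trajectories and builds the mixed training data is not covered by this sketch.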
- [2026.04.03] We have released the code for fork point detection, which is used to generate training data for GRSD. See more details here.
- [2026.03.26] Paper: Our paper is now available on arXiv.
- [2026.03.26] Model Release: UI-Voyager is released on HuggingFace.
You must have an Android Virtual Device (AVD) available for emulator startup. For AVD creation and emulator setup, you can follow the AndroidWorld installation guide: google-research/android_world, google-deepmind/android_env.
By default, scripts assume:
- AVD_NAME=AndroidWorldAvd
- your emulator binary is at /root/android/emulator/emulator
If your setup differs, override these variables when running run_android_world.sh (see below).
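Before launching, it can help to confirm that the expected AVD actually exists. The sketch below is an assumption-laden helper, not part of the repository: it relies on the Android emulator's `-list-avds` option and treats each non-empty output line as an AVD name. The function names are hypothetical, and the defaults simply mirror the values listed above.

```python
import subprocess
from typing import List

# Defaults taken from the script assumptions listed above.
DEFAULT_AVD_NAME = "AndroidWorldAvd"
DEFAULT_EMULATOR = "/root/android/emulator/emulator"


def parse_avd_listing(listing: str) -> List[str]:
    """Split `emulator -list-avds` output into AVD names, one per non-empty line."""
    return [line.strip() for line in listing.splitlines() if line.strip()]


def list_avds(emulator_path: str = DEFAULT_EMULATOR) -> List[str]:
    """Ask the emulator binary which AVDs it knows about."""
    result = subprocess.run(
        [emulator_path, "-list-avds"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_avd_listing(result.stdout)


def avd_is_available(avd_name: str, avds: List[str]) -> bool:
    """True if the requested AVD name appears in the listing."""
    return avd_name in avds
```

A typical check would be `avd_is_available(DEFAULT_AVD_NAME, list_avds())`. Some emulator versions may print extra informational lines, so treat a negative result as a prompt to inspect the raw output rather than as proof the AVD is missing.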
pip install -r androidworld/requirements.txt
python3 android_env/setup.py install
Download the model from HuggingFace:
huggingface-cli download --resume-download MarsXL/UI-Voyager --local-dir /path/to/ui-voyager
Deploy the model using vLLM:
vllm serve /path/to/ui-voyager \
--served-model-name UI-Voyager \
--host 0.0.0.0 \
--port 8080 \
--tensor-parallel-size 1
The default YAML (androidworld/eval/configs/UI-Voyager.yaml) uses:
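- llm.base_url: http://localhost:8000
- llm.model: UI-Voyager
The port in llm.base_url must match the --port passed to vllm serve. The example command above uses 8080, so change one of the two before running.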
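A quick way to confirm that the endpoint and model name line up is to query the model list of the OpenAI-compatible server. The sketch below assumes the server exposes `GET /v1/models` (as vLLM's OpenAI-compatible server does) and that the configured base URL may or may not already end in `/v1`. The helper names are hypothetical and not part of this repository.

```python
import json
import urllib.request
from typing import Any, Dict, List


def models_url(base_url: str) -> str:
    """Build the model-listing URL, appending /v1 if the base URL lacks it."""
    root = base_url.rstrip("/")
    if not root.endswith("/v1"):
        root += "/v1"
    return root + "/models"


def served_model_ids(payload: Dict[str, Any]) -> List[str]:
    """Extract model ids from an OpenAI-style model list response."""
    return [entry["id"] for entry in payload.get("data", [])]


def fetch_served_models(base_url: str, timeout: float = 5.0) -> List[str]:
    """Query the server and return the model ids it reports."""
    with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
        return served_model_ids(json.load(resp))
```

If `"UI-Voyager" in fetch_served_models(base_url)` is false, or the request fails to connect, the usual causes are a port mismatch or a `--served-model-name` that differs from `llm.model`.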
NUM_WORKERS=4 CONFIG_NAME=UI-Voyager MODEL_NAME=UI-Voyager ./run_android_world.sh
After ./run_android_world.sh returns, it prints the main PID, the main log file path, and the output (artifacts) directory. Use those paths directly.
To stop a running evaluation, prefer passing the log directory of that run (the folder that contains eval.pid):
./stop_android_world.sh /path/to/eval_results/<MODEL_NAME>/logs/<timestamp>
If you call ./stop_android_world.sh with no arguments, it may not resolve the correct logs/<timestamp> folder; in that case stop manually:
kill "$(cat eval_results/<MODEL_NAME>/logs/<timestamp>/eval.pid)"
Default config:
androidworld/eval/configs/UI-Voyager.yaml
Key sections:
- env.*: emulator/ports/ADB paths
- llm.*: OpenAI-compatible endpoint + model name
- agent.*: prompt name, action loop params, history length, SFT output dir
- eval.*: which AndroidWorld task suite to run and the output path
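The dotted names above (for example `llm.base_url`) read naturally as paths into a nested mapping. The sketch below shows one way to resolve such keys once the YAML has been loaded into a dictionary. Only the two `llm` entries come from this README. The other sections are left empty because their keys are not documented here, and `get_dotted` is a hypothetical helper.

```python
from collections.abc import Mapping
from typing import Any

# Only llm.base_url and llm.model are documented in this README;
# the other sections are placeholders with no assumed keys.
example_config = {
    "env": {},
    "llm": {"base_url": "http://localhost:8000", "model": "UI-Voyager"},
    "agent": {},
    "eval": {},
}


def get_dotted(config: Mapping, dotted_key: str) -> Any:
    """Resolve a key such as 'llm.base_url' against a nested mapping."""
    node: Any = config
    for part in dotted_key.split("."):
        if not isinstance(node, Mapping) or part not in node:
            raise KeyError(dotted_key)
        node = node[part]
    return node


model_name = get_dotted(example_config, "llm.model")
```

Raising `KeyError` with the full dotted name makes a misspelled or missing setting easier to trace than a bare lookup failure on the last path component.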
Start parallel evaluation:
NUM_WORKERS=4 CONFIG_NAME=UI-Voyager MODEL_NAME=UI-Voyager ./run_android_world.sh
Directory layout (run_android_world.sh)
| Path | Contents |
|---|---|
| eval_results/<MODEL_NAME>/logs/<TIMESTAMP>/ | All logs and merged summaries for that launch |
| eval_results/<MODEL_NAME>/results/<TIMESTAMP>/ | Runtime config.yaml (script patches sft_data_dir here) and optional SFT rollouts |
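For scripting around a run, the layout above can be captured in a few path helpers. This is a sketch under assumptions: it takes `eval.pid` to live directly in the logs folder and `config.yaml` directly in the results folder, as the text above suggests. `RunPaths`, `run_paths`, and `read_pid` are hypothetical names, and the timestamp string in the example is made up, since the real format is whatever `run_android_world.sh` produces.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class RunPaths:
    """Paths for one launch, following the directory layout table."""
    logs_dir: PurePosixPath
    results_dir: PurePosixPath

    @property
    def pid_file(self) -> PurePosixPath:
        # Assumed location: the logs folder is the one that contains eval.pid.
        return self.logs_dir / "eval.pid"

    @property
    def runtime_config(self) -> PurePosixPath:
        # Assumed location: runtime config.yaml sits in the results folder.
        return self.results_dir / "config.yaml"


def run_paths(model_name: str, timestamp: str, root: str = "eval_results") -> RunPaths:
    """Build the logs and results paths for a given model name and launch."""
    base = PurePosixPath(root) / model_name
    return RunPaths(base / "logs" / timestamp, base / "results" / timestamp)


def read_pid(pid_text: str) -> int:
    """Parse the contents of eval.pid into an integer process id."""
    return int(pid_text.strip())


# Illustrative timestamp only; use the value printed by the launch script.
paths = run_paths("UI-Voyager", "20260403_120000")
```

Reading `paths.pid_file` and passing the result of `read_pid` to `os.kill` with `signal.SIGTERM` would mirror the manual `kill` command shown earlier.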
If you find this work useful, please consider giving it a star and citing it:
@misc{lin2026uivoyager,
title={UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience},
author={Zichuan Lin and Feiyu Liu and Yijun Yang and Jiafei Lyu and Yiming Gao and Yicheng Liu and Zhicong Lu and Yangbin Yu and Mingyu Yang and Junyou Li and Deheng Ye and Jie Jiang},
year={2026},
eprint={2603.24533},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2603.24533},
}
