MacArena — Virtual Machine Images
Pre-built macOS virtual machine images for the MacArena benchmark — a framework for evaluating computer-use agents on live macOS environments.
Contents
This bucket contains two VM images:
| Folder | VM Name | Use with |
|---|---|---|
| MacArena.utm | macarena | MacArena custom tasks (evaluation_examples/macarena/) |
| osworld.utm | osworld | OSWorld + macOSWorld tasks (evaluation_examples/osworld/, evaluation_examples/macosworld/) |
Requirements
- macOS host machine
- UTM — download from github.com/utmapp/UTM (Apache 2.0)
Setup
1. Install UTM
brew install --cask utm
Or download directly from https://github.com/utmapp/UTM.
2. Download a VM image
Download the desired .utm folder from this bucket.
3. Import into UTM
Double-click the downloaded .utm bundle (UTM imports it automatically), or drag it into the UTM window.
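VM images tend to be large, so an interrupted download can leave an incomplete bundle that fails on import. As a minimal sketch, assuming the downloaded .utm bundle is an ordinary directory on disk, the hypothetical helper below (not part of MacArena or UTM) reports how many top-level items the bundle holds and its total size, which can be compared against the figures shown on the download page:

```python
from pathlib import Path


def describe_utm_bundle(path):
    """Return (top_level_items, total_bytes) for a .utm bundle directory.

    Hypothetical helper for illustration; it only inspects the file system
    and does not validate the VM configuration itself.
    """
    bundle = Path(path)
    if bundle.suffix != ".utm" or not bundle.is_dir():
        raise ValueError(f"{bundle} is not a .utm bundle directory")
    # Count only the entries directly inside the bundle.
    top_level_items = sum(1 for _ in bundle.iterdir())
    # Sum the size of every regular file at any depth.
    total_bytes = sum(f.stat().st_size for f in bundle.rglob("*") if f.is_file())
    return top_level_items, total_bytes
```

For example, `describe_utm_bundle("MacArena.utm")` could be run from the download directory before importing the image.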
4. Run the benchmark
Clone the MacArena repository and point the runner at the imported VM:
# Manual mode
python3 macos_test.py --vm_name macarena --dir ./evaluation_examples/macarena/
# Automated mode
python3 -m runners.run_general \
  --path_to_vm "macarena" \
  --test_all_meta_path "evaluation_examples/test_split.json" \
  --model "ByteDance-Seed/UI-TARS-1.5-7B" \
  --base_url "<url_to_model>" \
  --max_steps 15
Use --vm_name osworld when running OSWorld or macOSWorld tasks against osworld.utm.
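When switching between the two VMs from a script, it can help to build the manual-mode command programmatically. The sketch below is illustrative only: `manual_mode_command` is a hypothetical helper, and it uses just the `--vm_name` and `--dir` flags shown above, with the OSWorld task directory taken from the Contents table.

```python
import shlex


def manual_mode_command(vm_name, tasks_dir):
    """Build the manual-mode invocation as an argument list.

    Hypothetical helper; it only assembles the command and does not run it.
    """
    return ["python3", "macos_test.py", "--vm_name", vm_name, "--dir", tasks_dir]


cmd = manual_mode_command("osworld", "./evaluation_examples/osworld/")
# Render the list as a shell-safe string for logging or copy-paste.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` from inside the cloned repository would launch the run without going through shell quoting.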
Which VM do I need?
- Running MacArena tasks → use MacArena.utm with --vm_name macarena
- Running OSWorld or macOSWorld tasks → use osworld.utm with --vm_name osworld
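The same choice can be written as a small lookup table. This is a sketch under assumptions: the suite labels used as keys ("macarena", "osworld", "macosworld") are hypothetical names chosen for illustration, while the bundle and VM names come from the list above.

```python
# Maps a task-suite label (hypothetical) to (bundle to import, --vm_name value).
VM_FOR_SUITE = {
    "macarena": ("MacArena.utm", "macarena"),
    "osworld": ("osworld.utm", "osworld"),
    "macosworld": ("osworld.utm", "osworld"),
}


def vm_for_suite(suite):
    """Return (bundle, vm_name) for a task suite, ignoring letter case."""
    try:
        return VM_FOR_SUITE[suite.lower()]
    except KeyError:
        known = ", ".join(sorted(VM_FOR_SUITE))
        raise ValueError(f"unknown suite {suite!r}; expected one of: {known}") from None


bundle, vm_name = vm_for_suite("macOSWorld")
print(f"import {bundle}, then run with --vm_name {vm_name}")
```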
Legal
Use of these VM images is subject to Apple's macOS Software License Agreement. You are responsible for ensuring you have a valid license to run macOS on your hardware. Tasks sourced from macOSWorld are licensed under CC BY-NC 4.0 and may not be used for commercial purposes.
See the MacArena repository for the full legal disclaimer.
Citation
If you use these VM images in your research, please cite:
@misc{muryn-etal-2026-aiwild-macarena,
  author = {Victor Muryn and Maksym Shamrai and Sofiia Mazepa and Yehor Khodysko},
  title = {MacArena: Benchmarking Computer Use Agents on an Online macOS Environment},
  month = {June},
  year = {2026},
  eprint = {2606.06560},
  eprinttype = {arxiv},
  eprintclass = {cs.LG},
  url = {https://arxiv.org/abs/2606.06560},
  urldate = {2026-06-08},
  note = {\emph{Accepted to AIWILD @ ICML 2026}},
}