# MacArena — Virtual Machine Images
[![macOS](https://img.shields.io/badge/platform-macOS-brightgreen.svg?logo=apple)](https://www.apple.com/macos/)
[![arXiv](https://img.shields.io/badge/arXiv-2606.06560-b31b1b.svg)](https://arxiv.org/abs/2606.06560)
[![GitHub](https://img.shields.io/badge/GitHub-MacArena-181717?logo=github)](https://github.com/MacPaw/MacArena)
Pre-built macOS virtual machine images for the [MacArena benchmark](https://github.com/MacPaw/MacArena) — a framework for evaluating computer-use agents on live macOS environments.
---
## Contents
This bucket contains two VM images:
| Folder | VM Name | Use with |
|--------|---------|----------|
| `MacArena.utm` | `macarena` | MacArena custom tasks (`evaluation_examples/macarena/`) |
| `osworld.utm` | `osworld` | OSWorld + macOSWorld tasks (`evaluation_examples/osworld/`, `evaluation_examples/macosworld/`) |
---
## Requirements
- **macOS** host machine
- **UTM** — download from [github.com/utmapp/UTM](https://github.com/utmapp/UTM) (Apache 2.0)
---
## Setup
### 1. Install UTM
```bash
brew install --cask utm
```
Or download directly from [https://github.com/utmapp/UTM](https://github.com/utmapp/UTM).
### 2. Download a VM image
Download the `.utm` folder you need from this bucket: `MacArena.utm` or `osworld.utm`.
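Before importing, it can help to confirm that the download is complete. The sketch below assumes a UTM bundle is a directory with a `config.plist` file at its top level; that is an assumption about UTM's package layout, not something this bucket documents. `check_utm_bundle` is a hypothetical helper name, not part of MacArena or UTM.

```bash
# Hypothetical helper: sanity-check a downloaded .utm bundle.
# Assumption: a UTM bundle is a directory with a config.plist at its top level.
check_utm_bundle() {
  bundle="$1"
  if [ ! -d "$bundle" ]; then
    echo "not a directory: $bundle" >&2
    return 1
  fi
  if [ ! -f "$bundle/config.plist" ]; then
    echo "missing config.plist in $bundle (incomplete download?)" >&2
    return 1
  fi
  echo "ok: $bundle"
}

# Usage (the path is only an example):
#   check_utm_bundle ~/Downloads/MacArena.utm
```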
### 3. Import into UTM
Double-click the downloaded `.utm` bundle and UTM will import it automatically. Alternatively, drag it into the UTM window.
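To confirm the import from a terminal, one option is the `utmctl` command-line tool bundled with UTM. The sketch below assumes `utmctl` sits at its default location inside `UTM.app` and that `utmctl list` prints one line per VM with the VM name as a separate word. `vm_is_registered` is a hypothetical helper, not part of UTM or MacArena.

```bash
# Assumed default path of the utmctl CLI that ships inside UTM.app;
# set UTMCTL beforehand if UTM is installed elsewhere.
UTMCTL="${UTMCTL:-/Applications/UTM.app/Contents/MacOS/utmctl}"

# Hypothetical helper: succeed if a VM with the given name is listed by UTM.
# Assumption: `utmctl list` prints one VM per line, name as a separate word.
vm_is_registered() {
  "$UTMCTL" list | grep -Fw -- "$1" >/dev/null
}

# Usage:
#   vm_is_registered macarena && echo "macarena is imported"
```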
### 4. Run the benchmark
Clone the MacArena repository and point the runner at the imported VM:
```bash
# Manual mode
python3 macos_test.py --vm_name macarena --dir ./evaluation_examples/macarena/

# Automated mode
python3 -m runners.run_general \
    --path_to_vm "macarena" \
    --test_all_meta_path "evaluation_examples/test_split.json" \
    --model "ByteDance-Seed/UI-TARS-1.5-7B" \
    --base_url "<url_to_model>" \
    --max_steps 15
```
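To run OSWorld or macOSWorld tasks against `osworld.utm`, pass `--vm_name osworld` instead and point `--dir` at the matching task folder (`evaluation_examples/osworld/` or `evaluation_examples/macosworld/`).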
---
## Which VM do I need?
- Running **MacArena tasks** → use `MacArena.utm` with `--vm_name macarena`
- Running **OSWorld or macOSWorld tasks** → use `osworld.utm` with `--vm_name osworld`
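
The mapping above can be sketched as a small shell helper that picks the VM name from a task directory and prints the matching manual-mode command. `vm_for_tasks` and `print_manual_cmd` are hypothetical names, not part of the MacArena runner, and the command is only printed, never executed.

```bash
# Hypothetical helper: choose the VM name for a task directory,
# following the MacArena / OSWorld / macOSWorld split described above.
vm_for_tasks() {
  case "$1" in
    *evaluation_examples/macarena*)   echo "macarena" ;;
    *evaluation_examples/osworld*)    echo "osworld" ;;
    *evaluation_examples/macosworld*) echo "osworld" ;;
    *) echo "unknown task directory: $1" >&2; return 1 ;;
  esac
}

# Print (not run) the manual-mode command for a task directory.
print_manual_cmd() {
  vm="$(vm_for_tasks "$1")" || return 1
  echo "python3 macos_test.py --vm_name $vm --dir $1"
}

# Usage:
#   print_manual_cmd ./evaluation_examples/macosworld/
```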
---
## Legal
Use of these VM images is subject to Apple's macOS Software License Agreement. You are responsible for ensuring you have a valid license to run macOS on your hardware. Tasks sourced from macOSWorld are licensed under **CC BY-NC 4.0** and may not be used for commercial purposes.
See the [MacArena repository](https://github.com/MacPaw/MacArena) for the full legal disclaimer.
---
## Citation
If you use these VM images in your research, please cite:
```bibtex
@misc{muryn-etal-2026-aiwild-macarena,
  author       = {Victor Muryn and Maksym Shamrai and Sofiia Mazepa and Yehor Khodysko},
  title        = {MacArena: Benchmarking Computer Use Agents on an Online macOS Environment},
  month        = {June},
  year         = {2026},
  eprint       = {2606.06560},
  eprinttype   = {arxiv},
  eprintclass  = {cs.LG},
  url          = {https://arxiv.org/abs/2606.06560},
  urldate      = {2026-06-08},
  note         = {\emph{Accepted to AIWILD @ ICML 2026}},
}
```
