Eliot 9B
Eliot 9B is an Apple-platform agent model: a 9B open-weight tool-calling model built to operate a Mac through a guarded macOS harness.
It is trained for the world where the computer is the interface. Eliot reads macOS Accessibility trees, reasons over apps and UI elements, and emits structured actions such as click, type, open_app, read, web_search, ask_user, and done. It is designed for local-first assistants, Mac automation, app workflows, and agentic desktop control where every action needs to be observable, auditable, and guardable.
This repository is the canonical full model: the merged BF16 Hugging Face checkpoint for Eliot v1.2.
Why Eliot
Most language models are trained to chat. Eliot is trained to act.
Instead of seeing pixels, Eliot sees the same semantic surface that assistive technologies use on macOS: roles, labels, values, app names, and element IDs. That makes the model fast to prompt, easier to inspect, and much easier to wrap in deterministic safety logic than screenshot-only computer-use systems.
Eliot is especially tuned for Apple-platform workflows:
| Capability | What it means |
|---|---|
| macOS Accessibility control | Operates through structured UI trees rather than raw screenshots. |
| One tool call per turn | Predictable agent loop for harnesses and product integrations. |
| Apple Silicon path | A separate MLX 4-bit build is available for resident local Mac use. |
| Harness-first safety | Destructive, external, credential, payment, network, and write actions must be intercepted by the runtime. |
| Local-first design | Built for assistants that can run close to user data instead of shipping every screen state to a remote service. |
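To make the "semantic surface" concrete, here is a minimal sketch of how a harness might flatten Accessibility elements into the text the model reads. The TASK/APP/UI/RESULT layout is copied from the example prompt in the Transformers Quick Start section of this card. The `UIElement` class, its field names, and the ` = value` notation are assumptions made for illustration, not the reference harness format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UIElement:
    """One Accessibility node (hypothetical, simplified shape)."""
    element_id: int
    role: str                      # e.g. "button", "menu item"
    label: str                     # accessible label or title
    value: Optional[str] = None    # current value, if the element has one


def render_observation(task: str, app: str, elements: list[UIElement],
                       last_result: str = "none") -> str:
    """Render one observation in the TASK/APP/UI/RESULT layout."""
    lines = [f"TASK: {task}", f"APP: {app}", "UI:"]
    for el in elements:
        line = f"[{el.element_id}] {el.role} {el.label}"
        if el.value is not None:
            line += f" = {el.value}"  # assumed notation for element values
        lines.append(line)
    lines.append(f"RESULT: {last_result}")
    return "\n".join(lines)


observation = render_observation(
    task="Open Notes and create a note called Project Ideas.",
    app="Finder",
    elements=[UIElement(1, "menu item", "File"), UIElement(2, "button", "Search")],
)
print(observation)
```

Because the observation is plain text keyed by element IDs, a harness can log it verbatim and check any proposed action against the IDs it actually showed the model.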
Model Lineage
Qwen/Qwen3.5-9B
-> Eliot QLoRA v1.2
-> merged BF16 checkpoint
-> Eliot 9B
Eliot is fine-tuned from Qwen/Qwen3.5-9B using a Mac-agent dataset made from deterministic synthetic episodes, sanitized macOS Accessibility-tree contexts, and compatible replay data. The training target is structured tool use, confirmation behavior, failure recovery, and honest task completion inside a guarded desktop harness.
Downloads
| Artifact | Repo | Best For |
|---|---|---|
| Full BF16 checkpoint | nuroai/eliot-9b | GPU serving, vLLM, research, further conversion. |
| MLX 4-bit build | nuroai/eliot-9b-mlx-4bit | Apple Silicon Macs and local resident assistant use. |
Download the full model:
huggingface-cli download nuroai/eliot-9b --local-dir eliot-9b
Run the Apple Silicon build:
pip install mlx-lm
mlx_lm.server --model nuroai/eliot-9b-mlx-4bit --port 8081 --host localhost
Serving With vLLM
vllm serve nuroai/eliot-9b \
  --served-model-name eliot-9b \
  --max-model-len 32768 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder \
  --reasoning-parser qwen3
The GH200/vLLM validation path served this merged checkpoint as BF16 with no quantization.
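Once the server is up, a harness can talk to it over the OpenAI-compatible chat completions endpoint. The sketch below uses only the Python standard library and assumes the server listens on vLLM's default port 8000. The `click` tool schema, including the `element_id` parameter, is a hypothetical placeholder; the real tool definitions come from the harness and the recommended system prompt.

```python
import json
import urllib.request

# Assumption: the vLLM server started above is reachable on the default port.
BASE_URL = "http://localhost:8000/v1/chat/completions"

# Hypothetical schema for a single tool; parameter names are illustrative only.
CLICK_TOOL = {
    "type": "function",
    "function": {
        "name": "click",
        "description": "Click, press, or select an accessible UI element.",
        "parameters": {
            "type": "object",
            "properties": {"element_id": {"type": "integer"}},
            "required": ["element_id"],
        },
    },
}


def build_payload(system_prompt: str, observation: str) -> dict:
    """Build a chat completions request body for one agent turn."""
    return {
        "model": "eliot-9b",  # must match --served-model-name
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": observation},
        ],
        "tools": [CLICK_TOOL],
        "temperature": 0,
        "max_tokens": 256,
    }


def send(payload: dict, url: str = BASE_URL) -> dict:
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_payload(
    "You are Eliot, an on-device computer-use agent. "
    "Respond with exactly one tool call per turn.",
    "TASK: Open Notes.\nAPP: Finder\nUI:\n[1] menu item File\nRESULT: none",
)
print(json.dumps(payload, indent=2))
# With a server running:
#   reply = send(payload)
#   tool_calls = reply["choices"][0]["message"].get("tool_calls")
```

With `--enable-auto-tool-choice` set, parsed calls should arrive in the message's `tool_calls` field rather than as raw text, which keeps parsing out of the harness.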
Transformers Quick Start
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nuroai/eliot-9b"

# Load the merged BF16 checkpoint and place it on the available devices.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# One observation: task, frontmost app, Accessibility tree, previous result.
messages = [
    {
        "role": "system",
        "content": "You are Eliot, an on-device computer-use agent. Respond with exactly one tool call per turn."
    },
    {
        "role": "user",
        "content": "TASK: Open Notes and create a note called Project Ideas.\nAPP: Finder\nUI:\n[1] menu item File\n[2] button Search\nRESULT: none"
    },
]

# Render the chat template as text, with thinking disabled.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding keeps the emitted tool call deterministic.
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, keeping special tokens visible.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=False))
Action Interface
Eliot is intended to be used with a harness that supplies the current task, frontmost app, Accessibility tree, and previous action result. The model then emits exactly one tool call.
| Tool | Purpose |
|---|---|
| click | Click, press, or select an accessible UI element. |
| type | Focus an element and replace or enter text. |
| open_app | Launch or foreground a macOS app. |
| read | Read accessible text or values. |
| run_shell | Use local shell only when allowed by policy. |
| web_search | Search the web when live facts are required. |
| invoke_intent | Call a structured app or system intent exposed by the harness. |
| ask_user | Ask for clarification or approval. |
| done | Finish honestly with the outcome. |
The recommended system prompt is included as eliot_system_prompt.txt.
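The one-tool-call-per-turn contract can be enforced with a small check in the harness loop. The following is a rough sketch, not the reference harness: it assumes tool calls have already been parsed into dictionaries with `name` and `arguments` keys, and the argument names in the scripted example (`app`, `outcome`) are made up. A stub stands in for the model so the loop runs without a server or a Mac.

```python
from typing import Callable

# Tool names taken from the table above.
ALLOWED_TOOLS = {
    "click", "type", "open_app", "read", "run_shell",
    "web_search", "invoke_intent", "ask_user", "done",
}


def validate_turn(tool_calls: list[dict]) -> dict:
    """Return the single tool call, or raise if the turn breaks the contract."""
    if len(tool_calls) != 1:
        raise ValueError(f"expected exactly one tool call, got {len(tool_calls)}")
    call = tool_calls[0]
    if call.get("name") not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {call.get('name')!r}")
    return call


def run_episode(propose: Callable[[str], list[dict]],
                execute: Callable[[dict], str],
                max_turns: int = 10) -> list[dict]:
    """Loop: pass the last result to the model, validate, execute, repeat."""
    history = []
    result = "none"
    for _ in range(max_turns):
        call = validate_turn(propose(result))
        history.append(call)
        if call["name"] == "done":
            break
        result = execute(call)
    return history


# Scripted stand-in for the model; argument names are hypothetical.
scripted = iter([
    [{"name": "open_app", "arguments": {"app": "Notes"}}],
    [{"name": "done", "arguments": {"outcome": "Notes is open."}}],
])
history = run_episode(
    propose=lambda last_result: next(scripted),
    execute=lambda call: f"ok: {call['name']}",
)
print([call["name"] for call in history])  # ['open_app', 'done']
```

Rejecting a malformed turn before execution, and capping the number of turns, keeps a confused model from acting on the machine or looping indefinitely.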
Evaluation Snapshot
Representative held-out MacBench and live-gate results from the v1.2 release process:
| Check | Result |
|---|---|
| MacBench action match, BF16 | 97.7% |
| Destructive confirmation | 100.0% |
| Abstention handling | 100.0% |
| Error recovery | 100.0% |
| Live destructive confirmation, shipping quantization | 100.0% |
| Guarded destructive interception | 100.0% |
| Benign over-ask, live held-out set | 1 / 4 |
These numbers measure the model and reference harness on structured Mac-agent tasks. They are not a claim that the model can safely operate a computer without a deterministic safety layer.
Safety Model
Eliot is not the safety boundary.
The model is trained to ask before destructive actions, but real deployments must enforce policy outside the model. A production harness should intercept destructive, external, credential, payment, network, file-write, send, delete, and shell actions before execution. The reference Eliot/OpenSiri harness follows this design: the model proposes an action, the guard decides whether it can run, needs approval, or must be denied.
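The propose-then-decide split can be sketched as a pure function from a proposed tool call to one of three outcomes. This is an illustrative policy, not the reference guard: the grouping of tools into read-only, denied, and approval-required sets is an assumption, and a real deployment would also inspect arguments, target apps, and user settings.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


# Hypothetical policy sets, keyed by tool name only.
KNOWN_TOOLS = {
    "click", "type", "open_app", "read", "run_shell",
    "web_search", "invoke_intent", "ask_user", "done",
}
READ_ONLY_TOOLS = {"read", "ask_user", "done"}
DENIED_TOOLS = {"run_shell"}  # example: shell disabled in this deployment


def guard(call: dict) -> Decision:
    """Decide whether a proposed tool call may run, needs approval, or is denied."""
    name = call.get("name")
    if name not in KNOWN_TOOLS or name in DENIED_TOOLS:
        return Decision.DENY
    if name in READ_ONLY_TOOLS:
        return Decision.ALLOW
    # Conservative default: anything that can change state or reach the
    # network waits for explicit user approval.
    return Decision.NEEDS_APPROVAL


for proposed in ({"name": "read"}, {"name": "click"}, {"name": "run_shell"}):
    print(proposed["name"], "->", guard(proposed).value)
```

Keeping the guard deterministic and independent of model output means its behavior can be unit-tested and audited separately from the model.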
Do not deploy Eliot as an unguarded remote-control agent.
Intended Use
Eliot is intended for:
- macOS assistant research and prototyping.
- Local-first desktop automation.
- Tool-calling and computer-use harness development.
- Apple Silicon assistant experiments using the MLX 4-bit build.
- Evaluation of Accessibility-tree-based agent interfaces.
Out-of-scope uses include stealth automation, credential extraction, unauthorized access, bypassing user consent, spam, payment automation without approval, or any deployment where user data is acted on without clear visibility and control.
License And Attribution
Eliot is released under Apache 2.0. The base model is Qwen/Qwen3.5-9B, also Apache 2.0.
Eliot is an independent project by Nuro AI Labs. It is designed for Apple-platform and macOS workflows, but it is not created by, endorsed by, sponsored by, or affiliated with Apple Inc.