Instructions to use cabal-ai/mantis with libraries, inference providers, notebooks, and local apps.

Libraries

How to use cabal-ai/mantis with PEFT:

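from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model that the adapter was fine-tuned from
base_model = AutoModelForCausalLM.from_pretrained("Hcompany/Holo3-35B-A3B")
# Attach the Mantis LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(base_model, "cabal-ai/mantis")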

Transformers

How to use cabal-ai/mantis with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="cabal-ai/mantis")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("cabal-ai/mantis", dtype="auto")


vLLM

How to use cabal-ai/mantis with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "cabal-ai/mantis"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "cabal-ai/mantis",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
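
The curl request above can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the response shape (`choices[0].message.content`) is assumed to follow the usual OpenAI-compatible chat format. Nothing is sent until `send_chat_request` is called against a running server.

```python
import json
import urllib.request


def build_chat_request(prompt, image_url, model="cabal-ai/mantis"):
    """Build the same JSON body as the curl example: one user turn with text and an image URL."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def send_chat_request(payload, base_url="http://localhost:8000", timeout=120):
    """POST the payload to the server's /v1/chat/completions endpoint and return the parsed JSON."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (requires a server started with `vllm serve "cabal-ai/mantis"`):
# payload = build_chat_request(
#     "Describe this image in one sentence.",
#     "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
# )
# reply = send_chat_request(payload)
# print(reply["choices"][0]["message"]["content"])
```

Passing a different `base_url` (for example `http://localhost:30000`) should let the same helper target another OpenAI-compatible server.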

Use Docker

docker model run hf.co/cabal-ai/mantis

SGLang

How to use cabal-ai/mantis with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "cabal-ai/mantis" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "cabal-ai/mantis",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "cabal-ai/mantis" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using the same curl request as in the pip-based SGLang setup above.

Docker Model Runner
How to use cabal-ai/mantis with Docker Model Runner:
```
docker model run hf.co/cabal-ai/mantis
```

Mantis

Mantis is an open-weights computer-use model checkpoint fine-tuned from Hcompany/Holo3-35B-A3B. It is trained on graded Mantis agent rollouts from Augur, using supervised fine-tuning over real browser/workflow traces rather than generic chat data.

This release includes both the PEFT LoRA adapter and a ready-to-serve merged.Q8_0.gguf artifact for llama.cpp-style serving.

Why This Is Better For Mantis

Mantis is not a general chat fine-tune. It is specialized for the slow improvement loop of a computer-use agent:

Agent-native data: trained from Mantis rollouts with task context, model I/O, rewards, and action traces.
Computer-use alignment: targets GUI/navigation behavior on realistic browser tasks instead of instruction-following only.
Deployment-ready release: ships the small adapter plus a merged Q8_0 GGUF, so downstream serving stacks can either compose with the base model or run the merged artifact directly.
Auditable provenance: checkpoint id sft-c3e0d799f432-f00fa0 ties this release to the training data/config hash used by the Mantis trainer registry.

The internal gate, evaluated on a frozen holdout set, did not show a reliable improvement over the base model, so this page does not claim a benchmark win over Holo3. The value of this release lies in open access to the specialized Mantis adaptation, its reproducible training pipeline, and its serving artifact.
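
This card does not describe how the gate is computed. Purely as an illustration of what "no reliable promotion" can mean in a champion/challenger loop, the sketch below promotes a challenger only when the lower bound of an approximate 95% confidence interval on the paired success-rate difference is above zero. The function name `should_promote`, the paired 0/1 success format, and the normal-approximation rule are all assumptions, not the Mantis trainer's actual method.

```python
import math
import statistics


def should_promote(challenger, champion, z=1.96):
    """Decide promotion from per-task success flags (1 or 0) on the same holdout tasks.

    Returns (promote, mean_diff, lower_bound). Hypothetical rule: promote only if
    the lower confidence bound on the mean paired difference is strictly positive.
    """
    if len(challenger) != len(champion) or len(challenger) < 2:
        raise ValueError("need two equally sized result lists with at least 2 tasks")
    # Paired difference per task: +1 challenger-only win, -1 champion-only win, 0 tie
    diffs = [c - b for c, b in zip(challenger, champion)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    lower_bound = mean_diff - z * std_err
    return lower_bound > 0, mean_diff, lower_bound
```

Under this rule, a challenger that trades wins and losses with the champion on a small holdout is not promoted, even if its raw success rate is slightly higher.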

Files

adapter_model.safetensors: PEFT LoRA adapter weights.
adapter_config.json: PEFT adapter configuration.
adapter.gguf: converted adapter artifact.
merged.Q8_0.gguf: merged full-model Q8_0 GGUF for direct serving.
Tokenizer and processor files: copied from the training artifact.
training_args.bin: trainer metadata from the SFT run.

Intended Use

Use this checkpoint for research and development of GUI agents, browser automation agents, and Mantis-compatible computer-use systems.

This model may be useful when you need:

a Holo3-derived checkpoint adapted to Mantis rollouts;
an open adapter for further fine-tuning;
a ready GGUF artifact for serving experiments;
a transparent artifact from a champion/challenger training loop.

Limitations

This model can make incorrect UI decisions and should not be allowed to take high-impact actions without supervision.
The released checkpoint is specialized for Mantis-style workflows; behavior outside that domain may not improve over the base model.
The internal gate found no reliable improvement over the base model on the frozen holdout set available at release time.
Computer-use agents can interact with external systems. Use sandboxing, allowlists, rate limits, and human approval for sensitive workflows.
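
The allowlist and human-approval safeguards mentioned above could be wired in front of an agent roughly as follows. This is a minimal sketch under stated assumptions: `ProposedAction`, `review_action`, the example host, and the set of sensitive action kinds are all hypothetical and are not part of Mantis or any specific agent framework. Sandboxing and rate limiting are out of scope here.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class ProposedAction:
    kind: str         # hypothetical action type, e.g. "click", "type", "delete"
    url: str          # page the action would run on
    description: str  # human-readable summary shown to a reviewer


# Illustrative defaults; a real deployment would supply its own policy.
ALLOWED_HOSTS = frozenset({"example.com"})
SENSITIVE_KINDS = frozenset({"delete", "send_email", "submit_payment"})


def review_action(action, allowed_hosts=ALLOWED_HOSTS,
                  sensitive_kinds=SENSITIVE_KINDS, approve=None):
    """Return "blocked", "allowed", "needs_approval", or "approved" for a proposed action."""
    host = urlparse(action.url).hostname or ""
    if host not in allowed_hosts:
        return "blocked"  # never act outside the allowlist
    if action.kind in sensitive_kinds:
        # Sensitive actions run only if a human-approval callback says yes
        if approve is not None and approve(action):
            return "approved"
        return "needs_approval"
    return "allowed"
```

The agent loop would execute an action only when the result is `"allowed"` or `"approved"`, and surface `"needs_approval"` cases to a human reviewer.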

Base Model And License

This model is fine-tuned from Hcompany/Holo3-35B-A3B, whose model card declares the Apache-2.0 license. This release is also published under Apache-2.0 and retains upstream attribution.

Training

Base model: Hcompany/Holo3-35B-A3B
Method: supervised fine-tuning with TRL
Checkpoint id: sft-c3e0d799f432-f00fa0
Data source: graded Mantis rollouts pulled from Augur
Training stack: PEFT LoRA + TRL SFT

Citation

If you use the base model, cite Holo3:

@misc{hai2025holo3modelfamily,
  title={Holo3 - Open Foundation Models for Navigation and Computer Use Agents},
  author={H Company},
  year={2026},
  url={https://huggingface.co/Hcompany/Holo3-35B-A3B}
}

If you use the trainer stack, cite TRL:

@software{vonwerra2020trl,
  title={{TRL: Transformers Reinforcement Learning}},
  author={von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license={Apache-2.0},
  url={https://github.com/huggingface/trl},
  year={2020}
}