Instructions for using cabal-ai/mantis with libraries and local apps.
- Libraries
- PEFT
How to use cabal-ai/mantis with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the Mantis LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("Hcompany/Holo3-35B-A3B")
model = PeftModel.from_pretrained(base_model, "cabal-ai/mantis")
- Transformers
How to use cabal-ai/mantis with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="cabal-ai/mantis")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("cabal-ai/mantis", dtype="auto")
- Local Apps
- vLLM
How to use cabal-ai/mantis with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "cabal-ai/mantis"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "cabal-ai/mantis",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Describe this image in one sentence."},
                    {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}}
                ]
            }
        ]
    }'
Use Docker
docker model run hf.co/cabal-ai/mantis
- SGLang
How to use cabal-ai/mantis with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "cabal-ai/mantis" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "cabal-ai/mantis",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Describe this image in one sentence."},
                    {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}}
                ]
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "cabal-ai/mantis" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API): send the same
# request as in the pip-based SGLang example to http://localhost:30000/v1/chat/completions.
- Docker Model Runner
How to use cabal-ai/mantis with Docker Model Runner:
docker model run hf.co/cabal-ai/mantis
Mantis
Mantis is an open-weights computer-use model checkpoint fine-tuned from Hcompany/Holo3-35B-A3B. It is trained on graded Mantis agent rollouts from Augur, using supervised fine-tuning over real browser/workflow traces rather than generic chat data.
This release includes both the PEFT LoRA adapter and a ready-to-serve
merged.Q8_0.gguf artifact for llama.cpp-style serving.
Why This Is Better For Mantis
Mantis is not a general chat fine-tune. It is specialized for the slow improvement loop of a computer-use agent:
- Agent-native data: trained from Mantis rollouts with task context, model I/O, rewards, and action traces.
- Computer-use alignment: targets GUI/navigation behavior on realistic browser tasks instead of instruction-following only.
- Deployment-ready release: ships the small adapter plus a merged Q8_0 GGUF, so downstream serving stacks can either compose with the base model or run the merged artifact directly.
- Auditable provenance: checkpoint id sft-c3e0d799f432-f00fa0 ties this release to the training data/config hash used by the Mantis trainer registry.
The current internal frozen holdout gate did not establish a reliable promotion over the base model, so this page does not claim a benchmark win over Holo3. The value of this release is open access to the specialized Mantis adaptation, its reproducible training pipeline, and its serving artifact.
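This card does not describe how the holdout gate decides what counts as a reliable promotion. Purely as an illustration, the sketch below assumes per-task pass/fail results for the base model (champion) and the fine-tuned model (challenger) on the same holdout tasks, and promotes only when a paired bootstrap lower bound on the success-rate difference is above zero. The function name, the 95% interval, and the bootstrap procedure are all assumptions, not the Mantis trainer's actual gate.

```python
import random
from typing import Sequence


def promotion_gate(
    champion: Sequence[int],
    challenger: Sequence[int],
    n_boot: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> dict:
    """Paired bootstrap over holdout tasks (hypothetical gate, not the Mantis one).

    champion / challenger hold 1 (task solved) or 0 (task failed) for the
    same tasks in the same order.
    """
    if len(champion) != len(challenger) or not champion:
        raise ValueError("need two non-empty result lists of equal length")
    n = len(champion)
    # Per-task difference: +1 challenger-only win, -1 champion-only win, 0 tie.
    diffs = [c2 - c1 for c1, c2 in zip(champion, challenger)]
    observed = sum(diffs) / n

    # Resample tasks with replacement and record the mean difference each time.
    rng = random.Random(seed)
    boot_means = []
    for _ in range(n_boot):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        boot_means.append(sum(sample) / n)
    boot_means.sort()
    lower = boot_means[int((alpha / 2) * n_boot)]
    upper = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return {
        "delta": observed,
        "ci": (lower, upper),
        # Promote only if the whole interval sits above zero.
        "promote": lower > 0.0,
    }
```

Under this kind of rule, a small positive delta whose interval still includes zero would be reported as "no reliable promotion", which matches the wording used on this page.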
Files
- adapter_model.safetensors: PEFT LoRA adapter weights.
- adapter_config.json: PEFT adapter configuration.
- adapter.gguf: converted adapter artifact.
- merged.Q8_0.gguf: merged full-model Q8_0 GGUF for direct serving.
- tokenizer/processor files copied from the training artifact.
- training_args.bin: trainer metadata from the SFT run.
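A minimal sketch of fetching one of these files and preparing a llama.cpp server command is shown below. It assumes the files sit at the root of the cabal-ai/mantis repository under the names listed above, that `huggingface_hub` is installed, and that a `llama-server` binary from llama.cpp is on the PATH. The helper names, host, and port are illustrative choices, not part of this release.

```python
from pathlib import Path
from typing import List

REPO_ID = "cabal-ai/mantis"
GGUF_FILE = "merged.Q8_0.gguf"  # file name as listed in the Files section


def download_artifact(filename: str = GGUF_FILE) -> Path:
    """Fetch one file from the Hub repo into the local cache and return its path."""
    # Imported lazily so the command builder below works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=REPO_ID, filename=filename))


def llama_server_command(model_path: Path, port: int = 8080) -> List[str]:
    """Build a llama.cpp `llama-server` invocation for a local GGUF file."""
    return [
        "llama-server",
        "-m", str(model_path),
        "--host", "127.0.0.1",  # bind locally only; an assumption for safety
        "--port", str(port),
    ]
```

A typical call would be `subprocess.run(llama_server_command(download_artifact()), check=True)`. The same `download_artifact` helper can fetch `adapter_model.safetensors` or `adapter_config.json` by passing a different file name.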
Intended Use
Use this checkpoint for research and development of GUI agents, browser automation agents, and Mantis-compatible computer-use systems.
This model may be useful when you need:
- a Holo3-derived checkpoint adapted to Mantis rollouts;
- an open adapter for further fine-tuning;
- a ready GGUF artifact for serving experiments;
- a transparent artifact from a champion/challenger training loop.
Limitations
- This model can make incorrect UI decisions and should not be allowed to take high-impact actions without supervision.
- The released checkpoint is specialized for Mantis-style workflows; behavior outside that domain may not improve over the base model.
- The internal gate found no reliable promotion over the base model on the frozen holdout available at release time.
- Computer-use agents can interact with external systems. Use sandboxing, allowlists, rate limits, and human approval for sensitive workflows.
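The safeguards in the last item can be combined into a small policy layer that sits between the model and the browser. The sketch below is one possible shape, not something shipped with this release: the `ActionGuard` class, the action dictionary keys (`kind`, `url`), the list of sensitive action kinds, and the default limits are all hypothetical.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List
from urllib.parse import urlparse


@dataclass
class ActionGuard:
    """Hypothetical policy layer placed between the model and the browser."""

    allowed_hosts: set
    max_actions_per_minute: int = 30
    sensitive_kinds: frozenset = frozenset({"submit_form", "purchase", "delete"})
    approve: Callable[[dict], bool] = lambda action: False  # deny by default
    clock: Callable[[], float] = time.monotonic
    _recent: List[float] = field(default_factory=list)

    def check(self, action: dict) -> str:
        """Return "allow" or a "deny:<reason>" string for one proposed action."""
        now = self.clock()
        # Rate limit: keep only timestamps from the last 60 seconds.
        self._recent = [t for t in self._recent if now - t < 60.0]
        if len(self._recent) >= self.max_actions_per_minute:
            return "deny:rate_limit"
        # Allowlist: the target URL's host must be explicitly permitted.
        host = urlparse(action.get("url", "")).hostname or ""
        if host not in self.allowed_hosts:
            return "deny:host_not_allowlisted"
        # Human approval: sensitive kinds need an explicit yes from `approve`.
        if action.get("kind") in self.sensitive_kinds and not self.approve(action):
            return "deny:needs_human_approval"
        self._recent.append(now)
        return "allow"
```

In use, the agent loop would call `guard.check(action)` before executing each proposed action and skip or escalate anything that does not return `"allow"`. The `approve` callback is where a human-in-the-loop prompt would plug in. Sandboxing itself (for example, running the browser in an isolated container) is outside what this snippet covers.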
Base Model And License
This model is fine-tuned from Hcompany/Holo3-35B-A3B, whose model card declares
the Apache-2.0 license. This release is also published under Apache-2.0 and
retains upstream attribution.
Training
- Base model: Hcompany/Holo3-35B-A3B
- Method: supervised fine-tuning with TRL
- Checkpoint id: sft-c3e0d799f432-f00fa0
- Data source: graded Mantis rollouts pulled from Augur
- Training stack: PEFT LoRA + TRL SFT
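The card does not publish the rollout schema or the grading threshold, so the following is only a sketch of how graded rollouts might be turned into supervised examples. It assumes each rollout is a dictionary with `task`, `reward`, and `steps` keys, keeps rollouts at or above a hypothetical `min_reward`, and emits one conversational record per step in the `messages` layout that TRL's SFT trainer accepts. Real Mantis rollouts presumably also carry screenshots, which are omitted here.

```python
from typing import Dict, Iterable, List


def rollouts_to_sft_examples(
    rollouts: Iterable[Dict],
    min_reward: float = 1.0,
) -> List[Dict]:
    """Keep well-graded rollouts and emit one chat-style example per step.

    Each rollout is assumed (hypothetically) to look like:
      {"task": str, "reward": float,
       "steps": [{"observation": str, "action": str}, ...]}
    """
    examples = []
    for rollout in rollouts:
        if rollout["reward"] < min_reward:
            continue  # drop rollouts the grader scored below the threshold
        for step in rollout["steps"]:
            examples.append(
                {
                    "messages": [
                        {
                            "role": "user",
                            "content": f"Task: {rollout['task']}\n{step['observation']}",
                        },
                        # The recorded action becomes the supervised target.
                        {"role": "assistant", "content": step["action"]},
                    ]
                }
            )
    return examples
```

The resulting list could be wrapped with `datasets.Dataset.from_list(...)` and passed to an SFT trainer together with a LoRA configuration. The hyperparameters used for this checkpoint are not stated on this page.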
Citation
If you use the base model, cite Holo3:
@misc{hai2025holo3modelfamily,
title={Holo3 - Open Foundation Models for Navigation and Computer Use Agents},
author={H Company},
year={2026},
url={https://huggingface.co/Hcompany/Holo3-35B-A3B}
}
If you use the trainer stack, cite TRL:
@software{vonwerra2020trl,
title={{TRL: Transformers Reinforcement Learning}},
author={von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
license={Apache-2.0},
url={https://github.com/huggingface/trl},
year={2020}
}
Model tree for cabal-ai/mantis
Base model
Qwen/Qwen3.5-35B-A3B-Base