How to use cabal-ai/mantis with PEFT:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the Mantis LoRA adapter on top of it.
base_model = AutoModelForCausalLM.from_pretrained("Hcompany/Holo3-35B-A3B")
model = PeftModel.from_pretrained(base_model, "cabal-ai/mantis")
```
---
license: apache-2.0
base_model:
- Hcompany/Holo3-35B-A3B
library_name: peft
pipeline_tag: image-text-to-text
tags:
- computer-use
- gui-agent
- multimodal
- action-model
- lora
- gguf
- sft
- trl
- mantis
model_name: Mantis
---

# Mantis

Mantis is an open-weights computer-use model checkpoint fine-tuned from
[Hcompany/Holo3-35B-A3B](https://huggingface.co/Hcompany/Holo3-35B-A3B). It is
trained on graded Mantis agent rollouts from Augur, using supervised fine-tuning
over real browser/workflow traces rather than generic chat data.

This release includes both the PEFT LoRA adapter and a ready-to-serve
`merged.Q8_0.gguf` artifact for llama.cpp-style serving.

## Why This Is Better For Mantis

Mantis is not a general chat fine-tune. It is specialized for the slow
improvement loop of a computer-use agent:

- **Agent-native data**: trained from Mantis rollouts with task context,
  model I/O, rewards, and action traces.
- **Computer-use alignment**: targets GUI/navigation behavior on realistic
  browser tasks rather than instruction-following alone.
- **Deployment-ready release**: ships the small adapter plus a merged Q8_0 GGUF,
  so downstream serving stacks can either compose the adapter with the base
  model or run the merged artifact directly.
- **Auditable provenance**: checkpoint id `sft-c3e0d799f432-f00fa0` ties this
  release to the training data/config hash used by the Mantis trainer registry.

The current internal frozen holdout gate did not show a reliable enough
improvement to promote this checkpoint over the base model, so this page does
**not** claim a benchmark win over Holo3. The value of this release is open
access to the specialized Mantis adaptation, its reproducible training
pipeline, and its serving artifact.
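
To make "no reliable promotion" concrete, here is a minimal sketch of one way a champion/challenger gate on a frozen holdout could be written. It is an illustration only, not the gate used for this release: the function name `promotion_gate`, the paired-bootstrap method, and the `n_boot` and `alpha` defaults are all assumptions. The idea is that the challenger is promoted only if the lower confidence bound on its mean per-task improvement is above zero.

```python
import random


def promotion_gate(base_scores, cand_scores, n_boot=2000, alpha=0.05, seed=0):
    """Paired bootstrap over holdout tasks (hypothetical gate, for illustration).

    base_scores[i] and cand_scores[i] are scores of the base model and the
    candidate on the same holdout task i. Returns (promote, lower_bound).
    """
    if len(base_scores) != len(cand_scores) or not base_scores:
        raise ValueError("need paired, non-empty score lists")

    # Per-task improvement of the candidate over the base model.
    diffs = [c - b for b, c in zip(base_scores, cand_scores)]
    n = len(diffs)
    rng = random.Random(seed)

    # Resample tasks with replacement and record the mean improvement.
    means = sorted(
        sum(rng.choice(diffs) for _ in range(n)) / n for _ in range(n_boot)
    )

    # One-sided lower bound at level alpha; promote only if it is positive.
    lower = means[int(alpha * n_boot)]
    return lower > 0.0, lower
```

Under this kind of gate, a candidate that merely matches the base model, or beats it by less than the noise across tasks, is not promoted, which is consistent with publishing the checkpoint without a benchmark claim.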

## Files

- `adapter_model.safetensors`: PEFT LoRA adapter weights.
- `adapter_config.json`: PEFT adapter configuration.
- `adapter.gguf`: converted adapter artifact.
- `merged.Q8_0.gguf`: merged full-model Q8_0 GGUF for direct serving.
- tokenizer/processor files copied from the training artifact.
- `training_args.bin`: trainer metadata from the SFT run.
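
The files above correspond to two ways of consuming the release: the adapter on top of the base model, or the merged GGUF on its own. The sketch below shows one way to fetch only the files a given path needs. The file names come from the list above; the mode labels and the helpers `files_for` and `fetch` are hypothetical. `hf_hub_download` is the download function from `huggingface_hub`, imported lazily so the selection logic can be used without that package installed.

```python
# File names are taken from the "Files" list; the mode labels are hypothetical.
ARTIFACTS = {
    "adapter": ["adapter_config.json", "adapter_model.safetensors"],
    "merged_gguf": ["merged.Q8_0.gguf"],
}


def files_for(mode):
    """Return the repo files needed for a given consumption mode."""
    try:
        return list(ARTIFACTS[mode])
    except KeyError:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(ARTIFACTS)}")


def fetch(mode, repo_id="cabal-ai/mantis"):
    """Download the files for `mode` and return their local cache paths."""
    # Lazy import: only needed when a download is actually requested.
    from huggingface_hub import hf_hub_download

    return [hf_hub_download(repo_id=repo_id, filename=name) for name in files_for(mode)]
```

For example, `fetch("merged_gguf")` would download only the merged artifact. A Q8_0 file for a 35B-parameter model is large, so check disk space before running it.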

## Intended Use

Use this checkpoint for research and development of GUI agents, browser
automation agents, and Mantis-compatible computer-use systems.

This model may be useful when you need:

- a Holo3-derived checkpoint adapted to Mantis rollouts;
- an open adapter for further fine-tuning;
- a ready GGUF artifact for serving experiments;
- a transparent artifact from a champion/challenger training loop.

## Limitations

- This model can make incorrect UI decisions and should not be allowed to take
  high-impact actions without supervision.
- The released checkpoint is specialized for Mantis-style workflows; behavior
  outside that domain may not improve over the base model.
- The internal gate found no reliable improvement over the base model on the
  frozen holdout available at release time.
- Computer-use agents can interact with external systems. Use sandboxing,
  allowlists, rate limits, and human approval for sensitive workflows.
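
The safeguards in the last bullet can be sketched as a small check that runs before each action the agent proposes. This is a minimal illustration under stated assumptions, not part of the Mantis stack: the `ActionGuard` class, the action kind names, and the default limits are all hypothetical. It covers a host allowlist, a sliding-window rate limit, and a human-approval flag; sandboxing has to be handled separately at the browser or OS level.

```python
import time
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Set
from urllib.parse import urlparse


@dataclass
class ActionGuard:
    """Hypothetical pre-execution check for agent actions."""

    allowed_hosts: Set[str]
    max_actions_per_minute: int = 30
    # Action kinds that always require a human decision (illustrative names).
    sensitive_kinds: FrozenSet[str] = frozenset({"submit_payment", "delete", "send_email"})
    _timestamps: List[float] = field(default_factory=list)

    def check(self, kind: str, url: str, now: Optional[float] = None) -> str:
        """Return "allow", "deny", or "needs_approval" for a proposed action."""
        now = time.monotonic() if now is None else now

        # Allowlist: only act on hosts that were explicitly approved.
        host = urlparse(url).hostname or ""
        if host not in self.allowed_hosts:
            return "deny"

        # Rate limit: count allowed actions in the last 60 seconds.
        self._timestamps = [t for t in self._timestamps if now - t < 60.0]
        if len(self._timestamps) >= self.max_actions_per_minute:
            return "deny"

        # Human approval: sensitive actions are never executed automatically.
        if kind in self.sensitive_kinds:
            return "needs_approval"

        self._timestamps.append(now)
        return "allow"
```

A caller would execute the action only on `"allow"`, and route `"needs_approval"` to a human reviewer instead of executing it.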

## Base Model And License

This model is fine-tuned from `Hcompany/Holo3-35B-A3B`, whose model card declares
the Apache-2.0 license. This release is also published under Apache-2.0 and
retains upstream attribution.

## Training

- Base model: `Hcompany/Holo3-35B-A3B`
- Method: supervised fine-tuning with TRL
- Checkpoint id: `sft-c3e0d799f432-f00fa0`
- Data source: graded Mantis rollouts pulled from Augur
- Training stack: PEFT LoRA + TRL SFT
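
The checkpoint id is described as tying the release to a training data/config hash. The actual scheme used by the Mantis trainer registry is not documented here, so the sketch below is purely hypothetical: it assumes the two hex segments are truncated SHA-256 digests of a data manifest and a training config, and shows how such an id could be built and parsed. The names `short_hash`, `make_checkpoint_id`, and `parse_checkpoint_id` are invented for illustration.

```python
import hashlib
import json
import re

# Matches ids shaped like "sft-c3e0d799f432-f00fa0": 12 hex chars, then 6.
ID_PATTERN = re.compile(r"^sft-([0-9a-f]{12})-([0-9a-f]{6})$")


def short_hash(obj, length):
    """Truncated SHA-256 of a canonical JSON encoding (assumed scheme)."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:length]


def make_checkpoint_id(data_manifest, config):
    """Build an id from a data manifest and a config (hypothetical layout)."""
    return f"sft-{short_hash(data_manifest, 12)}-{short_hash(config, 6)}"


def parse_checkpoint_id(checkpoint_id):
    """Split an id into its two hex segments, or raise ValueError."""
    match = ID_PATTERN.match(checkpoint_id)
    if match is None:
        raise ValueError(f"not a recognized checkpoint id: {checkpoint_id!r}")
    return match.group(1), match.group(2)
```

Whatever the real scheme is, the useful property is the same: identical data and config always yield the same id, so a published id can be checked against a recomputed one.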

## Citation

If you use the base model, cite Holo3:

```bibtex
@misc{hai2025holo3modelfamily,
  title={Holo3 - Open Foundation Models for Navigation and Computer Use Agents},
  author={H Company},
  year={2026},
  url={https://huggingface.co/Hcompany/Holo3-35B-A3B}
}
```

If you use the trainer stack, cite TRL:

```bibtex
@software{vonwerra2020trl,
  title={{TRL: Transformers Reinforcement Learning}},
  author={von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license={Apache-2.0},
  url={https://github.com/huggingface/trl},
  year={2020}
}
```