Mantis

Mantis is an open-weights computer-use model checkpoint fine-tuned from Hcompany/Holo3-35B-A3B. It was trained by supervised fine-tuning on graded Mantis agent rollouts from Augur, which are real browser and workflow traces rather than generic chat data.

This release includes both the PEFT LoRA adapter and a ready-to-serve merged.Q8_0.gguf artifact for llama.cpp-style serving.

Why This Is Better For Mantis

Mantis is not a general chat fine-tune. It is specialized for the slow improvement loop of a computer-use agent:

Agent-native data: trained from Mantis rollouts with task context, model I/O, rewards, and action traces.
Computer-use alignment: targets GUI/navigation behavior on realistic browser tasks instead of instruction-following only.
Deployment-ready release: ships the small adapter plus a merged Q8_0 GGUF, so downstream serving stacks can either compose with the base model or run the merged artifact directly.
Auditable provenance: checkpoint id sft-c3e0d799f432-f00fa0 ties this release to the training data/config hash used by the Mantis trainer registry.

The internal frozen-holdout gate did not show that this checkpoint reliably outperforms the base model, so this page does not claim a benchmark win over Holo3. The value of this release is open access to the specialized Mantis adaptation, its reproducible training pipeline, and its serving artifact.

Files

adapter_model.safetensors: PEFT LoRA adapter weights.
adapter_config.json: PEFT adapter configuration.
adapter.gguf: converted adapter artifact.
merged.Q8_0.gguf: merged full-model Q8_0 GGUF for direct serving.
Tokenizer and processor files: copied from the training artifact.
training_args.bin: trainer metadata from the SFT run.
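
The two serving paths this release supports (run the merged GGUF directly, or compose the adapter with a base model) might be launched roughly as in the sketch below. It only builds the `llama-server` command line; it does not start a server. The local directory `./mantis`, the base-model GGUF path, and the host and port values are illustrative assumptions, not part of this release. A vision-capable deployment may also need a multimodal projector file, which is not listed among the files above and is not covered here.

```python
from pathlib import Path
from typing import List, Optional


def build_server_command(
    model_dir: str,
    base_gguf: Optional[str] = None,
    host: str = "127.0.0.1",
    port: int = 8080,
) -> List[str]:
    """Build an argv list for llama.cpp's llama-server.

    model_dir: local folder holding the files from this release (assumed path).
    base_gguf: optional path to a GGUF of the base model (hypothetical; you
               must supply it yourself). If given, adapter.gguf is applied
               on top of it with --lora instead of using the merged file.
    """
    root = Path(model_dir)
    if base_gguf is None:
        # Path 1: serve the merged Q8_0 artifact directly.
        cmd = ["llama-server", "-m", str(root / "merged.Q8_0.gguf")]
    else:
        # Path 2: serve the base model and apply the converted LoRA adapter.
        cmd = ["llama-server", "-m", base_gguf,
               "--lora", str(root / "adapter.gguf")]
    return cmd + ["--host", host, "--port", str(port)]


if __name__ == "__main__":
    # Print the command for the merged artifact; run it with subprocess.run
    # once llama-server is installed and the files are downloaded.
    print(" ".join(build_server_command("./mantis")))
```

Binding to `127.0.0.1` by default keeps the server off the network while experimenting, which suits the supervision guidance later in this card.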

Intended Use

Use this checkpoint for research and development of GUI agents, browser automation agents, and Mantis-compatible computer-use systems.

This model may be useful when you need:

a Holo3-derived checkpoint adapted to Mantis rollouts;
an open adapter for further fine-tuning;
a ready GGUF artifact for serving experiments;
a transparent artifact from a champion/challenger training loop.

Limitations

This model can make incorrect UI decisions and should not be allowed to take high-impact actions without supervision.
The released checkpoint is specialized for Mantis-style workflows; behavior outside that domain may not improve over the base model.
The internal gate found no reliable improvement over the base model on the frozen holdout available at release time.
Computer-use agents can interact with external systems. Use sandboxing, allowlists, rate limits, and human approval for sensitive workflows.
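
The safeguards listed above (allowlists, rate limits, human approval) could be combined into a single check that runs before each agent action. The sketch below is one minimal way to do that, not the Mantis implementation: the `ActionGate` class, its field names, the action `kind` strings, and the `approve` callback are all hypothetical. Sandboxing is an environment concern and is not modeled here.

```python
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Set
from urllib.parse import urlparse


@dataclass
class ActionGate:
    """Hypothetical pre-action gate for a computer-use agent."""
    allowed_hosts: Set[str]               # hosts the agent may touch
    sensitive_kinds: Set[str]             # action kinds needing a human
    max_actions: int                      # allowed actions per window
    window_seconds: float                 # sliding-window length
    approve: Callable[[str, str], bool]   # human-approval hook (assumed)
    clock: Callable[[], float] = time.monotonic
    _recent: Deque[float] = field(default_factory=deque, init=False)

    def check(self, kind: str, url: str) -> str:
        """Return "allow" or a "deny: ..." reason for one proposed action."""
        host = urlparse(url).hostname or ""
        if host not in self.allowed_hosts:
            return "deny: host not allowlisted"
        now = self.clock()
        # Drop timestamps that have aged out of the sliding window.
        while self._recent and now - self._recent[0] >= self.window_seconds:
            self._recent.popleft()
        if len(self._recent) >= self.max_actions:
            return "deny: rate limit"
        if kind in self.sensitive_kinds and not self.approve(kind, url):
            return "deny: not approved"
        # Only allowed actions count against the rate limit.
        self._recent.append(now)
        return "allow"
```

The checks run cheapest first, so a human is only asked to approve an action that has already passed the allowlist and the rate limit.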

Base Model And License

This model is fine-tuned from Hcompany/Holo3-35B-A3B, whose model card declares the Apache-2.0 license. This release is also published under Apache-2.0 and retains upstream attribution.

Training

Base model: Hcompany/Holo3-35B-A3B
Method: supervised fine-tuning with TRL
Checkpoint id: sft-c3e0d799f432-f00fa0
Data source: graded Mantis rollouts pulled from Augur
Training stack: PEFT LoRA + TRL SFT

Citation

If you use the base model, cite Holo3:

@misc{hai2025holo3modelfamily,
  title={Holo3 - Open Foundation Models for Navigation and Computer Use Agents},
  author={H Company},
  year={2026},
  url={https://huggingface.co/Hcompany/Holo3-35B-A3B}
}

If you use the trainer stack, cite TRL:

@software{vonwerra2020trl,
  title={{TRL: Transformers Reinforcement Learning}},
  author={von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license={Apache-2.0},
  url={https://github.com/huggingface/trl},
  year={2020}
}

Model Tree

Base model: Qwen/Qwen3.5-35B-A3B-Base
Fine-tuned: Qwen/Qwen3.5-35B-A3B
Fine-tuned: Hcompany/Holo3-35B-A3B
Adapter: cabal-ai/mantis (this model)