--- license: apache-2.0 base_model: - Hcompany/Holo3-35B-A3B library_name: peft pipeline_tag: image-text-to-text tags: - computer-use - gui-agent - multimodal - action-model - lora - gguf - sft - trl - mantis model_name: Mantis --- # Mantis Mantis is an open-weights computer-use model checkpoint fine-tuned from [Hcompany/Holo3-35B-A3B](https://huggingface.co/Hcompany/Holo3-35B-A3B). It is trained on graded Mantis agent rollouts from Augur, using supervised fine-tuning over real browser/workflow traces rather than generic chat data. This release includes both the PEFT LoRA adapter and a ready-to-serve `merged.Q8_0.gguf` artifact for llama.cpp-style serving. ## Why This Is Better For Mantis Mantis is not a general chat fine-tune. It is specialized for the slow improvement loop of a computer-use agent: - **Agent-native data**: trained from Mantis rollouts with task context, model I/O, rewards, and action traces. - **Computer-use alignment**: targets GUI/navigation behavior on realistic browser tasks instead of instruction-following only. - **Deployment-ready release**: ships the small adapter plus a merged Q8_0 GGUF, so downstream serving stacks can either compose with the base model or run the merged artifact directly. - **Auditable provenance**: checkpoint id `sft-c3e0d799f432-f00fa0` ties this release to the training data/config hash used by the Mantis trainer registry. The current internal frozen holdout gate did not establish a reliable promotion over the base model, so this page does **not** claim a benchmark win over Holo3. The value of this release is open access to the specialized Mantis adaptation, its reproducible training pipeline, and its serving artifact. ## Files - `adapter_model.safetensors`: PEFT LoRA adapter weights. - `adapter_config.json`: PEFT adapter configuration. - `adapter.gguf`: converted adapter artifact. - `merged.Q8_0.gguf`: merged full-model Q8_0 GGUF for direct serving. - tokenizer/processor files copied from the training artifact. - `training_args.bin`: trainer metadata from the SFT run. ## Intended Use Use this checkpoint for research and development of GUI agents, browser automation agents, and Mantis-compatible computer-use systems. This model may be useful when you need: - a Holo3-derived checkpoint adapted to Mantis rollouts; - an open adapter for further fine-tuning; - a ready GGUF artifact for serving experiments; - a transparent artifact from a champion/challenger training loop. ## Limitations - This model can make incorrect UI decisions and should not be allowed to take high-impact actions without supervision. - The released checkpoint is specialized for Mantis-style workflows; behavior outside that domain may not improve over the base model. - The internal gate found no reliable promotion over the base model on the frozen holdout available at release time. - Computer-use agents can interact with external systems. Use sandboxing, allowlists, rate limits, and human approval for sensitive workflows. ## Base Model And License This model is fine-tuned from `Hcompany/Holo3-35B-A3B`, whose model card declares the Apache-2.0 license. This release is also published under Apache-2.0 and retains upstream attribution. ## Training - Base model: `Hcompany/Holo3-35B-A3B` - Method: supervised fine-tuning with TRL - Checkpoint id: `sft-c3e0d799f432-f00fa0` - Data source: graded Mantis rollouts pulled from Augur - Training stack: PEFT LoRA + TRL SFT ## Citation If you use the base model, cite Holo3: ```bibtex @misc{hai2025holo3modelfamily, title={Holo3 - Open Foundation Models for Navigation and Computer Use Agents}, author={H Company}, year={2026}, url={https://huggingface.co/Hcompany/Holo3-35B-A3B} } ``` If you use the trainer stack, cite TRL: ```bibtex @software{vonwerra2020trl, title={{TRL: Transformers Reinforcement Learning}}, author={von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin}, license={Apache-2.0}, url={https://github.com/huggingface/trl}, year={2020} } ```