Holo3-35B-A3B-FP8
Holo3-35B-A3B-FP8 is an FP8-compressed evolution of Holo3-35B-A3B. The checkpoint stores most linear-layer weights in FP8 (E4M3) alongside full-precision (F32) tensors, significantly reducing memory footprint and improving inference efficiency while maintaining strong output quality.
Holo3-35B-A3B from H Company is a state-of-the-art sparse Mixture-of-Experts (MoE) vision-language model with 35B total parameters, of which only 3B are active per inference step. It is fine-tuned from Qwen3.5-35B-A3B for GUI navigation and computer-use agents operating across web, desktop, and mobile environments. It achieves a state-of-the-art 77.8% on the OSWorld-Verified benchmark, surpassing proprietary models at dramatically lower cost and latency, and it excels at interpreting visual interfaces, reasoning over complex content, and executing precise actions such as form filling, spreadsheet editing, and browser control for enterprise automation. The model is licensed under Apache 2.0, with open weights available on Hugging Face.
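As a rough back-of-envelope illustration of the memory saving (an estimate only: it assumes the source checkpoint stores weights in 16-bit precision, and it ignores the layers kept at higher precision as well as the FP8 quantization scales):

# Hypothetical estimate of weight memory before and after FP8 compression.
total_params = 35e9                     # 35B total parameters
bf16_weight_bytes = total_params * 2    # ~70 GB at 2 bytes per weight
fp8_weight_bytes = total_params * 1     # ~35 GB at 1 byte per weight
print(f"{bf16_weight_bytes / 1e9:.0f} GB -> {fp8_weight_bytes / 1e9:.0f} GB")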
Recipe
default_stage:
  default_modifiers:
    QuantizationModifier:
      targets: [Linear]
      ignore: ['re:.*lm_head', 're:visual.*', 're:model.visual.*', 're:.*mlp.gate$',
        're:.*embed_tokens$', 're:.*shared_expert_gate$', 're:.*linear_attn.*']
      scheme: FP8_DYNAMIC
      bypass_divisibility_checks: false
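For context, a recipe of this shape is normally applied with the llm-compressor library's oneshot entry point. The sketch below is illustrative rather than the exact command used to produce this checkpoint: the model ID, auto model class, and output directory are assumptions, and the FP8_DYNAMIC scheme needs no calibration data.

# Minimal sketch of applying the FP8 recipe above with llm-compressor.
# Assumptions: model ID, model class, and save directory are illustrative;
# older llm-compressor releases expose oneshot under llmcompressor.transformers.
from transformers import AutoModelForImageTextToText, AutoProcessor
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "Hcompany/Holo3-35B-A3B"  # assumed source checkpoint

model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Same intent as the YAML recipe: dynamic FP8 quantization of Linear layers,
# skipping the LM head, vision tower, MoE gates, embeddings, and linear attention.
recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_DYNAMIC",
    ignore=[
        "re:.*lm_head", "re:visual.*", "re:model.visual.*", "re:.*mlp.gate$",
        "re:.*embed_tokens$", "re:.*shared_expert_gate$", "re:.*linear_attn.*",
    ],
)

oneshot(model=model, recipe=recipe)

SAVE_DIR = "Holo3-35B-A3B-FP8"  # assumed output path
model.save_pretrained(SAVE_DIR, save_compressed=True)
processor.save_pretrained(SAVE_DIR)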
Holo3: Foundational Models for Navigation and Computer Use Agents
Model Description
Holo3 is our latest generation of large-scale Vision-Language Models (VLMs) specifically optimized for GUI Agents. Like its predecessors, it operates across diverse digital environments—web, desktop, and mobile—by interpreting visual interfaces, reasoning over complex content, and executing precise actions.
Holo3 achieves state-of-the-art performance on OSWorld-Verified, setting a new benchmark for computer use agents. While it retains the world-class web navigation capabilities of Holo2, the new Holo3-35B-A3B architecture is designed to thrive in realistic business environments.
- Developed by: H Company
- Model type: Vision-Language Model for Navigation and Computer Use Agents
- Architecture: Sparse Mixture-of-Experts (MoE) with 35B total / 3B active parameters
- Fine-tuned from model: Qwen/Qwen3.5-35B-A3B
- Blog Post: hcompany.ai/holo3
- Quickstart: hub.hcompany.ai/quickstart
- License: Apache 2.0

Get Started
Explore our Quickstart guide to learn how to integrate with our inference API.
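Outside the hosted inference API, FP8 checkpoints in the compressed-tensors format are typically loadable with vLLM. The snippet below is a hedged local-inference sketch, not part of the official Quickstart: the prompt, sampling settings, and the trust_remote_code flag are assumptions, and real GUI-agent use would pass screenshots through the model's chat template.

# Hedged sketch: local inference on the FP8 checkpoint with vLLM, which can
# load compressed-tensors FP8 weights directly. Prompt and settings are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(model="prithivMLmods/Holo3-35B-A3B-FP8", trust_remote_code=True)

params = SamplingParams(temperature=0.0, max_tokens=256)
outputs = llm.generate(
    ["Describe the next UI action needed to open the browser's settings menu."],
    params,
)
print(outputs[0].outputs[0].text)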
Training Strategy
Holo3-35B-A3B is based on the Qwen3.5 architecture and has been reinforced to strengthen its core agentic pillars: perception and decision-making. The training pipeline utilizes a carefully curated mix of open-source datasets, large-scale synthetic trajectories, and high-quality human-annotated samples to ensure reliable multi-step reasoning.
Results
State-of-the-Art Navigation (OSWorld-Verified)
To benchmark Holo3 on computer use and web navigation, we utilized the OSWorld and WebArena benchmarks. Holo3-35B-A3B achieves a 77.8% score on OSWorld-Verified. Remarkably, it achieves this with only 3B active parameters, providing SOTA performance at a fraction of the inference cost of leading proprietary models.
Enterprise Readiness (H Corporate Benchmark)
To measure real-world utility, we developed the H Corporate Benchmark: a dedicated evaluation suite of 486 multi-step tasks across four categories: E-commerce, Business Software, Collaboration, and Multi-App workflows. Holo3 consistently outperforms significantly larger competitors in these dense, business-logic environments.
UI Localization & Grounding
A world-class agent must see before it can act. Holo3 excels at localizing interaction elements and understanding their functions, as evidenced by top-tier performance on ScreenSpot-Pro and OSWorld-G.
Table 1: Evaluation results on computer use and grounding benchmarks.

Citation
@misc{hai2025holo3modelfamily,
  title={Holo3 - Open Foundation Models for Navigation and Computer Use Agents},
  author={H Company},
  year={2026},
  url={https://huggingface.co/Hcompany/Holo3-35B-A3B},
}