MolmoWeb-4B-Native
Note that this is the molmo-native checkpoint; it is NOT compatible with Hugging Face Transformers. For the HF-compatible checkpoint, see allenai/MolmoWeb-4B.
MolmoWeb is a family of fully open multimodal web agents. MolmoWeb agents achieve state-of-the-art results, outperforming open-weight-only models of similar scale such as Fara-7B, UI-Tars-1.5-7B, and Holo1-7B. MolmoWeb-8B also surpasses set-of-marks (SoM) agents built on much larger closed frontier models like GPT-4o. We further demonstrate consistent gains through test-time scaling via parallel rollouts with best-of-N selection, achieving 94.7% and 60.5% pass@4 (compared to 78.2% and 35.3% pass@1) on WebVoyager and Online-Mind2Web, respectively.
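To make the test-time scaling terms concrete, here is a minimal Python sketch of best-of-N selection and a pass@k tally. It is an illustration under stated assumptions, not the MolmoWeb evaluation code: the names `best_of_n`, `pass_at_k`, and the `score` callable are hypothetical, and pass@k is assumed to mean "a task counts as solved if at least one of its first k rollouts succeeds".

```python
from typing import Callable, Sequence


def best_of_n(rollouts: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the highest-scoring rollout (the first one wins ties).

    `score` is a hypothetical judge; the source does not say how
    candidates are ranked.
    """
    if not rollouts:
        raise ValueError("need at least one rollout")
    return max(rollouts, key=score)


def pass_at_k(successes_per_task: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of tasks where any of the first k rollouts succeeded.

    Assumed definition of pass@k; the tech report may use a different one.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not successes_per_task:
        raise ValueError("need at least one task")
    solved = sum(any(task[:k]) for task in successes_per_task)
    return solved / len(successes_per_task)


# Toy data: two tasks, two rollouts each (True means the rollout succeeded).
outcomes = [[False, True], [False, False]]
rate_at_1 = pass_at_k(outcomes, k=1)  # no task solved on the first rollout
rate_at_2 = pass_at_k(outcomes, k=2)  # one of two tasks solved within two rollouts

# Toy selection: the "judge" here simply prefers longer strings.
chosen = best_of_n(["a", "bbb", "cc"], score=len)
```

Under this reading, pass@k is an upper bound on what any selection rule can reach with k rollouts, since a selector can only pick a successful rollout if one exists.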
Learn more about the MolmoWeb family in our announcement blog post and tech report.
MolmoWeb-4B-Native is based on the Molmo2 architecture, which uses Qwen3-8B as the language backbone and SigLIP 2 as the vision backbone.
Ai2 is committed to open science. The MolmoWeb datasets are available here. All other artifacts used in creating MolmoWeb (training code, evaluations, intermediate checkpoints) will be made available, furthering our commitment to open-source AI development and reproducibility.
Quick links:
- 💬 Demo
- 📂 All Models
- 📚 All Data
- 📃 Paper
- 🎥 Blog with Videos
Usage
Please refer to our GitHub repo for inference code.
License and Use
This model is licensed under Apache 2.0. It is intended for research and educational use in accordance with Ai2’s Responsible Use Guidelines.
Citation
If you use this model, please cite:
@misc{gupta2026molmowebopenvisualweb,
title={MolmoWeb: Open Visual Web Agent and Open Data for the Open Web},
author={Tanmay Gupta and Piper Wolters and Zixian Ma and Peter Sushko and Rock Yuren Pang and Diego Llanes and Yue Yang and Taira Anderson and Boyuan Zheng and Zhongzheng Ren and Harsh Trivedi and Taylor Blanton and Caleb Ouellette and Winson Han and Ali Farhadi and Ranjay Krishna},
year={2026},
eprint={2604.08516},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2604.08516},
}