AMoE-Dense-L

Accepted at CVPR 2026


AMoE-Dense-L is the large dense variant of the AMoE (Agglomerative Mixture-of-Experts) vision foundation model family, with 0.6B parameters.

Usage

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoImageProcessor

model_id = "tiiuae/amoe-dense-L"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True).to("cuda", dtype=torch.bfloat16)
processor = AutoImageProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("image.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt").to("cuda")
inputs["pixel_values"] = inputs["pixel_values"].to(torch.bfloat16)

with torch.no_grad():
    outputs = model(**inputs)

# Feature heads: 'amoe' (1280-d), 'siglip2' (1152-d), 'dinov3' (1024-d)
patch_features = outputs["patch_features"]["amoe"]         # (Batch, Tokens, 1280)
summary_features = outputs["summary_features"]["siglip2"]  # (Batch, 1152)
```
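A common downstream use of the pooled summary features is image-to-image similarity. The sketch below is a minimal, hedged example: `feats_a` and `feats_b` stand in for `outputs["summary_features"]["siglip2"]` tensors (shape `(Batch, 1152)`) produced by the snippet above, but are random here so the example runs without the model.

```python
import torch
import torch.nn.functional as F

# Stand-ins for summary features from two images; in practice these would
# come from outputs["summary_features"]["siglip2"] (shape (Batch, 1152)).
feats_a = torch.randn(1, 1152)
feats_b = torch.randn(1, 1152)

# Cosine similarity over the feature dimension; result has shape (Batch,)
# with values in [-1, 1].
sim = F.cosine_similarity(feats_a, feats_b, dim=-1)
print(sim.shape)
```

The same pattern applies to the `'amoe'` or `'dinov3'` heads; only the feature dimension changes.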

Model Details

| Property    | Value            |
|-------------|------------------|
| Architecture| Dense            |
| Parameters  | 0.6B             |
| Layers      | 18               |
| Hidden Dim  | 1280             |
| FFN Dim     | 5120             |
| Patch Size  | 16×16            |
| Teachers    | DINOv3, SigLIP2  |
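As a rough sanity check, the table's dimensions can be turned into a back-of-envelope parameter count. This assumes a gated (SwiGLU-style) FFN with three projection matrices — an assumption, since the card does not specify the FFN type — and ignores embeddings, biases, normalization, and teacher-specific heads, which account for the gap to the 0.6B headline figure.

```python
# Back-of-envelope transformer parameter count from the table above.
# Assumptions: gated FFN (three projections); embeddings, biases, norms,
# and output heads are ignored.
d, ffn, layers = 1280, 5120, 18
attn = 4 * d * d        # q, k, v, and output projections
mlp = 3 * d * ffn       # gate, up, and down projections
total = layers * (attn + mlp)
print(f"~{total / 1e9:.2f}B core transformer parameters")
```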

Citation

```bibtex
@article{chaybouti2025amoe,
  title={AMOE: Agglomerative Mixture-of-Experts Vision Foundation Models},
  author={Chaybouti, Sofian and Narayan, Sanath and Dahou, Yasser and Le Khac, Phuc H. and Singh, Ankit and Huynh, Ngoc Dung and Para, Wamiq Reyaz and Kuehne, Hilde and Hacid, Hakim},
  journal={arXiv preprint arXiv:2512.20157},
  year={2025}
}
```