AMoE-Dense-L

Accepted at CVPR 2026


AMoE-Dense-L is the large dense variant of the AMoE (Agglomerative Mixture-of-Experts) vision foundation model family, with 0.6B parameters.

Usage

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoImageProcessor

model_id = "tiiuae/amoe-dense-L"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True).to("cuda", dtype=torch.bfloat16)
processor = AutoImageProcessor.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("image.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt").to("cuda")
inputs["pixel_values"] = inputs["pixel_values"].to(torch.bfloat16)

with torch.no_grad():
    outputs = model(**inputs)

# Feature heads: 'amoe' (1280-d), 'siglip2' (1152-d), 'dinov3' (1024-d)
patch_features = outputs["patch_features"]["amoe"]         # (Batch, Tokens, 1280)
summary_features = outputs["summary_features"]["siglip2"]  # (Batch, 1152)
```
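A common downstream use of the pooled summary features is image-to-image similarity. The sketch below is a minimal, hedged example: `feats_a` and `feats_b` stand in for `outputs["summary_features"]["siglip2"]` tensors (shape `(Batch, 1152)`) produced by the snippet above, but are random here so the example runs without the model.

```python
import torch
import torch.nn.functional as F

# Stand-ins for summary features from two images; in practice these would
# come from outputs["summary_features"]["siglip2"] (shape (Batch, 1152)).
feats_a = torch.randn(1, 1152)
feats_b = torch.randn(1, 1152)

# Cosine similarity over the feature dimension; result has shape (Batch,)
# with values in [-1, 1].
sim = F.cosine_similarity(feats_a, feats_b, dim=-1)
print(sim.shape)
```

The same pattern applies to the `'amoe'` or `'dinov3'` heads; only the feature dimension changes.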

Model Details

| Property    | Value            |
|-------------|------------------|
| Architecture| Dense            |
| Parameters  | 0.6B             |
| Layers      | 18               |
| Hidden Dim  | 1280             |
| FFN Dim     | 5120             |
| Patch Size  | 16×16            |
| Teachers    | DINOv3, SigLIP2  |
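As a rough sanity check, the table's dimensions can be turned into a back-of-envelope parameter count. This assumes a gated (SwiGLU-style) FFN with three projection matrices — an assumption, since the card does not specify the FFN type — and ignores embeddings, biases, normalization, and teacher-specific heads, which account for the gap to the 0.6B headline figure.

```python
# Back-of-envelope transformer parameter count from the table above.
# Assumptions: gated FFN (three projections); embeddings, biases, norms,
# and output heads are ignored.
d, ffn, layers = 1280, 5120, 18
attn = 4 * d * d        # q, k, v, and output projections
mlp = 3 * d * ffn       # gate, up, and down projections
total = layers * (attn + mlp)
print(f"~{total / 1e9:.2f}B core transformer parameters")
```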

Citation

```bibtex
@article{chaybouti2025amoe,
  title={AMOE: Agglomerative Mixture-of-Experts Vision Foundation Models},
  author={Chaybouti, Sofian and Narayan, Sanath and Dahou, Yasser and Le Khac, Phuc H. and Singh, Ankit and Huynh, Ngoc Dung and Para, Wamiq Reyaz and Kuehne, Hilde and Hacid, Hakim},
  journal={arXiv preprint arXiv:2512.20157},
  year={2025}
}
```