This is an ONNX export and quantisation of google/siglip2-so400m-patch14-384, not an original model.
All credit for the original model goes to its authors. This repo exists solely to host pre-exported ONNX variants for use by Scene Atlas, a tabletop RPG scene management tool.
| Encoder | Quantisation | Size |
|---|---|---|
| Vision | fp16 | 817.3 MB |
| Text | fp16 | 1350.5 MB |
| Total | | 2167.8 MB |
| Parameter | Value |
|---|---|
| model_family | siglip2 |
| embedding_dim | 1152 |
| image_size | 384 |
| image_mean | 0.5000, 0.5000, 0.5000 |
| image_std | 0.5000, 0.5000, 0.5000 |
| interpolation | bilinear |
| resize_mode | direct_resize |
| tokenizer_type | sentencepiece |
| tokenizer_max_length | 64 |
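
The image parameters above can be read as a preprocessing recipe. The following is a minimal sketch, not the Scene Atlas implementation: the function names are hypothetical, and the 1/255 rescaling, RGB channel order, and NCHW float32 output layout are assumptions that should be checked against the exported model's actual inputs.

```python
import numpy as np

IMAGE_SIZE = 384                       # image_size
IMAGE_MEAN = np.array([0.5, 0.5, 0.5], dtype=np.float32)  # image_mean
IMAGE_STD = np.array([0.5, 0.5, 0.5], dtype=np.float32)   # image_std


def load_resized(path):
    """Load an image and resize it straight to 384x384 (direct_resize, bilinear).

    Pillow is imported lazily so the numeric part below can be used without it.
    """
    from PIL import Image

    image = Image.open(path).convert("RGB")
    # direct_resize: no aspect-preserving crop, just squash to the target size.
    image = image.resize((IMAGE_SIZE, IMAGE_SIZE), resample=Image.BILINEAR)
    return np.asarray(image, dtype=np.uint8)


def normalize(pixels):
    """Turn an HxWx3 uint8 array into a 1x3xHxW float32 batch.

    Assumption: pixels are rescaled to [0, 1] before mean/std normalisation,
    and the encoder expects channels-first input.
    """
    scaled = pixels.astype(np.float32) / 255.0
    normalised = (scaled - IMAGE_MEAN) / IMAGE_STD
    return normalised.transpose(2, 0, 1)[np.newaxis, ...]


# Example with synthetic data: an all-white image maps to 1.0 everywhere.
white = np.full((IMAGE_SIZE, IMAGE_SIZE, 3), 255, dtype=np.uint8)
batch = normalize(white)
```

With a mean and standard deviation of 0.5, this maps pixel values from [0, 255] to [-1, 1].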
These models are intended for use with Scene Atlas. The repo contains
clip_vision_encoder.onnx, clip_text_encoder.onnx, manifest.json,
and a tokenizer/ directory — all at the repo root.
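```python
from huggingface_hub import hf_hub_download

# Download the vision encoder; the call returns the local path of the cached file.
# Repeat with filename="clip_text_encoder.onnx" for the text encoder.
vision_encoder_path = hf_hub_download(
    repo_id="jennis0/scene-atlas-large",
    filename="clip_vision_encoder.onnx",
)
```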
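
To fetch everything listed above in one call and check what an encoder expects, something like the following may help. It is a sketch under stated assumptions: the helper names (`download_patterns`, `fetch_repo_files`, `describe_inputs`) are hypothetical, and it assumes `onnxruntime` is installed. The encoders' input names are not documented here, so the code inspects them rather than guessing.

```python
from pathlib import Path

REPO_ID = "jennis0/scene-atlas-large"
ROOT_FILES = [
    "clip_vision_encoder.onnx",
    "clip_text_encoder.onnx",
    "manifest.json",
]


def download_patterns():
    """Glob patterns covering the root files plus the tokenizer/ directory."""
    return ROOT_FILES + ["tokenizer/*"]


def fetch_repo_files(repo_id=REPO_ID):
    """Download the encoders, manifest, and tokenizer; return the local folder."""
    # Imported lazily so this module loads without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    local_dir = snapshot_download(repo_id=repo_id, allow_patterns=download_patterns())
    return Path(local_dir)


def describe_inputs(model_path):
    """List (name, shape, type) for each input of an ONNX model."""
    # Imported lazily; assumes the onnxruntime package is available.
    import onnxruntime as ort

    session = ort.InferenceSession(
        str(model_path), providers=["CPUExecutionProvider"]
    )
    return [(arg.name, arg.shape, arg.type) for arg in session.get_inputs()]
```

A typical use would be `describe_inputs(fetch_repo_files() / "clip_vision_encoder.onnx")`. Note that this downloads roughly 2 GB on first run.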
Please refer to the original model card for licensing information.