# Scene Atlas Models

ONNX-exported and quantised vision-language models for Scene Atlas. These are derivative works; see individual model cards for attribution.
This is an ONNX export and quantisation of laion/CLIP-ViT-B-32-laion2B-s34B-b79K, not an original model.
All credit for the original model goes to its authors. This repo exists solely to host pre-exported ONNX variants for use by Scene Atlas, a tabletop RPG scene management tool.
| Encoder | Quantisation | Size |
|---|---|---|
| Vision | fp16 | 167.8 MB |
| Text | int8 | 89.1 MB |
| **Total** | | 256.8 MB |
| Parameter | Value |
|---|---|
| model_family | clip |
| embedding_dim | 512 |
| image_size | 224 |
| image_mean | 0.4815, 0.4578, 0.4082 |
| image_std | 0.2686, 0.2613, 0.2758 |
| interpolation | bicubic |
| resize_mode | shortest_edge_then_crop |
| tokenizer_type | huggingface |
| tokenizer_max_length | 77 |
These models are intended for use with Scene Atlas. The repo contains
`clip_vision_encoder.onnx`, `clip_text_encoder.onnx`, `manifest.json`,
and a `tokenizer/` directory, all at the repo root.
```python
from huggingface_hub import hf_hub_download

# Download both encoder files from the repo root
vision_path = hf_hub_download(
    repo_id="jennis0/scene-atlas-small",
    filename="clip_vision_encoder.onnx",
)
text_path = hf_hub_download(
    repo_id="jennis0/scene-atlas-small",
    filename="clip_text_encoder.onnx",
)
```
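Both encoders project into the shared 512-dimensional embedding space, and CLIP-style retrieval scores are cosine similarities between image and text embeddings. A minimal sketch (the `cosine_scores` name is illustrative, not part of the repo):

```python
import numpy as np

def cosine_scores(image_emb: np.ndarray, text_embs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one image embedding (512,) and N text embeddings (N, 512)."""
    # L2-normalise both sides, then the dot product is the cosine similarity
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    return text_embs @ image_emb  # shape (N,), values in [-1, 1]
```

Ranking text prompts by these scores against a scene image is the standard way CLIP embeddings are compared.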
Please refer to the original model card for licensing information.
Base model: laion/CLIP-ViT-B-32-laion2B-s34B-b79K