---
library_name: transformers
tags:
- vision
- image-reconstruction
- siglip2
- safetensors
---
# F2P Decoder
Hugging Face `AutoModel` wrapper for the SigLIP2 feature-to-pixel decoder used in this repository.
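```python
import torch
from transformers import AutoModel

# trust_remote_code=True lets transformers load the custom model class from the repo.
model = AutoModel.from_pretrained(
    "toilaluan/f2p_decoder",
    trust_remote_code=True,
).eval()

# Dummy input shaped (batch, tokens, feature_dim).
features = torch.randn(1, 257, 1152)
with torch.no_grad():  # inference only, so skip gradient tracking
    reconstruction = model(features)
print(reconstruction.shape)  # torch.Size([1, 3, 224, 224])
```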
The model expects SigLIP2 patch features with a CLS token, for example from
`google/siglip2-so400m-patch14-224`, shaped like the `(1, 257, 1152)` tensor in
the example above. The output is an image tensor, `(1, 3, 224, 224)` for that
input, in the decoder's reconstructed pixel space.
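The token count of 257 in the example can be sanity-checked with a little arithmetic. The sketch below assumes a square 224x224 input split into 14x14 patches (as the encoder name suggests) plus one CLS token; `expected_token_count` is a hypothetical helper written for illustration, not part of this repository.

```python
def expected_token_count(image_size: int = 224, patch_size: int = 14,
                         has_cls: bool = True) -> int:
    """Tokens for a square image cut into square patches (hypothetical helper)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size  # 224 // 14 = 16
    # 16 * 16 = 256 patch tokens, plus one CLS token if present (assumption).
    return patches_per_side * patches_per_side + (1 if has_cls else 0)


print(expected_token_count())  # 257
```

If your features have a different token count, the input resolution or the presence of the CLS token is the first thing to check.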
Source `.pt` checkpoint: `nyu-visionx/siglip2_decoder/model.pt`.