---
library_name: transformers
tags:
- vision
- image-reconstruction
- siglip2
- safetensors
---

# F2P Decoder

Hugging Face `AutoModel` wrapper for the SigLIP2 feature-to-pixel decoder used in this repository.

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained(
    "toilaluan/f2p_decoder",
    trust_remote_code=True,
).eval()

features = torch.randn(1, 257, 1152)
reconstruction = model(features)
print(reconstruction.shape)  # (1, 3, 224, 224)
```

The model expects SigLIP2 patch features with a CLS token, for example from `google/siglip2-so400m-patch14-224`. The output is an image tensor in the decoder's reconstructed pixel space.

Source `.pt` checkpoint: `nyu-visionx/siglip2_decoder/model.pt`.
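The expected input shape follows from the checkpoint's patch geometry. A minimal sketch of that arithmetic, assuming the 257 tokens are the 16×16 patch grid of a 224-pixel image with 14-pixel patches plus one CLS-style token:

```python
# Token-count arithmetic for SigLIP2 so400m-patch14-224.
# Assumption: sequence length = (image_size / patch_size)^2 patches
# plus one CLS-style token, matching the (1, 257, 1152) input above.
image_size = 224
patch_size = 14
hidden_dim = 1152  # so400m hidden width

num_patches = (image_size // patch_size) ** 2  # 16 x 16 = 256
seq_len = num_patches + 1                      # + CLS token
print((seq_len, hidden_dim))  # (257, 1152)
```

Features with any other sequence length or hidden width will not match the decoder's expected input.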
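Because the output lives in the decoder's reconstructed pixel space rather than in `[0, 1]`, it typically needs rescaling before it can be viewed. A hedged sketch using generic min-max scaling (the decoder's actual training-time normalization is not specified on this card, so this is for visualization only):

```python
import torch

def to_uint8_image(recon: torch.Tensor) -> torch.Tensor:
    """Min-max scale a (1, 3, H, W) reconstruction to an (H, W, 3) uint8 image.

    Note: a generic visualization rescale, not the decoder's official
    inverse normalization, which this card does not specify.
    """
    img = recon.squeeze(0)                      # (3, H, W)
    img = img - img.min()
    img = img / img.max().clamp_min(1e-8)       # scale to [0, 1]
    img = (img * 255).round().to(torch.uint8)   # quantize
    return img.permute(1, 2, 0)                 # (H, W, 3), channels-last

# Stand-in tensor with the decoder's output shape:
reconstruction = torch.randn(1, 3, 224, 224)
print(to_uint8_image(reconstruction).shape)  # torch.Size([224, 224, 3])
```

The channels-last uint8 result can be passed directly to `PIL.Image.fromarray` after a `.numpy()` call.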