Transformers
Safetensors
vision-encoder-decoder