---
base_model:
- meta-llama/Llama-3.2-3B-Instruct
- meta-llama/Llama-3.1-8B-Instruct
- meta-llama/Llama-3.1-70B-Instruct
- openai/clip-vit-large-patch14
library_name: transformers
license: cc-by-nc-4.0
pipeline_tag: image-text-to-text
tags:
- pytorch
- llama-3
- zero-shot-vision-encoder-grafting
---
## Zero-Shot Vision Encoder Grafting via LLM Surrogates

This repository contains the vision encoder described in [Zero-Shot Vision Encoder Grafting via LLM Surrogates](https://huggingface.co/papers/2505.22664).
<p align="left" style="display: flex; gap: 8px; align-items: center;">
  <a href="https://arxiv.org/abs/2505.22664"><img src="https://img.shields.io/badge/arXiv%20-paper-b31b1b.svg?style=flat-square" alt="arXiv paper" /></a>
  <a href="https://github.com/facebookresearch/zero"><img src="https://img.shields.io/badge/GitHub%20-facebookresearch/zero-0081fb.svg?style=flat-square" alt="GitHub repository" /></a>
  <a href="https://github.com/facebookresearch/zero/blob/main/LICENSE"><img src="https://img.shields.io/badge/License-CC--BY--NC%204.0-black.svg?style=flat-square" alt="CC-BY-NC 4.0 license" /></a>
| </p> |