---
license: apache-2.0
---

## Libra Vision Tokenizer

[**Libra: Building Decoupled Vision System on Large Language Models**](https://arxiv.org/abs/2405.10140)

This repo provides the pretrained weights of the Libra vision tokenizer, trained with lookup-free quantization.

### !!! NOTE !!!

1. Please merge the weights into ``llama-2-7b-chat-hf-libra`` ([huggingface version of LLaMA2-7B-Chat](https://huggingface.co/docs/transformers/main/model_doc/llama2)).

2. Please download the pretrained CLIP model from Hugging Face and place it in the same directory. The CLIP model can be downloaded [here](https://huggingface.co/openai/clip-vit-large-patch14-336).

The files should be organized as:

```
llama-2-7b-chat-hf-libra/
│
│   # original llama files
│
├── ...
│
│   # newly added vision tokenizer
│
├── vision_tokenizer_config.yaml
├── vqgan.ckpt
│
│   # CLIP model
│
└── openai-clip-vit-large-patch14-336/
    └── ...
```
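
As a quick sanity check after merging, the layout above can be verified with a few lines of Python. This is a minimal sketch, not part of the Libra codebase: the helper name `missing_libra_entries` is hypothetical, and it only checks the three entries named in the tree (it does not inspect the original LLaMA files or the contents of the CLIP directory).

```python
from pathlib import Path

# Entry names copied from the directory tree in this README.
REQUIRED_FILES = ("vision_tokenizer_config.yaml", "vqgan.ckpt")
CLIP_DIR = "openai-clip-vit-large-patch14-336"


def missing_libra_entries(root):
    """Return the expected entries that are absent under `root`.

    Hypothetical helper for illustration; an empty list means the
    vision tokenizer files and the CLIP directory are both in place.
    """
    root = Path(root)
    # Files added by the vision tokenizer.
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # The CLIP model is expected as a subdirectory.
    if not (root / CLIP_DIR).is_dir():
        missing.append(CLIP_DIR + "/")
    return missing
```

Calling `missing_libra_entries("llama-2-7b-chat-hf-libra")` returns an empty list when everything is in place, and otherwise lists the entries that still need to be added.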