---
license: apache-2.0
---
## Libra Vision Tokenizer

[**Libra: Building Decoupled Vision System on Large Language Models**](https://arxiv.org/abs/2405.10140)

This repo provides the pretrained weights of the Libra vision tokenizer trained with lookup-free quantization.
### !!! NOTE !!!
1. Please merge the weights into ``llama-2-7b-chat-hf-libra`` (the [Hugging Face version of LLaMA2-7B-Chat](https://huggingface.co/docs/transformers/main/model_doc/llama2)).

2. Please download the pretrained CLIP model from Hugging Face and merge it into the same path. The CLIP model can be downloaded [here](https://huggingface.co/openai/clip-vit-large-patch14-336).
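The merge in steps 1 and 2 amounts to copying files into the model directory. A minimal sketch (the source paths `vision_tokenizer_config.yaml`, `vqgan.ckpt`, and the downloaded CLIP folder are assumptions about where you saved them; adjust to your setup):

```python
import shutil
from pathlib import Path

# Assumed locations of the downloaded pieces; change these to your paths.
base = Path("llama-2-7b-chat-hf-libra")          # the LLaMA2-7B-Chat directory
tokenizer_files = [Path("vision_tokenizer_config.yaml"), Path("vqgan.ckpt")]
clip_dir = Path("openai-clip-vit-large-patch14-336")  # downloaded CLIP model

def merge_into_model_dir(base: Path, tokenizer_files: list, clip_dir: Path) -> None:
    """Copy the vision-tokenizer weights and the CLIP folder into `base`."""
    base.mkdir(parents=True, exist_ok=True)
    for f in tokenizer_files:
        if f.exists():
            shutil.copy2(f, base / f.name)
    if clip_dir.exists():
        shutil.copytree(clip_dir, base / clip_dir.name, dirs_exist_ok=True)
```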

The files should be organized as:

```
llama-2-7b-chat-hf-libra/
│
│ # original llama files
│
├── ...
│
│ # newly added vision tokenizer
│
├── vision_tokenizer_config.yaml
├── vqgan.ckpt
│
│ # CLIP model
│
└── openai-clip-vit-large-patch14-336/
    └── ...
```
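After merging, you can sanity-check the layout above with a short script. This is only a sketch: the entry names come from the tree, while the root path and helper name are placeholders.

```python
import os

# Entries the merged directory should contain, per the tree above.
REQUIRED = [
    "vision_tokenizer_config.yaml",
    "vqgan.ckpt",
    "openai-clip-vit-large-patch14-336",
]

def missing_entries(root: str) -> list:
    """Return the required entries that are absent under `root`."""
    return [name for name in REQUIRED
            if not os.path.exists(os.path.join(root, name))]

# e.g. missing_entries("llama-2-7b-chat-hf-libra") lists anything still absent
```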