The official repo for the ICML 2024 paper: Libra: Building Decoupled Vision System on Large Language Models
How to use YifanXu/libra-11b-chat with Transformers:
# Use a pipeline as a high-level helper
# Warning: Pipeline type "image-to-text" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# 'pip install "transformers<5.0.0"'
from transformers import pipeline
pipe = pipeline("image-to-text", model="YifanXu/libra-11b-chat")

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("YifanXu/libra-11b-chat", dtype="auto")

Libra: Building Decoupled Vision System on Large Language Models
This model is Libra-Base further fine-tuned on instruction data for multi-modal chat.
In addition to the pretrained weights in this repo, please download the pretrained CLIP model from Hugging Face and place it inside the model directory, as follows:
libra-chat/
├── ...
├── openai-clip-vit-large-patch14-336/
└── ...
The CLIP model can be downloaded here.
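The directory layout above can be prepared with a short script. The following is a minimal sketch, not part of the original instructions. It assumes the CLIP weights live in the Hub repo `openai/clip-vit-large-patch14-336` (inferred from the folder name) and that `huggingface_hub` is installed. The helper names `clip_target_dir`, `has_clip`, and `download_clip` are hypothetical.

```python
from pathlib import Path

# Assumption: the folder name in the layout maps to this Hub repo ID.
CLIP_REPO_ID = "openai/clip-vit-large-patch14-336"
CLIP_SUBDIR = "openai-clip-vit-large-patch14-336"


def clip_target_dir(model_dir):
    """Return the path where the CLIP files are expected inside model_dir."""
    return Path(model_dir) / CLIP_SUBDIR


def has_clip(model_dir):
    """Return True if the CLIP subdirectory exists and is not empty."""
    target = clip_target_dir(model_dir)
    return target.is_dir() and any(target.iterdir())


def download_clip(model_dir):
    """Download the CLIP snapshot into model_dir (requires network access)."""
    # Imported lazily so the path helpers work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = clip_target_dir(model_dir)
    target.mkdir(parents=True, exist_ok=True)
    snapshot_download(repo_id=CLIP_REPO_ID, local_dir=str(target))
    return target
```

For example, calling `download_clip("libra-chat")` once and then checking `has_clip("libra-chat")` before loading the model would confirm that the CLIP folder is in place.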