--- |
|
|
tags:
- vision
- image-to-text
- pruning
- llava
base_model: llava-hf/llava-1.5-7b-hf
|
|
--- |
|
|
|
|
|
# llava-glu-30pct |
|
|
|
|
|
This is a structurally pruned version of [LLaVA-1.5-7B](https://huggingface.co/llava-hf/llava-1.5-7b-hf).
|
|
|
|
|
## Pruning Details |
|
|
- **Method**: GLU pruning (structured removal of intermediate neurons from the gated MLP blocks)
|
|
- **Sparsity**: 30% |
|
|
|
|
|
Pruning reduces the model's parameter count, memory footprint, and compute per token, with the goal of retaining as much of the base model's quality as possible. The sketch below illustrates the general idea behind structured GLU pruning.
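As a rough illustration (not necessarily the exact procedure used to produce this checkpoint), structured GLU pruning removes whole intermediate neurons from a gated MLP such as the SwiGLU blocks in LLaVA-1.5's LLaMA-based language model. The magnitude-based importance score below is an assumed, purely illustrative heuristic:

```python
import torch
import torch.nn as nn

def prune_glu_mlp(gate_proj: nn.Linear, up_proj: nn.Linear,
                  down_proj: nn.Linear, sparsity: float = 0.30):
    """Drop the least important intermediate neurons of a SwiGLU MLP:
    y = down( silu(gate(x)) * up(x) ). Illustrative criterion only."""
    n_keep = int(gate_proj.out_features * (1.0 - sparsity))

    # Score each intermediate neuron by the weight magnitude of its
    # gate and up rows (a hypothetical importance heuristic).
    scores = gate_proj.weight.abs().sum(dim=1) + up_proj.weight.abs().sum(dim=1)
    keep = torch.topk(scores, n_keep).indices.sort().values

    # Rebuild smaller dense projections: select rows of gate/up and
    # the matching columns of down, so shapes stay consistent.
    new_gate = nn.Linear(gate_proj.in_features, n_keep, bias=False)
    new_up = nn.Linear(up_proj.in_features, n_keep, bias=False)
    new_down = nn.Linear(n_keep, down_proj.out_features, bias=False)
    new_gate.weight.data = gate_proj.weight.data[keep].clone()
    new_up.weight.data = up_proj.weight.data[keep].clone()
    new_down.weight.data = down_proj.weight.data[:, keep].clone()
    return new_gate, new_up, new_down
```

Because entire neurons are removed, the result is a smaller dense model with a reduced `intermediate_size` rather than a sparse weight format, which is why it still loads with the stock architecture (see Usage below).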
|
|
|
|
|
## Usage |
|
|
|
|
|
Since this model was pruned structurally, the architecture remains compatible with the standard `LlavaForConditionalGeneration` class. However, you should use the processor from the base model to ensure correct input preprocessing. |
|
|
|
|
|
```python
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "CrystalRaindropsFall/llava-glu-30pct"
base_model_id = "llava-hf/llava-1.5-7b-hf"

# 1. Load the processor from the base model
processor = AutoProcessor.from_pretrained(base_model_id)

# 2. Load the pruned model
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# 3. Run an example inference
url = "https://github.com/haotian-liu/LLaVA/blob/1a91fc274d7c35a9b50b3cb29c4247ae5837ce39/images/llava_logo.png?raw=true"
image = Image.open(requests.get(url, stream=True).raw)
prompt = "USER: <image>\nWhat is shown in this image?\nASSISTANT:"

# Move inputs to the model's device and cast floating-point tensors to its dtype
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device, model.dtype)

# Greedy decoding keeps the example deterministic
output = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(processor.decode(output[0], skip_special_tokens=True))
```
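After loading, a quick sanity check is to count parameters; the exact figure depends on how the 30% sparsity maps onto weights, but it should sit clearly below the base model's roughly 7B:

```python
# Rough size check on the model loaded above
n_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {n_params / 1e9:.2f}B")
```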
|
|
|