How to use NLP-Orange-Problem/AttentionSeekers with PEFT:
from peft import PeftModel
from transformers import AutoModelForImageTextToText
# SmolVLM2 is an image-text-to-text model, so load it with the matching Auto class
base_model = AutoModelForImageTextToText.from_pretrained("HuggingFaceTB/SmolVLM2-2.2B-Instruct")
model = PeftModel.from_pretrained(base_model, "NLP-Orange-Problem/AttentionSeekers")

This is a LoRA adapter for SmolVLM2-2.2B-Instruct, fine-tuned on the ChartQA dataset to answer questions about charts and graphs.
GitHub repository: NLP_Orange_Problem_AttentionSeekers
| Component | Details |
|---|---|
| Base model | HuggingFaceTB/SmolVLM2-2.2B-Instruct |
| Fine-tuning method | LoRA (r=16, alpha=32) |
| Dataset | ChartQA (1000 training samples) |
| Training hardware | Kaggle 2x T4 (2 x 16 GB, 32 GB VRAM total) |
| Final training loss | 1.855 |
| Epochs | 2 |
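To make the `r=16` row in the table concrete, here is a rough sketch of how many trainable parameters a rank-16 adapter adds to one projection matrix. The layer width `D = 2048` is a hypothetical value chosen for illustration, not a figure taken from this card, and `lora_params` is a helper name invented for this example.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix.

    LoRA learns two small matrices: A with shape (r, d_in) and
    B with shape (d_out, r), so the count is r * (d_in + d_out).
    """
    return r * (d_in + d_out)


R = 16    # LoRA rank from the table above
D = 2048  # hypothetical square projection width (an assumption, not from the card)

per_matrix = lora_params(D, D, R)  # adapter parameters for one projection
full_matrix = D * D                # parameters in the frozen projection itself
fraction = per_matrix / full_matrix

print(f"adapter params per matrix: {per_matrix}")
print(f"share of the frozen matrix: {fraction:.4%}")
```

Under that assumed width, the adapter is a small fraction of each matrix it wraps, which is why this kind of fine-tuning fits on modest GPUs.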
pip install transformers peft accelerate torch pillow
import torch
from PIL import Image
from peft import PeftModel
from transformers import AutoProcessor, AutoModelForImageTextToText
MODEL_PATH = "HuggingFaceTB/SmolVLM2-2.2B-Instruct"
ADAPTER_PATH = "pes1ug23am016/smolvlm2-chartqa-lora"
# 1. Load base model
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.float16,
    device_map="auto",
    attn_implementation="eager",
)
processor = AutoProcessor.from_pretrained(ADAPTER_PATH)
# 2. Load and merge LoRA adapters
model = PeftModel.from_pretrained(model, ADAPTER_PATH)
model = model.merge_and_unload()
model.eval()
# 3. Run inference
def predict(image, question):
    """Answer a single question about one chart image."""
    messages = [
        {"role": "user", "content": [
            {"type": "image"},
            {"type": "text", "text": question}
        ]}
    ]
    prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = processor(text=prompt, images=[[image]], return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    # generate() returns prompt + completion, so drop the prompt tokens before decoding
    new_tokens = out[0][inputs["input_ids"].shape[1]:]
    return processor.tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
# Example
image = Image.open("your_chart.png").convert("RGB")  # normalize palette/alpha PNGs to RGB
answer = predict(image, "What is the highest value in the chart?")
print(answer)
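The merge step in the script folds the adapter into the base weights so inference no longer needs the separate LoRA path. As a minimal sketch of the underlying arithmetic (using the standard LoRA formulation with scaling `alpha / r`, and toy 2x2 matrices rather than real model weights), the merged weight `W + (alpha / r) * B @ A` gives the same output as running the base layer and the adapter side by side. The toy rank and alpha below are chosen so the scaling matches the `32 / 16 = 2.0` of this adapter.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]


# Toy values: rank 1 with alpha 2 gives scaling 2.0, the same as r=16, alpha=32.
r, alpha = 1, 2
scaling = alpha / r

W = [[1.0, 2.0], [3.0, 4.0]]  # frozen base weight, shape (out, in)
B = [[0.5], [-1.0]]           # LoRA B, shape (out, r)
A = [[2.0, 0.0]]              # LoRA A, shape (r, in)
x = [[1.0], [1.0]]            # input as a column vector

# Unmerged: base output plus the scaled low-rank update
unmerged = add(matmul(W, x), scale(matmul(B, matmul(A, x)), scaling))

# Merged: fold the update into the weight once, then do a single matmul
W_merged = add(W, scale(matmul(B, A), scaling))
merged = matmul(W_merged, x)

print(unmerged, merged)
```

After merging, the model has the same layer shapes as the base model, so it can be used without keeping the PEFT wrapper around.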
r = 16
lora_alpha = 32
lora_dropout = 0.05
target_modules: q_proj, k_proj, v_proj, o_proj
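For readers who want to reproduce a similar setup, the hyperparameters listed above map directly onto the arguments of PEFT's `LoraConfig`. This is a sketch under assumptions, not the authors' training script: settings the card does not state (such as `bias` or `task_type`) are left at the library defaults, and the name `LORA_KWARGS` is introduced here only for illustration.

```python
# LoRA hyperparameters exactly as listed on this card
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}

try:
    # peft is treated as optional here so the snippet still runs without it
    from peft import LoraConfig

    lora_config = LoraConfig(**LORA_KWARGS)
except ImportError:
    lora_config = None

print(lora_config if lora_config is not None else LORA_KWARGS)
```

With PEFT installed, a config like this would typically be passed to `get_peft_model(model, lora_config)` before training; whether the original run did exactly that is not stated on the card.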