---
language:
- ar
---

# Peacock

Peacock is an InstructBLIP model that uses AraLLaMA as its language model. It was introduced in the paper [Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks](https://arxiv.org/abs/2403.01031).

# How to use

Usage is as follows:

```
from transformers import InstructBlipProcessor, InstructBlipForConditionalGeneration
import torch
from PIL import Image
import requests

# Load the model and its processor from the Hugging Face Hub
model = InstructBlipForConditionalGeneration.from_pretrained("Fakhraddin/InstructBlip-AraLLaMA")
processor = InstructBlipProcessor.from_pretrained("Fakhraddin/InstructBlip-AraLLaMA")

# Move the model to GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Download an example image and prepare the model inputs
url = "https://raw.githubusercontent.com/salesforce/LAVIS/main/docs/_static/Confusing-Pictures.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")
prompt = "What is unusual about this image?"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)

# Generate an answer with beam search.
# Note: top_p and temperature have no effect when do_sample=False.
outputs = model.generate(
    **inputs,
    do_sample=False,
    num_beams=5,
    max_length=256,
    min_length=1,
    top_p=0.9,
    repetition_penalty=1.5,
    length_penalty=1.0,
    temperature=1,
)
generated_text = processor.batch_decode(outputs, skip_special_tokens=True)[0].strip()
print(generated_text)
```
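
Because the model targets Arabic (see the `language: ar` tag above), prompts can be written in Arabic as well. Below is a minimal sketch that continues from the snippet above, reusing `model`, `processor`, `image`, and `device`; the Arabic prompt is an illustrative translation of the English one, not taken from the original card:

```
# Continues from the example above; model, processor, image, and device
# are assumed to be defined already. The Arabic prompt is illustrative.
prompt = "ما هو الغريب في هذه الصورة؟"  # "What is unusual about this image?"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, do_sample=False, num_beams=5, max_length=256)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0].strip())
```

Beam search with `num_beams=5` matches the settings used above; greedy decoding (`num_beams=1`) is a lighter-weight alternative if memory or latency is a concern.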