---
language: ar
license: other
tags:
- vision
- image-captioning
pipeline_tag: image-to-text
---

# 🦚 Peacock
🦚 Peacock is an InstructBLIP-based model that uses AraLLaMA as its language model. It was introduced in the paper [Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks](https://arxiv.org/abs/2403.01031).

# How to use

The following example loads the model and its processor, then generates an Arabic description of an image:

```python
from transformers import InstructBlipProcessor, InstructBlipForConditionalGeneration
import torch
from PIL import Image
import requests

# Load the model and its processor, then move the model to the GPU if one is available.
model = InstructBlipForConditionalGeneration.from_pretrained("UBC-NLP/Peacock")
processor = InstructBlipProcessor.from_pretrained("UBC-NLP/Peacock")
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Download a sample image and convert it to RGB.
url = "https://upload.wikimedia.org/wikipedia/commons/8/83/Socotra_dragon_tree.JPG"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")
prompt = "اوصف الصوره"  # Arabic for "Describe the image"

# Preprocess the image and prompt, then generate with beam search.
inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    do_sample=False,
    num_beams=5,
    max_length=256,
    min_length=1,
    top_p=0.9,
    repetition_penalty=1.5,
    length_penalty=1.0,
    temperature=1,
)
# Decode the first generated sequence and trim surrounding whitespace.
generated_text = processor.batch_decode(outputs, skip_special_tokens=True)[0].strip()
print(generated_text)
```
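
The usage example passes `top_p` and `temperature` together with `do_sample=False`. In `transformers`, those two options only affect sampling, so under beam search they have no effect, and some versions print a warning about them. The following is a minimal sketch of one way to keep the settings in a single dictionary and drop the sampling-only keys when sampling is off. `SAMPLING_ONLY_KEYS`, `GENERATION_KWARGS`, and `effective_generation_kwargs` are hypothetical names introduced here for illustration; they are not part of this model card or of `transformers`.

```python
# Hypothetical helper: keys that only influence generation when do_sample=True.
SAMPLING_ONLY_KEYS = {"top_p", "top_k", "temperature"}

# Settings copied from the usage example.
GENERATION_KWARGS = {
    "do_sample": False,
    "num_beams": 5,
    "max_length": 256,
    "min_length": 1,
    "top_p": 0.9,
    "repetition_penalty": 1.5,
    "length_penalty": 1.0,
    "temperature": 1,
}


def effective_generation_kwargs(kwargs):
    """Return a copy of kwargs, without sampling-only keys when sampling is disabled."""
    if kwargs.get("do_sample", False):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k not in SAMPLING_ONLY_KEYS}


beam_kwargs = effective_generation_kwargs(GENERATION_KWARGS)
print(sorted(beam_kwargs))
```

Under these assumptions, `model.generate(**inputs, **beam_kwargs)` should behave like the original call, since the removed keys are not used when `do_sample=False`.
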
# Citation

If you use this model, please cite the following paper:

```bibtex
@inproceedings{alwajih2024peacock,
  title = {Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks},
  author = {Alwajih, Fakhraddin and Nagoudi, El Moatez Billah and Bhatia, Gagan and Mohamed, Abdelrahman and Abdul-Mageed, Muhammad},
  booktitle = {Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  pages = {12753--12776},
  year = {2024},
  address = {Bangkok, Thailand},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2024.acl-long.689}
}
```