---
datasets:
- Dataseeds/DataSeeds.AI-Sample-Dataset-DSD
language:
- en
pipeline_tag: image-text-to-text
---

## 🖼️ BLIP Image Captioning: Fine-tuned (`candra/blip-image-captioning-finetuned`)

This model is a **BLIP (Bootstrapping Language-Image Pretraining)** model fine-tuned for image captioning. It takes an image as input and generates a descriptive caption, which can additionally be converted into cleaned, hashtag-friendly keywords.

### 🧠 Model Details

* **Base model**: [`Salesforce/blip-image-captioning-base`](https://huggingface.co/Salesforce/blip-image-captioning-base)
* **Task**: Image Captioning

---

## 🧪 Example Usage

```python
from transformers import AutoProcessor, BlipForConditionalGeneration
import torch
from PIL import Image

# Load model and processor
processor = AutoProcessor.from_pretrained("candra/blip-image-captioning-finetuned")
model = BlipForConditionalGeneration.from_pretrained("candra/blip-image-captioning-finetuned")

# Set device and switch to inference mode
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
model.eval()

# Load image (BLIP expects 3-channel RGB input)
image_path = "IMAGE.jpg"
image = Image.open(image_path).convert("RGB")

# Preprocess and generate caption
inputs = processor(images=image, return_tensors="pt")
pixel_values = inputs.pixel_values.to(device)

with torch.no_grad():
    generated_ids = model.generate(pixel_values=pixel_values, max_length=50)
generated_caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print("Caption:", generated_caption)

# Convert the comma-separated caption into hashtags
words = generated_caption.lower().split(", ")
unique_words = sorted(set(words))
hashtags = ["#" + word.replace(" ", "") for word in unique_words]
print("Hashtags:", " ".join(hashtags))
```

---

## 📥 Input

* **Image** (RGB format, e.g., `.jpg`, `.png`)

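Since the processor expects 3-channel RGB input, grayscale or RGBA files should be normalized before preprocessing. A minimal sketch of such a helper (the `load_rgb` name is illustrative and not part of this repository):

```python
from io import BytesIO
from PIL import Image

def load_rgb(path_or_file) -> Image.Image:
    """Open an image from a path or file-like object and force 3-channel RGB,
    which is the format the BLIP processor expects."""
    return Image.open(path_or_file).convert("RGB")

# Example: an RGBA PNG held in memory is converted to RGB
buf = BytesIO()
Image.new("RGBA", (8, 8), (255, 0, 0, 128)).save(buf, format="PNG")
buf.seek(0)
img = load_rgb(buf)
print(img.mode)  # RGB
```
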
## 📤 Output

* **Caption**: A string of comma-separated keywords describing the contents of the image.
* **Hashtags**: A string of unique hashtags derived from the caption.

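Because the fine-tuned model emits comma-separated keywords, the hashtag step is plain string processing and can be reused without loading the model. A standalone sketch (the `caption_to_hashtags` name is an assumption, not part of the model's API):

```python
def caption_to_hashtags(caption: str) -> str:
    """Lowercase, split on commas, trim, deduplicate, and join as hashtags."""
    words = [w.strip() for w in caption.lower().split(",") if w.strip()]
    return " ".join("#" + w.replace(" ", "") for w in sorted(set(words)))

print(caption_to_hashtags("Animal, Lion, big cat, lion"))  # #animal #bigcat #lion
```

Sorting the deduplicated words keeps the output deterministic regardless of the order the model emits keywords in.
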
---

## 📌 Example

**Input Image**
<img src="lion.jpg" alt="Example Image" width="500"/>

**Generated Caption**

```
animal, lion, mammal, wildlife, zoo, barrel, grass, backgound
```

**Hashtags**

```
#animal #lion #mammal #wildlife #zoo #barrel #grass #backgound
```