How to use YaekobB/blip-caption-model with Transformers:

```python
# Use a pipeline as a high-level helper.
# Warning: pipeline type "image-to-text" is no longer supported in transformers v5.
# Load the model directly (see below) or downgrade to v4.x with:
#   pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("image-to-text", model="YaekobB/blip-caption-model")

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("YaekobB/blip-caption-model")
model = AutoModelForImageTextToText.from_pretrained("YaekobB/blip-caption-model")
```
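Since the choice between the pipeline helper and direct loading depends on the installed Transformers version, a small version check can pick the path at runtime. This is a minimal sketch: the helper names (`major_version`, `pipeline_supported`, `choose_loading_path`) are hypothetical, and the v5 cutoff is taken from the warning in the snippet above rather than verified independently.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def major_version(version_string: str) -> int:
    """Return the leading integer of a version string such as '4.46.1'."""
    return int(version_string.split(".")[0])


def pipeline_supported(transformers_version: str) -> bool:
    """True when the "image-to-text" pipeline is expected to exist (before v5).

    The cutoff follows the warning quoted in the usage snippet; treat it as
    an assumption.
    """
    return major_version(transformers_version) < 5


def installed_transformers_version() -> Optional[str]:
    """Return the installed transformers version, or None if it is missing."""
    try:
        return version("transformers")
    except PackageNotFoundError:
        return None


def choose_loading_path() -> str:
    """Return 'pipeline', 'direct', or 'not-installed' for this environment."""
    installed = installed_transformers_version()
    if installed is None:
        return "not-installed"
    return "pipeline" if pipeline_supported(installed) else "direct"
```

When `choose_loading_path()` returns `"direct"`, load the processor and model explicitly as in the second half of the snippet above.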
---
license: mit
tags:
- image-captioning
- blip
- vision-language-model
- multimodal-ai
- computer-vision
- deep-learning
- transformers
- pytorch
pipeline_tag: image-to-text
library_name: transformers
---
# BLIP Caption Model

This repository contains a BLIP-based image captioning model that generates natural-language captions from uploaded images.

The model is connected to a live Hugging Face Space demo:

👉 [Multimodal Image Captioning with BLIP Demo](https://huggingface.co/spaces/YaekobB/image-captioning-blip-demo)
## Model Description

This model is designed for automatic image captioning. Given an input image, it generates a short textual description of the visual content.

The project demonstrates the use of vision-language models for multimodal AI applications, combining computer vision and natural language generation.
## Intended Use

This model can be used for:

- Image caption generation
- Vision-language AI demonstrations
- Multimodal learning experiments
- Educational and portfolio projects
- Prototyping image-to-text applications
## How to Use

```python
from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image
import torch

model_id = "YaekobB/blip-caption-model"

# Load the preprocessing pipeline and the captioning model
processor = BlipProcessor.from_pretrained(model_id)
model = BlipForConditionalGeneration.from_pretrained(model_id)

# Load the image and force three RGB channels
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")

# Generate caption token IDs without tracking gradients
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=50)

# Decode the first (and only) sequence into text
caption = processor.decode(output[0], skip_special_tokens=True)
print(caption)
```
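BLIP captioning checkpoints in Transformers can also take a text prefix that the generated caption continues. The sketch below wraps that pattern in a helper; `caption_with_prompt` is a hypothetical name, and whether this particular checkpoint produces good prompted captions is an assumption that has not been verified here.

```python
def caption_with_prompt(image, prompt, processor, model, max_new_tokens=50):
    """Generate a caption that continues `prompt` (for example "a photo of").

    `processor` and `model` are expected to be the BlipProcessor and
    BlipForConditionalGeneration objects loaded as in the example above.
    """
    # Passing `text` alongside the image makes the decoder start from the prompt.
    inputs = processor(images=image, text=prompt, return_tensors="pt")
    # Wrap this call in torch.no_grad(), as in the example above, to make
    # the absence of gradient tracking explicit.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True).strip()
```

A call such as `caption_with_prompt(image, "a photo of", processor, model)` returns the decoded text, which normally begins with the prompt itself.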
## Live Demo

A live inference demo is available on Hugging Face Spaces:

[https://huggingface.co/spaces/YaekobB/image-captioning-blip-demo](https://huggingface.co/spaces/YaekobB/image-captioning-blip-demo)

The demo allows users to upload one or more images and generate captions using the model.
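Captioning several images at once, as the demo does, can be sketched locally by passing a list of images to the processor and decoding the whole batch. This is an illustrative sketch, not the code behind the Space: `chunked`, `caption_batch`, and the default `batch_size` of 4 are assumptions introduced for the example.

```python
def chunked(items, size):
    """Split `items` into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[start:start + size] for start in range(0, len(items), size)]


def caption_batch(images, processor, model, max_new_tokens=50, batch_size=4):
    """Return one caption per PIL image, processing `batch_size` images at a time.

    `processor` and `model` are the BlipProcessor and
    BlipForConditionalGeneration objects from the "How to Use" example.
    """
    captions = []
    for batch in chunked(images, batch_size):
        # Force three channels so grayscale or RGBA inputs do not break preprocessing.
        rgb_batch = [image.convert("RGB") for image in batch]
        inputs = processor(images=rgb_batch, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        decoded = processor.batch_decode(output_ids, skip_special_tokens=True)
        captions.extend(text.strip() for text in decoded)
    return captions
```

Smaller batches reduce peak memory use at the cost of more `generate` calls; the captions come back in the same order as the input images.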
## Limitations

This model may generate inaccurate or incomplete captions, especially for:

- Complex scenes with many objects or people
- Small or unclear objects
- Low-quality or blurry images
- Culturally specific contexts
- Images requiring detailed reasoning or domain expertise

Generated captions should be treated as model-generated descriptions, not guaranteed factual annotations.
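One way to keep that distinction visible downstream is to store each caption together with its provenance instead of as a bare string. The record below is a hypothetical sketch; the field names and the `make_record` and `mark_verified` helpers are illustrative and not part of this repository.

```python
from dataclasses import asdict, dataclass, replace

MODEL_ID = "YaekobB/blip-caption-model"


@dataclass(frozen=True)
class CaptionRecord:
    """A caption plus the provenance needed to avoid mistaking it for ground truth."""

    image_path: str
    caption: str
    model_id: str = MODEL_ID
    source: str = "model-generated"
    human_verified: bool = False


def make_record(image_path: str, caption: str) -> CaptionRecord:
    """Wrap raw model output in a record that is unverified by default."""
    return CaptionRecord(image_path=image_path, caption=caption.strip())


def mark_verified(record: CaptionRecord) -> CaptionRecord:
    """Return a copy flagged as reviewed by a person; the original is unchanged."""
    return replace(record, human_verified=True)
```

Calling `asdict(record)` yields a plain dictionary that can be written to JSON, so the `source` and `human_verified` fields travel with the caption.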
## Ethical Considerations

This model should not be used as the sole source of truth for safety-critical, medical, legal, or identity-sensitive decisions.

It may produce biased, incomplete, or incorrect descriptions depending on the input image and training data limitations.
## Author

**Yaekob Beyene Yowhanns**
M.Sc. Artificial Intelligence and Computer Science
University of Calabria

GitHub: [yaekobB](https://github.com/yaekobB)
Hugging Face: [YaekobB](https://huggingface.co/YaekobB)