BLIP Fine-tuned on Car Damage Captioning

This is a BLIP (Bootstrapping Language-Image Pre-training) model that has been fine-tuned on a car damage image captioning dataset.

The model is based on: ➡️ Salesforce/blip-image-captioning-base

and fine-tuned on the dataset from:

https://www.kaggle.com/datasets/gabrielfcarvalho/blip-for-captioning-car-damage

📌 Model Description

This model takes an input image of a car (possibly damaged) and generates a descriptive caption.
It was fine-tuned to better describe damage patterns and car parts, addressing the limitations of the base BLIP model in this domain.

Input

An image of a car (JPEG/PNG).

Output

A textual caption describing the image content, particularly focusing on:

  • Damage types
  • Damaged parts
  • Severity hints

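For reference, here is a minimal inference sketch using the Hugging Face transformers library. The repo id thaiphonghuan/BLIP-finetuned-car-damage is taken from this page; the image path is a placeholder, and the generation settings are illustrative, not the author's exact configuration:

```python
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

model_id = "thaiphonghuan/BLIP-finetuned-car-damage"

# Assumes the repo ships a processor config; otherwise load it from
# Salesforce/blip-image-captioning-base instead.
processor = BlipProcessor.from_pretrained(model_id)
model = BlipForConditionalGeneration.from_pretrained(model_id)

image = Image.open("car.jpg").convert("RGB")  # placeholder path
inputs = processor(images=image, return_tensors="pt")

out = model.generate(**inputs, max_new_tokens=50)
caption = processor.decode(out[0], skip_special_tokens=True)
print(caption)
```
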
📂 Dataset

The training dataset used:
➡️ BLIP for Captioning Car Damage (Kaggle):
https://www.kaggle.com/datasets/gabrielfcarvalho/blip-for-captioning-car-damage

It contains car images labeled with human-written captions that describe damage.
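A hypothetical loading sketch for such image-caption pairs is shown below. It assumes the annotations are available as a CSV with image_path and caption columns; the actual Kaggle layout may differ, so adapt the paths and column names accordingly:

```python
import pandas as pd
from PIL import Image
from torch.utils.data import Dataset

class CarDamageCaptionDataset(Dataset):
    """Pairs car images with their damage captions for BLIP training."""

    def __init__(self, csv_path, processor):
        self.df = pd.read_csv(csv_path)  # hypothetical CSV layout
        self.processor = processor       # a BlipProcessor instance

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        image = Image.open(row["image_path"]).convert("RGB")
        enc = self.processor(
            images=image,
            text=row["caption"],
            padding="max_length",
            truncation=True,
            return_tensors="pt",
        )
        # Drop the batch dimension the processor adds per sample.
        return {k: v.squeeze(0) for k, v in enc.items()}
```
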

🧠 Fine-tuning

This model was fine-tuned starting from: ➡️ Salesforce/blip-image-captioning-base
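
The page does not document the exact training recipe, so the following is only a minimal fine-tuning sketch, not the author's method. It reuses the hypothetical CarDamageCaptionDataset from the dataset section; the epoch count, batch size, and learning rate are illustrative:

```python
import torch
from torch.utils.data import DataLoader
from transformers import BlipProcessor, BlipForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
base = "Salesforce/blip-image-captioning-base"
processor = BlipProcessor.from_pretrained(base)
model = BlipForConditionalGeneration.from_pretrained(base).to(device)

dataset = CarDamageCaptionDataset("captions.csv", processor)  # hypothetical CSV
loader = DataLoader(dataset, batch_size=8, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(3):  # illustrative epoch count
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        # BLIP computes the captioning loss when labels are supplied;
        # here the caption tokens serve as the targets.
        outputs = model(
            pixel_values=batch["pixel_values"],
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["input_ids"],
        )
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```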
