BLIP Fine-tuned on Car Damage Captioning

This is a BLIP (Bootstrapping Language-Image Pre-training) model that has been fine-tuned on a car damage image captioning dataset.

The model is based on: ➡️ Salesforce/blip-image-captioning-base

and fine-tuned on the dataset from:

https://www.kaggle.com/datasets/gabrielfcarvalho/blip-for-captioning-car-damage

📌 Model Description

This model takes an input image of a car (possibly damaged) and generates a descriptive caption.
It was fine-tuned to better describe damage patterns and car parts, addressing the limitations of the base BLIP model in this domain.

Input

An image of a car (JPEG/PNG).

Output

A textual caption describing the image content, particularly focusing on:

  • Damage types
  • Damaged parts
  • Severity hints

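For reference, here is a minimal inference sketch using the Hugging Face transformers library. The repo id thaiphonghuan/BLIP-finetuned-car-damage is taken from this page; the image path is a placeholder, and the generation settings are illustrative, not the author's exact configuration:

```python
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

model_id = "thaiphonghuan/BLIP-finetuned-car-damage"

# Assumes the repo ships a processor config; otherwise load it from
# Salesforce/blip-image-captioning-base instead.
processor = BlipProcessor.from_pretrained(model_id)
model = BlipForConditionalGeneration.from_pretrained(model_id)

image = Image.open("car.jpg").convert("RGB")  # placeholder path
inputs = processor(images=image, return_tensors="pt")

out = model.generate(**inputs, max_new_tokens=50)
caption = processor.decode(out[0], skip_special_tokens=True)
print(caption)
```
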
📂 Dataset

The training dataset used:
➡️ BLIP for Captioning Car Damage (Kaggle):
https://www.kaggle.com/datasets/gabrielfcarvalho/blip-for-captioning-car-damage

It contains car images labeled with human-written captions that describe damage.
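A hypothetical loading sketch for such image-caption pairs is shown below. It assumes the annotations are available as a CSV with image_path and caption columns; the actual Kaggle layout may differ, so adapt the paths and column names accordingly:

```python
import pandas as pd
from PIL import Image
from torch.utils.data import Dataset

class CarDamageCaptionDataset(Dataset):
    """Pairs car images with their damage captions for BLIP training."""

    def __init__(self, csv_path, processor):
        self.df = pd.read_csv(csv_path)  # hypothetical CSV layout
        self.processor = processor       # a BlipProcessor instance

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        image = Image.open(row["image_path"]).convert("RGB")
        enc = self.processor(
            images=image,
            text=row["caption"],
            padding="max_length",
            truncation=True,
            return_tensors="pt",
        )
        # Drop the batch dimension the processor adds per sample.
        return {k: v.squeeze(0) for k, v in enc.items()}
```
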

🧠 Fine-tuning

This model was fine-tuned starting from: ➡️ Salesforce/blip-image-captioning-base
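
The page does not document the exact training recipe, so the following is only a minimal fine-tuning sketch, not the author's method. It reuses the hypothetical CarDamageCaptionDataset from the dataset section; the epoch count, batch size, and learning rate are illustrative:

```python
import torch
from torch.utils.data import DataLoader
from transformers import BlipProcessor, BlipForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
base = "Salesforce/blip-image-captioning-base"
processor = BlipProcessor.from_pretrained(base)
model = BlipForConditionalGeneration.from_pretrained(base).to(device)

dataset = CarDamageCaptionDataset("captions.csv", processor)  # hypothetical CSV
loader = DataLoader(dataset, batch_size=8, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(3):  # illustrative epoch count
    for batch in loader:
        batch = {k: v.to(device) for k, v in batch.items()}
        # BLIP computes the captioning loss when labels are supplied;
        # here the caption tokens serve as the targets.
        outputs = model(
            pixel_values=batch["pixel_values"],
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
            labels=batch["input_ids"],
        )
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```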
