# 🏛️ RAG Image Captioning with Landmark Location

This model generates captions for monument/landmark images using a retrieval-augmented generation approach.

## How it works:
- Uses CLIP to extract image embeddings.
- Retrieves top-k similar captions via FAISS.
- Generates a detailed caption with name and location using T5.
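
The retrieval step above can be sketched in plain Python. This is a minimal illustration, not the model's actual code: the toy 3-dimensional vectors stand in for CLIP embeddings, the brute-force cosine search stands in for a FAISS index, and `top_k_captions`, `build_prompt`, and the prompt prefix are hypothetical names chosen for this example. The resulting prompt is the kind of text that could be passed to T5 for generation.

```python
import math

def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def top_k_captions(query, caption_vecs, captions, k=2):
    """Return the k captions whose vectors are most similar to the query.

    Brute-force stand-in for a FAISS inner-product search over normalized vectors.
    """
    q = normalize(query)
    scores = [sum(a * b for a, b in zip(q, normalize(v))) for v in caption_vecs]
    order = sorted(range(len(captions)), key=lambda i: scores[i], reverse=True)
    return [captions[i] for i in order[:k]]

def build_prompt(retrieved):
    """Join retrieved captions into one text prompt (hypothetical format)."""
    return "generate caption with name and location: " + " | ".join(retrieved)

# Toy data: in a real pipeline these vectors would come from CLIP.
captions = [
    "A white marble mausoleum in Agra, India.",
    "An iron lattice tower in Paris, France.",
    "A domed tomb beside a river in India.",
]
caption_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.8, 0.3, 0.1]]
image_vec = [1.0, 0.1, 0.0]  # stand-in for the query image embedding

retrieved = top_k_captions(image_vec, caption_vecs, captions, k=2)
prompt = build_prompt(retrieved)
print(prompt)
```

Normalizing both sides before taking the dot product makes the ranking depend on direction only, which is the usual reason for pairing unit-length embeddings with an inner-product index.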

## Example
Input: 🏰 Image of the Taj Mahal  
Output: _"The Taj Mahal is a white marble mausoleum located in Agra, India."_