title: VisualRAG
emoji: 🔍
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 4.40.0
python_version: '3.10'
app_file: app.py
pinned: false
license: mit

🔍 VisualRAG: Multi-Modal AI System

A production-grade Retrieval-Augmented Generation (RAG) system combining computer vision and natural language understanding.

🧠 Pipeline

Index: Image → YOLOv8 detection → CLIP ViT-B/32 embedding → FAISS vector store
Query: Text → CLIP text embedding → cosine k-NN → Zephyr-7B answer generation
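
The "cosine k-NN" step can be served by an inner-product index because the inner product of two unit-length vectors equals their cosine similarity. The sketch below is a pure-Python stand-in for that idea, not the code in app.py: `FlatInnerProductIndex`, the 3-d toy vectors, and the file names are all hypothetical, and a real deployment would use 512-d CLIP embeddings in FAISS.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class FlatInnerProductIndex:
    """Tiny in-memory stand-in for a flat inner-product index (hypothetical)."""

    def __init__(self, dim):
        self.dim = dim
        self.vectors = []
        self.labels = []

    def add(self, vec, label):
        if len(vec) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vec)}")
        # Store normalized vectors so search scores are cosine similarities.
        self.vectors.append(l2_normalize(vec))
        self.labels.append(label)

    def search(self, query, k):
        """Return the k (score, label) pairs with the highest inner product."""
        q = l2_normalize(query)
        scored = [
            (sum(a * b for a, b in zip(q, v)), label)
            for v, label in zip(self.vectors, self.labels)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# Toy 3-d "embeddings"; the names and values are made up for illustration.
index = FlatInnerProductIndex(dim=3)
index.add([0.9, 0.1, 0.0], "dog.jpg")
index.add([0.0, 1.0, 0.2], "car.jpg")
index.add([0.7, 0.3, 0.1], "puppy.jpg")

top = index.search([1.0, 0.0, 0.0], k=2)
for score, label in top:
    print(f"{label}: {score:.3f}")
```

An exhaustive flat index like this scores the query against every stored vector, which is simple and exact but scales linearly with the number of indexed images.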

🛠 Stack

| Component | Technology |
|---|---|
| Object detection | YOLOv8n (Ultralytics) |
| Visual embedding | CLIP ViT-B/32 (OpenAI) |
| Vector index | FAISS IndexFlatIP |
| LLM | Zephyr-7B-β (HF Serverless API) |
| UI | Gradio 4.40.0 |

🚀 How to use

  1. Detect & Index: upload images; YOLOv8 detects objects, CLIP stores 512-d embeddings in FAISS
  2. Query (RAG): ask a question; CLIP retrieves relevant images, Zephyr-7B answers
  3. How it works: full architecture overview
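
The hand-off in step 2, from retrieved images to the language model, could look roughly like the sketch below. It is an assumption about the general RAG pattern rather than the prompt used in app.py: `build_prompt`, the record fields (`filename`, `score`, `objects`), and the prompt wording are hypothetical.

```python
def build_prompt(question, retrieved, max_images=3):
    """Assemble a text prompt from retrieved image records (hypothetical schema)."""
    lines = []
    for rank, item in enumerate(retrieved[:max_images], start=1):
        # Detected object labels give the text-only LLM something to reason over.
        objects = ", ".join(item["objects"]) or "no objects detected"
        lines.append(
            f"{rank}. {item['filename']} (score {item['score']:.2f}): {objects}"
        )
    context = "\n".join(lines) if lines else "No images were retrieved."
    return (
        "Answer the question using only the retrieved image context.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Made-up retrieval results for illustration.
retrieved = [
    {"filename": "park.jpg", "score": 0.31, "objects": ["dog", "person"]},
    {"filename": "street.jpg", "score": 0.27, "objects": []},
]
prompt = build_prompt("Which image shows a dog?", retrieved)
print(prompt)
```

Passing detected labels and similarity scores as plain text keeps the generation step independent of the image data itself, since the LLM only ever sees the prompt string.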

🔑 Optional: HF token

Settings → Variables and secrets → New secret → Name: HF_TOKEN
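
Spaces expose secrets to the running app as environment variables, so a secret named HF_TOKEN can be read with the standard library. The helper below is a minimal sketch of treating the token as optional; `get_hf_token` is a hypothetical name, and how app.py actually consumes the token is not shown in this README.

```python
import os


def get_hf_token(env=None):
    """Return the HF_TOKEN secret if set and non-empty, otherwise None."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


token = get_hf_token()
if token is None:
    print("HF_TOKEN not set; continuing without authentication.")
else:
    print("HF_TOKEN found.")
```

Returning `None` instead of raising keeps the token optional, so the app can still start when no secret has been configured.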