File size: 1,430 Bytes
d7f8a47 b2204d1 d7f8a47 b6dd802 d7f8a47 b6dd802 b2204d1 b6dd802 b2204d1 b6dd802 b2204d1 b6dd802 b2204d1 b6dd802 b2204d1 b6dd802 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 |
---
title: AI Chatbot File Web Image Audio
emoji: π€
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
license: mit
short_description: AI Chatbot using RAG from Files, URLs, Images & Audio
---
# π§ AI Chatbot with File, Web, Image & Audio Support (Gradio + Groq)
A multimodal AI assistant powered by Groq's LLaMA 3 that can answer questions using:
- π Uploaded documents (`.txt`, `.pdf`, `.docx`, `.csv`)
- π Any public website URL (RAG retrieval)
- πΌοΈ Images via OCR (Tesseract)
- π§ Audio files via transcription (Whisper)
---
## π Features
- Chat with files (PDF, DOCX, TXT, CSV)
- Question answering from website content
- OCR-based text extraction from images
- Speech-to-text from audio recordings
- Maintains separate history for File & URL chat sessions
---
## π οΈ Tech Stack
- [Gradio](https://gradio.app) β User Interface
- [FastAPI](https://fastapi.tiangolo.com/) β API Backend
- [Groq API](https://groq.com/) β LLaMA 3 inference
- [Tesseract OCR](https://github.com/tesseract-ocr) β Image text extraction
- [Whisper](https://github.com/openai/whisper) β Audio transcription
---
## π¦ How to Run Locally
```bash
git clone https://github.com/your-username/your-repo.git
cd your-repo
# Install dependencies
pip install -r requirements.txt
# Start FastAPI backend
uvicorn main:app --reload
# Run Gradio frontend
python app.py
|