---
title: AI Chatbot File Web Image Audio
emoji: 🤖
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.34.2
app_file: app.py
pinned: false
license: mit
short_description: AI Chatbot using RAG from Files, URLs, Images & Audio
---

# 🧠 AI Chatbot with File, Web, Image & Audio Support (Gradio + Groq)

A multimodal AI assistant powered by Groq's LLaMA 3 that can answer questions using:

- 📄 Uploaded documents (`.txt`, `.pdf`, `.docx`, `.csv`)
- 🌐 Any public website URL (via RAG)
- 🖼️ Images via OCR (Tesseract)
- 🎧 Audio files via transcription (Whisper)

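
As a rough illustration of how the four input types above could be told apart before extraction, here is a minimal sketch. The function `classify_source`, the table `EXTRACTOR_BY_SUFFIX`, and the pipeline labels are hypothetical names, not taken from this project's code. The document extensions come from the list above; the image and audio extensions are assumptions, since this README does not enumerate them.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the extraction pipeline it needs.
# Document extensions are the ones listed in this README; the image and audio
# extensions are assumptions made for the sake of the example.
EXTRACTOR_BY_SUFFIX = {
    ".txt": "document",
    ".pdf": "document",
    ".docx": "document",
    ".csv": "document",
    ".png": "image_ocr",
    ".jpg": "image_ocr",
    ".jpeg": "image_ocr",
    ".wav": "audio_transcription",
    ".mp3": "audio_transcription",
}


def classify_source(source: str) -> str:
    """Return the name of the pipeline a user-supplied source should go through."""
    # Anything that looks like an HTTP(S) address is treated as a web page.
    if source.lower().startswith(("http://", "https://")):
        return "web_page"
    # Otherwise decide by file extension, ignoring case.
    suffix = Path(source).suffix.lower()
    if suffix not in EXTRACTOR_BY_SUFFIX:
        raise ValueError(f"Unsupported input type: {source!r}")
    return EXTRACTOR_BY_SUFFIX[suffix]


print(classify_source("notes.txt"))
print(classify_source("https://example.com/article"))
```
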

---

## 🚀 Features

- Chat with files (PDF, DOCX, TXT, CSV)
- Question answering from website content
- OCR-based text extraction from images
- Speech-to-text from audio recordings
- Maintains separate history for File & URL chat sessions

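
The last feature, separate history per chat session type, could be modeled along the following lines. This is only a sketch under assumptions: the class `ChatHistories` and the mode names `"file"` and `"url"` are hypothetical and may not match how the app stores its state. The idea is simply that each mode owns its own list, so clearing or extending one never touches the other.

```python
from collections import defaultdict


class ChatHistories:
    """Keeps one independent list of (question, answer) pairs per chat mode.

    Hypothetical helper for illustration; not part of this project's code.
    """

    def __init__(self) -> None:
        # A missing mode starts out with an empty history.
        self._by_mode = defaultdict(list)

    def add(self, mode: str, question: str, answer: str) -> None:
        """Append one exchange to the history of the given mode."""
        self._by_mode[mode].append((question, answer))

    def get(self, mode: str) -> list:
        """Return a copy so callers cannot mutate the stored history."""
        return list(self._by_mode[mode])

    def clear(self, mode: str) -> None:
        """Reset one mode without affecting the others."""
        self._by_mode[mode].clear()


histories = ChatHistories()
histories.add("file", "What is in the report?", "A sales summary.")
histories.add("url", "What does the page say?", "It describes pricing.")
histories.clear("file")
print(len(histories.get("file")), len(histories.get("url")))
```
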

---

## 🛠️ Tech Stack

- [Gradio](https://gradio.app) – User Interface
- [FastAPI](https://fastapi.tiangolo.com/) – API Backend
- [Groq API](https://groq.com/) – LLaMA 3 inference
- [Tesseract OCR](https://github.com/tesseract-ocr) – Image text extraction
- [Whisper](https://github.com/openai/whisper) – Audio transcription

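
To make the retrieve-then-generate flow behind this stack more concrete, the sketch below splits extracted text into chunks, picks the chunks most relevant to a question, and assembles a prompt for the model. It is a simplified stand-in: this README does not say how retrieval is implemented, so plain keyword overlap is used here in place of whatever ranking the app actually performs, and all function names are hypothetical. The call to the Groq API is deliberately left out.

```python
import re


def split_into_chunks(text: str, max_words: int = 50) -> list:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_chunks(question: str, chunks: list, k: int = 2) -> list:
    """Rank chunks by how many tokens they share with the question.

    Keyword overlap is an assumption made for this example, not the
    project's actual retrieval method.
    """
    question_tokens = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(question_tokens & tokenize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context_chunks: list) -> str:
    """Combine the retrieved context and the question into one prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


chunks = [
    "Groq serves LLaMA 3 models.",
    "Tesseract extracts text from images.",
]
question = "Which tool extracts text from images?"
print(build_prompt(question, top_chunks(question, chunks, k=1)))
```
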

---

## 📦 How to Run Locally

```bash
git clone https://github.com/your-username/your-repo.git
cd your-repo

# Install dependencies
pip install -r requirements.txt

# Start FastAPI backend (keeps running in this terminal)
uvicorn main:app --reload

# Run Gradio frontend (in a second terminal)
python app.py