---
title: Ask the Guru
emoji: 🧘
colorFrom: yellow
colorTo: blue
sdk: docker
app_port: 7860
---
# RAG Q&A Assistant
A retrieval-augmented generation (RAG) question-answering system built on curated YouTube subtitle transcripts.
The project provides:
- A FastAPI backend (`/ask`) for question answering.
- A static frontend served by FastAPI.
- A data pipeline to download subtitles, preprocess text, embed transcripts, and retrieve relevant context.
- A CLI flow for local/offline querying.
## Table of Contents
- [Architecture](#architecture)
- [Project Structure](#project-structure)
- [Tech Stack](#tech-stack)
- [Prerequisites](#prerequisites)
- [Configuration](#configuration)
- [Quick Start](#quick-start)
- [Run with Docker](#run-with-docker)
- [API Reference](#api-reference)
- [Data Pipeline](#data-pipeline)
- [Deployment](#deployment)
- [Operational Notes](#operational-notes)
- [Troubleshooting](#troubleshooting)
## Architecture
1. The user asks a question from the UI or directly through `POST /ask`.
2. The query is embedded with `all-MiniLM-L6-v2`.
3. The top-K transcript chunks are retrieved from the FAISS index.
4. The retrieved context is trimmed to the `MAX_CONTEXT_TOKENS` budget.
5. The Groq chat completion API generates the final answer using a domain-aligned system prompt.
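Steps 2–4 can be sketched with toy vectors, using plain-Python cosine similarity in place of FAISS and a whitespace word count in place of a real tokenizer (the chunk texts, vector size, and token budget here are all illustrative, not the project's actual values):

```python
import math
import random

MAX_CONTEXT_TOKENS = 8  # illustrative; the real budget lives in config.py

# Toy stand-ins for embedded transcript chunks and their texts.
chunks = ["fear fades with practice", "desire drives action", "breathe and observe"]
rng = random.Random(0)
chunk_vecs = [[rng.gauss(0, 1) for _ in range(4)] for _ in chunks]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def retrieve(query_vec: list[float], k: int = 2) -> list[str]:
    """Return the top-k chunk texts by cosine similarity (FAISS does this at scale)."""
    ranked = sorted(range(len(chunks)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    return [chunks[i] for i in ranked[:k]]

def trim_context(texts: list[str], budget: int = MAX_CONTEXT_TOKENS) -> str:
    """Concatenate retrieved chunks, stopping once the token budget is spent."""
    kept, used = [], 0
    for t in texts:
        n = len(t.split())  # whitespace count as a crude token proxy
        if used + n > budget:
            break
        kept.append(t)
        used += n
    return "\n".join(kept)

query_vec = [rng.gauss(0, 1) for _ in range(4)]
context = trim_context(retrieve(query_vec))
```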
Core runtime flow:
- `app.py` loads `data/file_paths.pkl` and `data/transcripts.pkl` at startup.
- `api/retrieve_context.py` handles vector retrieval.
- `api/generate_response.py` handles LLM generation.
- `frontend/index.html` is mounted and served from `/`.
## Project Structure
```text
.
├── api/
│   ├── embed_transcripts.py
│   ├── generate_response.py
│   └── retrieve_context.py
├── data/
│   ├── subtitles_vtt/
│   ├── transcripts_txt/
│   ├── file_paths.pkl
│   ├── transcript_index.faiss
│   └── transcripts.pkl
├── frontend/
│   ├── assets/images/
│   └── index.html
├── outputs/
│   ├── generated_response.txt
│   └── retrieved_transcripts.txt
├── utils/
│   ├── download_vtt.py
│   ├── preprocess.py
│   ├── token.py
│   └── vtt_to_txt.py
├── app.py
├── config.py
├── main.py
├── Dockerfile
├── pyproject.toml
├── requirements.txt
└── uv.lock
```
## Tech Stack
- Python 3.11+ (per project metadata), FastAPI, Uvicorn
- FAISS (`faiss-cpu`) for vector search
- Sentence Transformers (`all-MiniLM-L6-v2`) for embeddings
- Groq API for response generation (`llama-3.1-8b-instant`)
- Static HTML/CSS/JS frontend
## Prerequisites
- Python 3.11 or later
- `pip` or `uv`
- `yt-dlp` (required only when running subtitle download stage)
- A valid `GROQ_API_KEY`
## Configuration
Environment variables read by the app:
- `GROQ_API_KEY`: required for answer generation
- `GITHUB_TOKEN`: optional; present in config but not required for runtime flow
- `HF_API_TOKEN`: optional; present in config but not required for runtime flow
Important runtime paths are defined in `config.py`, including:
- `data/file_paths.pkl`
- `data/transcripts.pkl`
- `data/transcript_index.faiss`
- `outputs/generated_response.txt`
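The shape of this configuration can be sketched as it might appear in `config.py` (the variable and path names follow the lists above, but the exact file contents and the `require_groq_key` helper are assumptions):

```python
import os
from pathlib import Path

# Required at runtime: without it, /ask cannot call the Groq API.
GROQ_API_KEY = os.getenv("GROQ_API_KEY")

# Optional tokens: present in config but not required by the runtime flow.
GITHUB_TOKEN = os.getenv("GITHUB_TOKEN")
HF_API_TOKEN = os.getenv("HF_API_TOKEN")

DATA_DIR = Path("data")
FILE_PATHS_PKL = DATA_DIR / "file_paths.pkl"
TRANSCRIPTS_PKL = DATA_DIR / "transcripts.pkl"
TRANSCRIPT_INDEX = DATA_DIR / "transcript_index.faiss"
GENERATED_RESPONSE_TXT = Path("outputs") / "generated_response.txt"

def require_groq_key() -> str:
    """Fail fast with a clear message instead of erroring mid-request."""
    if not GROQ_API_KEY:
        raise RuntimeError("GROQ_API_KEY is not set; export it before starting the app.")
    return GROQ_API_KEY
```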
## Quick Start
### 1. Install dependencies
Using `uv`:
```bash
uv sync
```
Using `pip`:
```bash
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
```
### 2. Set environment variable
```bash
export GROQ_API_KEY="your_groq_api_key"
```
### 3. Start API + frontend
```bash
uvicorn app:app --host 0.0.0.0 --port 7860 --reload
```
Open `http://localhost:7860`.
## Run with Docker
Build:
```bash
docker build -t rag-qa-assistant .
```
Run:
```bash
docker run --rm -p 7860:7860 -e GROQ_API_KEY="your_groq_api_key" rag-qa-assistant
```
## API Reference
### `POST /ask`
Request body:
```json
{
  "query": "How do I deal with fear?"
}
```
Success response (`200`):
```json
{
  "answer": "..."
}
```
Error responses:
- `400`: missing or empty `query`
- `404`: no relevant transcripts retrieved
- `500`: internal error
Example:
```bash
curl -X POST "http://localhost:7860/ask" \
  -H "Content-Type: application/json" \
  -d '{"query": "What is desire?"}'
```
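The status codes above can be sketched as plain validation logic, independent of the FastAPI wiring (the `answer_query` helper and the retrieval/generation stand-ins are hypothetical, not the repository's actual functions):

```python
def answer_query(query, retrieve_fn, generate_fn):
    """Map a raw query to (status_code, payload), mirroring the /ask contract."""
    if not query or not query.strip():
        return 400, {"detail": "Query must be a non-empty string."}
    try:
        context = retrieve_fn(query)
    except Exception:
        return 500, {"detail": "Internal error during retrieval."}
    if not context:
        return 404, {"detail": "No relevant transcripts found."}
    try:
        return 200, {"answer": generate_fn(query, context)}
    except Exception:
        return 500, {"detail": "Internal error during generation."}

# Example wiring with toy stand-ins for retrieval and generation.
status, body = answer_query(
    "What is desire?",
    retrieve_fn=lambda q: ["desire drives action"],
    generate_fn=lambda q, ctx: "Desire is...",
)
```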
## Data Pipeline
`main.py` includes stages for data preparation and querying.
Pipeline stages:
1. Download subtitles from configured channels (`utils/download_vtt.py`)
2. Convert `.vtt` to cleaned `.txt` (`utils/vtt_to_txt.py`, `utils/preprocess.py`)
3. Load and persist transcript corpus (`data/*.pkl`)
4. Create FAISS index (`api/embed_transcripts.py`)
5. Retrieve context + generate response
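Stages 3–4 can be sketched with stdlib pickling and a plain in-memory list of vectors standing in for FAISS (the tiny corpus, the temp directory, and the `fake_embed` helper are illustrative; the real `api/embed_transcripts.py` embeds with `all-MiniLM-L6-v2` and writes a FAISS index):

```python
import pickle
import random
import tempfile
from pathlib import Path

def fake_embed(text: str, dim: int = 8) -> list[float]:
    """Toy embedding seeded from the text; the real pipeline uses all-MiniLM-L6-v2."""
    rng = random.Random(sum(map(ord, text)))
    return [rng.gauss(0, 1) for _ in range(dim)]

data_dir = Path(tempfile.mkdtemp())
transcripts = {"video1.txt": "fear fades with practice",
               "video2.txt": "desire drives action"}

# Stage 3: persist the corpus the way data/*.pkl is described above.
(data_dir / "transcripts.pkl").write_bytes(pickle.dumps(transcripts))
(data_dir / "file_paths.pkl").write_bytes(pickle.dumps(list(transcripts)))

# Stage 4: embed every transcript into an "index" (here just a list of vectors).
index = [fake_embed(text) for text in transcripts.values()]

# Query time reloads the pickled corpus, as app.py does at startup.
reloaded = pickle.loads((data_dir / "transcripts.pkl").read_bytes())
```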
Current state of `main.py`:
- Download/preprocess/embed stages are present but commented out in `main()`.
- Default execution expects prebuilt artifacts in `data/`.
Run CLI query flow:
```bash
python main.py
```
## Deployment
This repository is configured for Hugging Face Spaces (Docker SDK):
- README front matter defines Space metadata.
- `.github/workflows/main.yml` syncs `main` branch to HF Space.
- `.github/workflows/space-keepalive.yml` pings the deployed Space every 12 hours.
## Operational Notes
- Data artifacts are currently committed to the repository (`data/*.pkl`, `.faiss`).
- CORS in `app.py` is permissive (`allow_origins=["*"]`) and suitable for dev/demo, not strict production hardening.
- `frontend/index.html` references `assets/images/hero-background.jpg`, but this file is not present in `frontend/assets/images/`.
- `api/embed_transcripts.py` currently treats `transcript_index` as a directory path (`mkdir`) though it is configured as a file path; this affects index regeneration workflows.
## Troubleshooting
- `Error: AI client not configured.`
  - Ensure `GROQ_API_KEY` is set in the shell/container before startup.
- `No relevant transcripts found` (`404` from `/ask`)
  - Check that `data/transcript_index.faiss`, `data/file_paths.pkl`, and `data/transcripts.pkl` exist and are compatible.
- API starts but UI looks incomplete
  - Verify static assets under `frontend/assets/images/`.
- Subtitle download stage fails
  - Install `yt-dlp` and verify network access and YouTube rate limits.