title: RAG Chatbot
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
Physical AI RAG Backend
FastAPI backend for the Physical AI textbook RAG chatbot.
Features
- RAG Pipeline: Retrieval-Augmented Generation using Cohere API
- Vector Search: Qdrant for semantic search
- Conversation Storage: Neon Postgres for chat history
- Text Selection Context: Support for querying with selected text
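The retrieve-then-generate flow behind these features can be sketched in plain Python. This is a minimal illustration, not the code in app/services/: the function names are hypothetical, the vectors stand in for Cohere embeddings, the in-memory list stands in for a Qdrant collection, and the prompt wording is an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector, indexed_chunks, top_k=3):
    """Return the top_k chunk texts ranked by similarity to the query vector.

    indexed_chunks is a list of {"text": str, "vector": list[float]} dicts,
    a stand-in for what a vector database would store.
    """
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:top_k]]


def build_prompt(question, passages, selected_text=None):
    """Assemble a grounded prompt; selected_text is the optional user selection."""
    parts = ["Answer using only the context below."]
    if selected_text:
        parts.append("Selected text:\n" + selected_text)
    parts.append("Context:\n" + "\n---\n".join(passages))
    parts.append("Question: " + question)
    return "\n\n".join(parts)
```

In the real service the query vector would come from the embedding model and the prompt would be sent to the generation model; both calls are left out here because their exact usage is not shown in this README.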
Tech Stack
- FastAPI (Python 3.11+)
- Cohere API (embeddings + generation)
- Qdrant Cloud (vector database)
- Neon Serverless Postgres (conversation storage)
Setup
1. Install Dependencies
cd backend
pip install -r requirements.txt
2. Configure Environment
Copy .env.example to .env and fill in your credentials:
cp .env.example .env
Required environment variables:
- COHERE_API_KEY: Your Cohere API key
- QDRANT_URL: Qdrant cluster URL
- QDRANT_API_KEY: Qdrant API key
- NEON_DATABASE_URL: Neon Postgres connection string
- FRONTEND_URL: Frontend URL for CORS
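A startup check along these lines can catch a missing credential before the first request fails. It is a sketch only: the variable names come from the list above, while `missing_env_vars` and `require_env` are hypothetical helpers, not functions from app/config.py.

```python
import os

# Variable names taken from the list of required environment variables.
REQUIRED_VARS = (
    "COHERE_API_KEY",
    "QDRANT_URL",
    "QDRANT_API_KEY",
    "NEON_DATABASE_URL",
    "FRONTEND_URL",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def require_env(env=None):
    """Raise a readable error if any required variable is missing."""
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` once at import time of the settings module would be one place to use it.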
3. Setup Database
Run the schema on your Neon database:
psql "$NEON_DATABASE_URL" < app/db/schema.sql
4. Ingest Content
Parse MDX files and upload to Qdrant:
python scripts/ingest_content.py
This will:
- Parse all 11 chapters from docs/chapters/
- Create ~80-100 semantic chunks
- Generate embeddings via Cohere
- Upload to Qdrant
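The chunking step can be pictured with a small paragraph-grouping function. How scripts/ingest_content.py actually splits chapters is not described here, so treat this as an assumed strategy: `chunk_paragraphs` and the 1200-character default are illustrative, and a real semantic chunker may split on headings instead.

```python
def chunk_paragraphs(text, max_chars=1200):
    """Group consecutive paragraphs into chunks of at most max_chars characters.

    Paragraphs are separated by blank lines. A single paragraph longer than
    max_chars is kept whole as its own chunk rather than cut mid-sentence.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = para if not current else current + "\n\n" + para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow, so close the current chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be embedded and uploaded together with its chapter metadata.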
5. Run Server
uvicorn app.main:app --reload --port 8000
API will be available at http://localhost:8000
API Endpoints
Chat
POST /api/chat/query
{
  "query": "What is Physical AI?",
  "conversation_id": "uuid-optional",
  "filters": { "chapter": 1 }
}
POST /api/chat/query-with-context
{
  "query": "Explain this",
  "selected_text": "Physical AI systems...",
  "selection_metadata": {
    "chapter_title": "Introduction",
    "url": "/docs/chapters/physical-ai-intro"
  }
}
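A client for the two query endpoints above might build its requests as follows, using only the standard library. `build_chat_request` is a hypothetical helper, the base URL is the local development address from the Run Server step, and only the fields shown in the example bodies are sent; whether the server accepts other combinations is not stated here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local dev address; change for a deployed backend


def build_chat_request(query, selected_text=None, selection_metadata=None,
                       conversation_id=None, filters=None):
    """Build (but do not send) a POST request for one of the two query endpoints.

    Passing selected_text targets /api/chat/query-with-context;
    otherwise the request targets /api/chat/query.
    """
    payload = {"query": query}
    if selected_text is not None:
        path = "/api/chat/query-with-context"
        payload["selected_text"] = selected_text
        if selection_metadata:
            payload["selection_metadata"] = selection_metadata
    else:
        path = "/api/chat/query"
        if conversation_id:
            payload["conversation_id"] = conversation_id
        if filters:
            payload["filters"] = filters
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, `urllib.request.urlopen(build_chat_request("What is Physical AI?"))` would send the request. The response schema is not documented in this README, so parsing it is left out.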
POST /api/chat/conversations
Create a new conversation.
GET /api/chat/conversations/{id}
Get a conversation with all its messages.
Health
GET /api/health
Basic health check.
GET /api/health/detailed
Detailed health check with database status.
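One way a detailed health check can aggregate dependency status is sketched below. The response shape, the "degraded" label, and the `detailed_health` name are assumptions for illustration; the actual handler in app/api/routes/health.py may report different fields.

```python
def detailed_health(checks):
    """Run each named check and summarize the results.

    checks maps a dependency name to a zero-argument callable that returns
    normally when the dependency is reachable and raises otherwise.
    """
    components = {}
    for name, check in checks.items():
        try:
            check()
            components[name] = {"status": "ok"}
        except Exception as exc:
            # Report the failure instead of letting it crash the endpoint.
            components[name] = {"status": "error", "detail": str(exc)}
    healthy = all(c["status"] == "ok" for c in components.values())
    return {"status": "ok" if healthy else "degraded", "components": components}
```

The callables would wrap a lightweight probe for each dependency, for example a trivial query against Postgres and a collection lookup against Qdrant.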
Deployment
Railway (Recommended)
- Create Railway project
- Connect GitHub repo
- Set environment variables
- Deploy command:
uvicorn app.main:app --host 0.0.0.0 --port $PORT
Render
- Create new Web Service
- Connect GitHub repo
- Build command:
pip install -r requirements.txt
- Start command:
uvicorn app.main:app --host 0.0.0.0 --port $PORT
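Both start commands read the port from the PORT environment variable. If the server is launched from Python instead of the shell, the same lookup could be sketched like this; `resolve_port` is a hypothetical helper and the 8000 fallback mirrors the local Run Server step.

```python
import os


def resolve_port(env=None, default=8000):
    """Return the port from the PORT variable, or default if it is unset or blank."""
    env = os.environ if env is None else env
    raw = env.get("PORT", "").strip()
    if not raw:
        return default
    port = int(raw)  # raises ValueError on non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"PORT out of range: {port}")
    return port
```

The result can be passed as the `port` argument of `uvicorn.run`, with `host="0.0.0.0"` as in the commands above.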
Project Structure
backend/
├── app/
│   ├── main.py              # FastAPI app
│   ├── config.py            # Settings
│   ├── models/
│   │   ├── chat.py          # Chat models
│   │   └── document.py      # Document models
│   ├── services/
│   │   ├── embeddings.py    # Cohere embeddings
│   │   ├── generation.py    # Cohere generation
│   │   ├── retrieval.py     # Qdrant search
│   │   └── rag_pipeline.py  # Main RAG logic
│   ├── db/
│   │   ├── postgres.py      # Neon client
│   │   ├── qdrant.py        # Qdrant client
│   │   └── schema.sql       # Database schema
│   └── api/
│       └── routes/
│           ├── chat.py      # Chat endpoints
│           └── health.py    # Health endpoints
├── scripts/
│   └── ingest_content.py    # Content ingestion
└── requirements.txt
Development
Run with auto-reload:
uvicorn app.main:app --reload
View API docs:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Cost Estimate
- Cohere: ~$5-10/month (moderate usage)
- Qdrant Cloud: Free (1GB tier)
- Neon Postgres: Free tier
- Railway: Free (500 hours/month)
Total: ~$5-10/month