---
title: RAG Chatbot
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---
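The metadata above configures a Docker Space listening on port 7860. A minimal Dockerfile for that setup might look like the following sketch (paths and base image are assumptions based on the project structure below, not the repository's actual Dockerfile):

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Hugging Face Spaces routes traffic to app_port (7860)
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "7860"]
```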

Physical AI RAG Backend

FastAPI backend for the Physical AI textbook RAG chatbot.

Features

  • RAG Pipeline: Retrieval-Augmented Generation using Cohere API
  • Vector Search: Qdrant for semantic search
  • Conversation Storage: Neon Postgres for chat history
  • Text Selection Context: Support for querying with selected text

Tech Stack

  • FastAPI (Python 3.11+)
  • Cohere API (embeddings + generation)
  • Qdrant Cloud (vector database)
  • Neon Serverless Postgres (conversation storage)

Setup

1. Install Dependencies

cd backend
pip install -r requirements.txt

2. Configure Environment

Copy .env.example to .env and fill in your credentials:

cp .env.example .env

Required environment variables:

  • COHERE_API_KEY: Your Cohere API key
  • QDRANT_URL: Qdrant cluster URL
  • QDRANT_API_KEY: Qdrant API key
  • NEON_DATABASE_URL: Neon Postgres connection string
  • FRONTEND_URL: Frontend URL for CORS
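A filled-in .env might look like this (all values are placeholders; substitute your own credentials):

```bash
COHERE_API_KEY=your-cohere-api-key
QDRANT_URL=https://your-cluster.qdrant.io
QDRANT_API_KEY=your-qdrant-api-key
NEON_DATABASE_URL=postgresql://user:password@your-project.neon.tech/dbname
FRONTEND_URL=http://localhost:3000
```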

3. Set Up the Database

Run the schema on your Neon database:

psql $NEON_DATABASE_URL < app/db/schema.sql
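The contents of schema.sql are not reproduced here, but for a chat-history store it plausibly defines tables along these lines (hypothetical table and column names, shown only to illustrate the shape of the data, not the actual schema):

```sql
CREATE TABLE IF NOT EXISTS conversations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE TABLE IF NOT EXISTS messages (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    conversation_id UUID NOT NULL REFERENCES conversations(id),
    role TEXT NOT NULL,          -- 'user' or 'assistant'
    content TEXT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```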

4. Ingest Content

Parse MDX files and upload to Qdrant:

python scripts/ingest_content.py

This will:

  • Parse all 11 chapters from docs/chapters/
  • Create ~80-100 semantic chunks
  • Generate embeddings via Cohere
  • Upload to Qdrant
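The chunking step above might look roughly like this sketch: a paragraph-based splitter with a size cap. This is a hypothetical illustration; the real script's splitting logic and parameters (and its Cohere/Qdrant upload code) are not shown here.

```python
def chunk_markdown(text: str, max_chars: int = 1500) -> list[str]:
    """Split MDX/Markdown text into chunks of at most max_chars,
    breaking on blank lines so paragraphs stay intact."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # Start a new chunk if adding this paragraph would overflow
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```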

5. Run Server

uvicorn app.main:app --reload --port 8000

The API will be available at http://localhost:8000.

API Endpoints

Chat

POST /api/chat/query

{
  "query": "What is Physical AI?",
  "conversation_id": "uuid-optional",
  "filters": { "chapter": 1 }
}

POST /api/chat/query-with-context

{
  "query": "Explain this",
  "selected_text": "Physical AI systems...",
  "selection_metadata": {
    "chapter_title": "Introduction",
    "url": "/docs/chapters/physical-ai-intro"
  }
}

POST /api/chat/conversations Create a new conversation.

GET /api/chat/conversations/{id} Get a conversation with all of its messages.

Health

GET /api/health Basic health check.

GET /api/health/detailed Detailed health check with database status.
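The detailed check might aggregate per-dependency probes along these lines (a hypothetical sketch; the probe functions and status keys are assumptions, not the actual route code):

```python
from typing import Callable

def detailed_health(checks: dict[str, Callable[[], None]]) -> dict:
    """Run each dependency probe; a probe that raises marks that
    dependency (and the overall status) as unhealthy."""
    results = {}
    for name, probe in checks.items():
        try:
            probe()
            results[name] = "ok"
        except Exception as exc:
            results[name] = f"error: {exc}"
    status = "ok" if all(v == "ok" for v in results.values()) else "degraded"
    return {"status": status, "dependencies": results}
```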

Deployment

Railway (Recommended)

  1. Create Railway project
  2. Connect GitHub repo
  3. Set environment variables
  4. Deploy command: uvicorn app.main:app --host 0.0.0.0 --port $PORT

Render

  1. Create new Web Service
  2. Connect GitHub repo
  3. Build command: pip install -r requirements.txt
  4. Start command: uvicorn app.main:app --host 0.0.0.0 --port $PORT

Project Structure

backend/
β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ main.py              # FastAPI app
β”‚   β”œβ”€β”€ config.py            # Settings
β”‚   β”œβ”€β”€ models/
β”‚   β”‚   β”œβ”€β”€ chat.py         # Chat models
β”‚   β”‚   └── document.py     # Document models
β”‚   β”œβ”€β”€ services/
β”‚   β”‚   β”œβ”€β”€ embeddings.py   # Cohere embeddings
β”‚   β”‚   β”œβ”€β”€ generation.py   # Cohere generation
β”‚   β”‚   β”œβ”€β”€ retrieval.py    # Qdrant search
β”‚   β”‚   └── rag_pipeline.py # Main RAG logic
β”‚   β”œβ”€β”€ db/
β”‚   β”‚   β”œβ”€β”€ postgres.py     # Neon client
β”‚   β”‚   β”œβ”€β”€ qdrant.py       # Qdrant client
β”‚   β”‚   └── schema.sql      # Database schema
β”‚   └── api/
β”‚       └── routes/
β”‚           β”œβ”€β”€ chat.py     # Chat endpoints
β”‚           └── health.py   # Health endpoints
β”œβ”€β”€ scripts/
β”‚   └── ingest_content.py   # Content ingestion
└── requirements.txt

Development

Run with auto-reload:

uvicorn app.main:app --reload

View the interactive API docs at http://localhost:8000/docs (Swagger UI) or http://localhost:8000/redoc (ReDoc).

Cost Estimate

  • Cohere: ~$5-10/month (moderate usage)
  • Qdrant Cloud: Free (1GB tier)
  • Neon Postgres: Free tier
  • Railway: Free (500 hours/month)

Total: ~$5-10/month