---
title: RAG Chatbot
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---

# Physical AI RAG Backend

FastAPI backend for the Physical AI textbook RAG chatbot.

## Features

- **RAG Pipeline**: Retrieval-Augmented Generation using Cohere API
- **Vector Search**: Qdrant for semantic search
- **Conversation Storage**: Neon Postgres for chat history
- **Text Selection Context**: Support for querying with selected text

## Tech Stack

- FastAPI (Python 3.11+)
- Cohere API (embeddings + generation)
- Qdrant Cloud (vector database)
- Neon Serverless Postgres (conversation storage)

## Setup

### 1. Install Dependencies

```bash
cd backend
pip install -r requirements.txt
```

### 2. Configure Environment

Copy `.env.example` to `.env` and fill in your credentials:

```bash
cp .env.example .env
```

Required environment variables:
- `COHERE_API_KEY`: Your Cohere API key
- `QDRANT_URL`: Qdrant cluster URL
- `QDRANT_API_KEY`: Qdrant API key
- `NEON_DATABASE_URL`: Neon Postgres connection string
- `FRONTEND_URL`: Frontend URL for CORS
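
A missing credential tends to surface only when the first request fails, so it can help to check the environment before starting the server. The following is a minimal sketch, not part of the project: the variable names come from the list above, while `REQUIRED_VARS` and `missing_env_vars` are hypothetical names introduced for illustration.

```python
import os

# Names taken from the list above; this helper is hypothetical and is
# not part of the backend code.
REQUIRED_VARS = (
    "COHERE_API_KEY",
    "QDRANT_URL",
    "QDRANT_API_KEY",
    "NEON_DATABASE_URL",
    "FRONTEND_URL",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` with no arguments inspects the current process environment; an empty list means every required variable is set.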

### 3. Set Up Database

Run the schema on your Neon database:

```bash
psql "$NEON_DATABASE_URL" < app/db/schema.sql
```

### 4. Ingest Content

Parse MDX files and upload to Qdrant:

```bash
python scripts/ingest_content.py
```

This will:
- Parse all 11 chapters from `docs/chapters/`
- Create ~80-100 semantic chunks
- Generate embeddings via Cohere
- Upload to Qdrant
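
The chunking step can be pictured with a short sketch. How `scripts/ingest_content.py` actually splits the chapters is not shown here, so this assumes one simple strategy: split at Markdown headings, then cap each piece by word count. The function name `chunk_by_heading` and the `max_words` parameter are hypothetical.

```python
import re

# Zero-width match at the start of any line that begins with 1-6 '#' marks.
HEADING_START = re.compile(r"^(?=#{1,6}\s)", re.MULTILINE)


def chunk_by_heading(text, max_words=200):
    """Split Markdown/MDX text at headings, then cap each piece at max_words.

    Assumed strategy for illustration only; the real ingestion script
    may chunk differently.
    """
    chunks = []
    for section in HEADING_START.split(text):
        words = section.split()
        # Empty sections produce no chunks because the range is empty.
        for start in range(0, len(words), max_words):
            chunks.append(" ".join(words[start:start + max_words]))
    return chunks
```

Each returned string would then be embedded and uploaded together with whatever metadata the collection expects, such as the chapter number.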

### 5. Run Server

```bash
uvicorn app.main:app --reload --port 8000
```

The API will be available at `http://localhost:8000`.

## API Endpoints

### Chat

**POST /api/chat/query**
```json
{
  "query": "What is Physical AI?",
  "conversation_id": "uuid-optional",
  "filters": { "chapter": 1 }
}
```

**POST /api/chat/query-with-context**
```json
{
  "query": "Explain this",
  "selected_text": "Physical AI systems...",
  "selection_metadata": {
    "chapter_title": "Introduction",
    "url": "/docs/chapters/physical-ai-intro"
  }
}
```
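
For calling this endpoint from Python, a request can be assembled with the standard library alone. This is a sketch under assumptions: the path and field names are copied from the JSON body above, the helper `build_context_request` is hypothetical, and the response format is not documented here, so the sketch stops short of parsing it.

```python
import json
import urllib.request


def build_context_request(base_url, query, selected_text, chapter_title, url):
    """Build (but do not send) a POST request for /api/chat/query-with-context."""
    payload = {
        "query": query,
        "selected_text": selected_text,
        "selection_metadata": {"chapter_title": chapter_title, "url": url},
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/chat/query-with-context",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running locally, passing the returned object to `urllib.request.urlopen` sends the request.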

**POST /api/chat/conversations**
Create a new conversation.

**GET /api/chat/conversations/{id}**
Get conversation with all messages.

### Health

**GET /api/health**
Basic health check.

**GET /api/health/detailed**
Detailed health check with database status.

## Deployment

### Railway (Recommended)

1. Create Railway project
2. Connect GitHub repo
3. Set environment variables
4. Deploy command: `uvicorn app.main:app --host 0.0.0.0 --port $PORT`

### Render

1. Create new Web Service
2. Connect GitHub repo
3. Build command: `pip install -r requirements.txt`
4. Start command: `uvicorn app.main:app --host 0.0.0.0 --port $PORT`

## Project Structure

```
backend/
├── app/
│   ├── main.py              # FastAPI app
│   ├── config.py            # Settings
│   ├── models/
│   │   ├── chat.py         # Chat models
│   │   └── document.py     # Document models
│   ├── services/
│   │   ├── embeddings.py   # Cohere embeddings
│   │   ├── generation.py   # Cohere generation
│   │   ├── retrieval.py    # Qdrant search
│   │   └── rag_pipeline.py # Main RAG logic
│   ├── db/
│   │   ├── postgres.py     # Neon client
│   │   ├── qdrant.py       # Qdrant client
│   │   └── schema.sql      # Database schema
│   └── api/
│       └── routes/
│           ├── chat.py     # Chat endpoints
│           └── health.py   # Health endpoints
├── scripts/
│   └── ingest_content.py   # Content ingestion
└── requirements.txt
```

## Development

Run with auto-reload:
```bash
uvicorn app.main:app --reload
```

View API docs:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc

## Cost Estimate

- Cohere: ~$5-10/month (moderate usage)
- Qdrant Cloud: Free (1GB tier)
- Neon Postgres: Free tier
- Railway: Free (500 hours/month)

**Total: ~$5-10/month**