pmmdot committed on
Commit
334c19f
·
1 Parent(s): 0a2b63a

certifi and chunker v2

Files changed (5)
  1. README.md +328 -324
  2. pyproject.toml +1 -0
  3. src/database.py +5 -2
  4. src/rag.py +31 -0
  5. uv.lock +2 -0
README.md CHANGED
@@ -15,6 +15,8 @@ Production-grade retrieval-augmented generation service for document-based quest
15
  **Base URL**: `https://pmmdot-askbookie.hf.space`
16
  **Interactive Documentation**: `/docs` (Swagger UI) | `/redoc` (ReDoc)
17
 
 
 
18
  ## Table of Contents
19
 
20
  1. [Authentication](#authentication)
@@ -24,18 +26,12 @@ Production-grade retrieval-augmented generation service for document-based quest
24
  - [POST /upload](#post-upload)
25
  - [GET /jobs/{job_id}](#get-jobsjob_id)
26
  - [GET /jobs](#get-jobs)
27
- 4. [System Endpoints](#system-endpoints)
28
- - [GET /health](#get-health)
29
- - [GET /](#get-)
30
- 5. [Admin Endpoints](#admin-endpoints)
31
- - [GET /history](#get-history)
32
- - [GET /admin/keys](#get-adminkeys)
33
- - [POST /admin/keys/{key_id}/enable](#post-adminkeyskeyidenable)
34
- - [POST /admin/keys/{key_id}/disable](#post-adminkeyskeyiddisable)
35
- - [GET /admin/models/current](#get-adminmodelscurrent)
36
- - [POST /admin/models/switch](#post-adminmodelsswitch)
37
- 6. [Error Handling](#error-handling)
38
 
 
39
 
40
  ## Technical Stack
41
 
@@ -59,9 +55,11 @@ Production-grade retrieval-augmented generation service for document-based quest
59
  | 4 | GPT-4o-mini | DuckDuckGo (Free) |
60
  | 5 | Claude-3-Haiku | DuckDuckGo (Free) |
61
 
 
 
62
  ## Authentication
63
 
64
- All endpoints except `/health` and `/` require HMAC-SHA256 request signing. Authentication operates on a rotating key infrastructure with 90-day expiration cycles.
65
 
66
  ### Required Headers
67
 
@@ -78,39 +76,20 @@ The signature message follows the format:
78
  {timestamp}\n{HTTP_METHOD}\n{path}
79
  ```
80
 
81
- **Python Implementation**:
82
- ```python
83
- import hmac
84
- import hashlib
85
- import time
86
-
87
- def generate_auth_headers(method: str, path: str, key_id: str, secret: str) -> dict:
88
- timestamp = str(int(time.time()))
89
- message = f"{timestamp}\n{method.upper()}\n{path}"
90
- signature = hmac.new(
91
- secret.encode(),
92
- message.encode(),
93
- hashlib.sha256
94
- ).hexdigest()
95
-
96
- return {
97
- "X-API-Key-Id": key_id,
98
- "X-API-Timestamp": timestamp,
99
- "X-API-Signature": signature
100
- }
101
- ```
102
-
103
  **JavaScript Implementation**:
104
  ```javascript
105
- const crypto = require('crypto');
106
-
107
- function generateAuthHeaders(method, path, keyId, secret) {
108
  const timestamp = Math.floor(Date.now() / 1000).toString();
109
  const message = `${timestamp}\n${method.toUpperCase()}\n${path}`;
110
- const signature = crypto
111
- .createHmac('sha256', secret)
112
- .update(message)
113
- .digest('hex');
 
 
 
 
 
114
 
115
  return {
116
  'X-API-Key-Id': keyId,
@@ -122,419 +101,444 @@ function generateAuthHeaders(method, path, keyId, secret) {
122
 
123
  ### Security Constraints
124
 
125
- - Timestamp tolerance: 300 seconds (5 minutes)
126
- - Failed authentication lockout: 5 attempts per IP (5-minute window)
127
- - Constant-time signature comparison to prevent timing attacks
128
-
129
 
 
130
 
131
  ## Rate Limits
132
 
133
- Rate limiting operates on a sliding window of 60 seconds per API key.
134
-
135
  | Endpoint | Limit | Window |
136
  |----------|-------|--------|
137
  | `/ask` | 30 requests | 60 seconds |
138
  | `/upload` | 2 requests | 60 seconds |
139
  | All other endpoints | 50 requests | 60 seconds |
140
- | Failed auth attempts | 5 per IP | 5 minutes (lockout) |
141
 
142
- When rate limited, responses include the `Retry-After` header:
143
- ```http
144
- HTTP/1.1 429 Too Many Requests
145
- Retry-After: 60
146
- Content-Type: application/json
147
 
148
- {"detail": "Rate limit exceeded"}
149
- ```
150
  ## Core Endpoints
151
 
152
  ### POST /ask
153
 
154
- Query the pre-vectorised document corpus. The system performs semantic retrieval against the specified subject partition, then synthesises a response using the active language model.
155
 
156
- **Request**:
157
  ```http
158
  POST /ask HTTP/1.1
159
  Content-Type: application/json
160
- X-API-Key-Id: your-key-id
161
- X-API-Timestamp: 1705234567
162
- X-API-Signature: a1b2c3d4...
163
 
164
  {
165
  "query": "What are the different types of ecosystems?",
166
  "subject": "evs",
 
167
  "context_limit": 5
168
  }
169
  ```
170
 
171
- | Field | Type | Required | Constraints | Description |
172
- |-------|------|----------|-------------|-------------|
173
- | `query` | string | Yes | 1-1000 chars | Natural language question |
174
- | `subject` | string | Yes | 1-100 chars, alphanumeric with `_-` | Document collection identifier |
175
- | `context_limit` | integer | No | 1-20, default 5 | Number of context chunks for retrieval |
176
 
177
- **Response** (200 OK):
178
  ```json
179
  {
180
- "answer": "Ecosystems are classified into two primary categories: terrestrial and aquatic. Terrestrial ecosystems include forests, grasslands, deserts, and tundra. Aquatic ecosystems are subdivided into freshwater (lakes, rivers, wetlands) and marine (oceans, coral reefs, estuaries).",
181
  "sources": [
182
- {
183
- "page": 12,
184
- "content": "Ecosystems can be broadly categorized into terrestrial and aquatic types...",
185
- "filename": "evs_chapter3.pdf"
186
- },
187
- {
188
- "page": 15,
189
- "content": "Marine ecosystems cover approximately 71% of Earth's surface...",
190
- "filename": "evs_chapter3.pdf"
191
- }
192
  ],
 
193
  "request_id": "a1b2c3d4e5f6g7h8"
194
  }
195
  ```
 
 
 
 
 
 
 
 
196
  ---
 
197
  ### POST /upload
198
 
199
- Ingest a PDF document into the vector index. Documents are validated, chunked by page boundaries, embedded using the HuggingFace `gte-modernbert-base` model, and stored in the specified subject partition. Processing occurs asynchronously.
 
 
200
 
201
- **Request**:
202
  ```http
203
  POST /upload HTTP/1.1
204
  Content-Type: multipart/form-data
205
- X-API-Key-Id: your-key-id
206
- X-API-Timestamp: 1705234567
207
- X-API-Signature: a1b2c3d4...
208
 
209
  file: [binary PDF data]
210
- subject: physics
211
  ```
212
 
213
- | Field | Type | Required | Constraints | Description |
214
- |-------|------|----------|-------------|-------------|
215
- | `file` | binary | Yes | PDF, max 10MB, must start with `%PDF` magic bytes | Document to ingest |
216
- | `subject` | string | Yes | 1-100 chars, alphanumeric with `_-` | Target collection identifier |
 
217
 
218
- **Response** (200 OK):
219
  ```json
220
  {
221
  "job_id": "a1b2c3d4e5f6g7h8i9j0k1l2",
222
  "status": "queued",
223
- "filename": "thermodynamics_notes.pdf",
224
- "subject": "physics",
225
- "size": 2457600
226
  }
227
  ```
228
 
229
- **Validation Pipeline**:
230
- 1. MIME type verification (`application/pdf`)
231
- 2. Magic byte validation (must start with `%PDF`)
232
- 3. Size constraint (max 10MB)
233
- 4. Concurrent upload limit (max 3 per key)
234
  ---
235
 
236
  ### GET /jobs/{job_id}
237
 
238
- Retrieve the status of a PDF processing job. Jobs transition through the following states: `queued` → `processing` → `done` | `failed`.
239
 
240
- **Request**:
241
- ```http
242
- GET /jobs/a1b2c3d4e5f6g7h8i9j0k1l2 HTTP/1.1
243
- X-API-Key-Id: your-key-id
244
- X-API-Timestamp: 1705234567
245
- X-API-Signature: a1b2c3d4...
246
- ```
247
-
248
- **Response** (200 OK):
249
  ```json
250
  {
251
  "job_id": "a1b2c3d4e5f6g7h8i9j0k1l2",
252
- "key_id": "your-key-id",
253
- "filename": "thermodynamics_notes.pdf",
254
- "subject": "physics",
255
- "size": 2457600,
256
  "status": "done",
257
- "error": null,
258
- "created_at": 1705234567.0,
259
- "updated_at": 1705234890.0
260
  }
261
  ```
262
 
263
- | Status | Description |
264
- |--------|-------------|
265
- | `queued` | Job accepted, awaiting processing |
266
- | `processing` | Document being chunked and embedded |
267
- | `done` | Successfully ingested into vector store |
268
- | `failed` | Processing error (check `error` field) |
269
  ---
270
 
271
  ### GET /jobs
272
 
273
- List all jobs associated with the authenticated API key. Results are ordered by creation time (descending).
274
 
275
- **Request**:
276
- ```http
277
- GET /jobs HTTP/1.1
278
- X-API-Key-Id: your-key-id
279
- X-API-Timestamp: 1705234567
280
- X-API-Signature: a1b2c3d4...
281
  ```
282
 
283
- **Response** (200 OK):
284
- ```json
285
- {
286
- "jobs": [
287
- {
288
- "job_id": "a1b2c3d4e5f6g7h8i9j0k1l2",
289
- "filename": "thermodynamics_notes.pdf",
290
- "subject": "physics",
291
- "size": 2457600,
292
- "status": "done",
293
- "error": null,
294
- "created_at": 1705234567.0
295
  },
296
- {
297
- "job_id": "m1n2o3p4q5r6s7t8u9v0w1x2",
298
- "filename": "organic_chemistry.pdf",
299
- "subject": "chemistry",
300
- "size": 1843200,
301
- "status": "processing",
302
- "error": null,
303
- "created_at": 1705234500.0
304
- }
305
- ],
306
- "total": 2
307
  }
308
  ```
309
 
310
  ## System Endpoints
311
 
312
  ### GET /health
313
 
314
- Service health check endpoint. Returns operational metrics and current model configuration. No authentication required.
315
 
316
- **Response** (200 OK):
317
  ```json
318
  {
319
  "status": "healthy",
320
  "uptime_hours": 48.5,
321
- "total_api_calls": 15420,
322
- "total_questions": 12350,
323
- "active_jobs": 2,
324
- "memory_mb": 1024.5,
325
  "current_model": {
326
- "id": 1,
327
  "name": "Gemini-3-flash",
328
  "description": "Gemini Primary API Key"
329
- },
330
- "per_user": {
331
- "askbookie-pesu": {
332
- "api_calls": 8500,
333
- "questions_asked": 7200,
334
- "uploads_attempted": 45,
335
- "success_rate": 98.5,
336
- "average_latency_seconds": 1.25,
337
- "ask_fails": 12,
338
- "upload_fails": 3,
339
- "role": "user"
340
- }
341
  }
342
  }
343
  ```
344
- ---
345
- ### GET /
346
 
347
- Dashboard endpoint. Returns the service dashboard HTML if available, otherwise returns service metadata.
348
 
349
- **Response** (200 OK):
350
- ```json
351
- {
352
- "service": "AskBookie RAG API",
353
- "version": "1.0.0"
354
- }
355
- ```
356
 
 
357
 
358
  ## Admin Endpoints
359
 
360
- ### GET /history
361
-
362
- Retrieve paginated query history across all users. Useful for analytics and audit trails.
363
 
364
- **Request**:
365
- ```http
366
- GET /history?limit=50&offset=0 HTTP/1.1
367
- X-API-Key-Id: admin
368
- X-API-Timestamp: 1705234567
369
- X-API-Signature: a1b2c3d4...
370
- ```
371
 
372
- | Parameter | Type | Default | Description |
373
- |-----------|------|---------|-------------|
374
- | `limit` | integer | 100 | Number of records to return |
375
- | `offset` | integer | 0 | Pagination offset |
376
 
377
- **Response** (200 OK):
378
- ```json
379
- {
380
- "history": [
381
- {
382
- "id": 1,
383
- "key_id": "askbookie-pesu",
384
- "subject": "physics",
385
- "query": "What is the first law of thermodynamics?",
386
- "answer": "The first law of thermodynamics states that energy cannot be created or destroyed...",
387
- "sources": [...],
388
- "request_id": "a1b2c3d4e5f6g7h8",
389
- "latency_ms": 1250.5,
390
- "timestamp": 1705234567.0
391
- }
392
- ],
393
- "total": 12350,
394
- "limit": 50,
395
- "offset": 0
396
- }
397
- ```
398
- ---
399
  ### GET /admin/keys
400
 
401
- List all configured API keys with their metadata.
402
 
403
- **Response** (200 OK):
404
- ```json
405
- {
406
- "keys": [
407
- {
408
- "key_id": "askbookie-pesu",
409
- "role": "user",
410
- "active": true,
411
- "expires_at": "2025-04-14T00:00:00+00:00"
412
- },
413
- {
414
- "key_id": "admin",
415
- "role": "admin",
416
- "active": true,
417
- "expires_at": null
418
- }
419
- ]
420
- }
421
- ```
422
- ---
423
  ### POST /admin/keys/{key_id}/enable
424
 
425
- Re-enable a disabled API key.
426
 
427
- **Response** (200 OK):
428
- ```json
429
- {
430
- "status": "enabled",
431
- "key_id": "askbookie-pesu"
432
- }
433
- ```
434
- ---
435
  ### POST /admin/keys/{key_id}/disable
436
 
437
- Disable an API key. Disabled keys cannot authenticate requests. The admin key cannot be disabled.
438
 
439
- **Response** (200 OK):
440
- ```json
441
- {
442
- "status": "disabled",
443
- "key_id": "askbookie-pesu"
444
- }
445
- ```
446
- ---
447
  ### GET /admin/models/current
448
 
449
- Retrieve the currently active language model configuration.
 
 
450
 
451
- **Response** (200 OK):
452
  ```json
453
- {
454
- "id": 1,
455
- "name": "Gemini-3-flash",
456
- "description": "Gemini Primary API Key"
457
- }
458
  ```
459
- ---
460
- ### POST /admin/models/switch
461
 
462
- Switch the active language model. The system supports multiple model backends for failover and experimentation.
463
 
464
- **Request**:
465
- ```http
466
- POST /admin/models/switch HTTP/1.1
467
- Content-Type: application/json
468
- X-API-Key-Id: admin
469
- X-API-Timestamp: 1705234567
470
- X-API-Signature: a1b2c3d4...
471
 
472
- {
473
- "model_id": 2
474
- }
 
 
475
  ```
476
 
477
 
478
- **Response** (200 OK):
 
 
479
  ```json
480
  {
481
- "status": "success",
482
- "message": "Switched to model 2",
483
- "model": {
484
- "id": 2,
485
- "name": "Gemini-3-flash(Back-up)",
486
- "description": "Gemini Secondary API Key"
487
- }
488
  }
489
  ```
490
- ---
491
- ## Error Handling
492
-
493
- All errors return JSON responses with consistent structure:
494
 
 
495
  ```json
496
  {
497
- "detail": "Error description"
 
498
  }
499
  ```
500
- ### HTTP Status Codes
501
-
502
- | Code | Meaning |
503
- |------|---------|
504
- | 400 | Bad Request - Invalid parameters, malformed JSON, unsupported file type |
505
- | 401 | Unauthorized - Invalid or expired signature, missing auth headers |
506
- | 403 | Forbidden - Insufficient permissions for admin endpoints |
507
- | 404 | Not Found - Job or resource does not exist |
508
- | 413 | Payload Too Large - File exceeds 10MB or JSON exceeds 16KB |
509
- | 429 | Too Many Requests - Rate limit exceeded, auth lockout, or LLM quota exhausted |
510
- | 500 | Internal Server Error - RAG pipeline failure |
511
-
512
 
513
- ---
514
- ### cURL Examples
515
-
516
- ```bash
517
- generate_sig() {
518
- local method=$1 path=$2 secret=$3
519
- local ts=$(date +%s)
520
- local msg="${ts}"$'\n'"${method}"$'\n'"${path}"
521
- local sig=$(echo -n "$msg" | openssl dgst -sha256 -hmac "$secret" | cut -d' ' -f2)
522
- echo "-H 'X-API-Key-Id: $KEY_ID' -H 'X-API-Timestamp: $ts' -H 'X-API-Signature: $sig'"
523
  }
524
-
525
- curl -X POST https://pmmdot-askbookie.hf.space/ask \
526
- -H "Content-Type: application/json" \
527
- -H "X-API-Key-Id: your-key-id" \
528
- -H "X-API-Timestamp: $(date +%s)" \
529
- -H "X-API-Signature: <computed>" \
530
- -d '{"query": "What is thermodynamics?", "subject": "physics"}'
531
-
532
- curl -X POST https://pmmdot-askbookie.hf.space/upload \
533
- -H "X-API-Key-Id: your-key-id" \
534
- -H "X-API-Timestamp: $(date +%s)" \
535
- -H "X-API-Signature: <computed>" \
536
- -F "file=@document.pdf" \
537
- -F "subject=physics"
538
-
539
- curl https://pmmdot-askbookie.hf.space/health
540
  ```
 
15
  **Base URL**: `https://pmmdot-askbookie.hf.space`
16
  **Interactive Documentation**: `/docs` (Swagger UI) | `/redoc` (ReDoc)
17
 
18
+ ---
19
+
20
  ## Table of Contents
21
 
22
  1. [Authentication](#authentication)
 
26
  - [POST /upload](#post-upload)
27
  - [GET /jobs/{job_id}](#get-jobsjob_id)
28
  - [GET /jobs](#get-jobs)
29
+ 4. [Frontend Integration Guide](#frontend-integration-guide)
30
+ 5. [System Endpoints](#system-endpoints)
31
+ 6. [Admin Endpoints](#admin-endpoints)
32
+ 7. [Error Handling](#error-handling)
 
 
 
 
 
 
 
33
 
34
+ ---
35
 
36
  ## Technical Stack
37
 
 
55
  | 4 | GPT-4o-mini | DuckDuckGo (Free) |
56
  | 5 | Claude-3-Haiku | DuckDuckGo (Free) |
57
 
58
+ ---
59
+
60
  ## Authentication
61
 
62
+ All endpoints except `/health` and `/` require HMAC-SHA256 request signing.
63
 
64
  ### Required Headers
65
 
 
76
  {timestamp}\n{HTTP_METHOD}\n{path}
77
  ```
78
 
79
  **JavaScript Implementation**:
80
  ```javascript
81
+ async function generateAuthHeaders(method, path, keyId, secret) {
 
 
82
  const timestamp = Math.floor(Date.now() / 1000).toString();
83
  const message = `${timestamp}\n${method.toUpperCase()}\n${path}`;
84
+
85
+ const encoder = new TextEncoder();
86
+ const key = await crypto.subtle.importKey(
87
+ 'raw', encoder.encode(secret),
88
+ { name: 'HMAC', hash: 'SHA-256' }, false, ['sign']
89
+ );
90
+ const sig = await crypto.subtle.sign('HMAC', key, encoder.encode(message));
91
+ const signature = Array.from(new Uint8Array(sig))
92
+ .map(b => b.toString(16).padStart(2, '0')).join('');
93
 
94
  return {
95
  'X-API-Key-Id': keyId,
 
101
 
102
  ### Security Constraints
103
 
104
+ - Timestamp tolerance: **300 seconds** (5 minutes)
105
+ - Failed auth lockout: **5 attempts per IP** (5-minute window)
106
+ - Constant-time signature comparison (timing attack prevention)
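The README hunk keeps only a JavaScript signer, so here is a minimal Python sketch of the same scheme for Python clients. It builds the `{timestamp}\n{HTTP_METHOD}\n{path}` message and the three `X-API-*` headers named in the diff. `verify_request` is a hypothetical server-side check written only to illustrate the 300-second tolerance and constant-time comparison; it is an assumption, not the service's actual code.

```python
import hashlib
import hmac
import time

# Taken from the Security Constraints list: 300 seconds (5 minutes).
TIMESTAMP_TOLERANCE_SECONDS = 300


def sign_request(method: str, path: str, secret: str, timestamp: str) -> str:
    """Return the hex HMAC-SHA256 of '{timestamp}\\n{METHOD}\\n{path}'."""
    message = f"{timestamp}\n{method.upper()}\n{path}"
    return hmac.new(secret.encode(), message.encode(), hashlib.sha256).hexdigest()


def build_auth_headers(method, path, key_id, secret, now=None):
    """Build the three signing headers; `now` is injectable for testing."""
    timestamp = str(int(time.time() if now is None else now))
    return {
        "X-API-Key-Id": key_id,
        "X-API-Timestamp": timestamp,
        "X-API-Signature": sign_request(method, path, secret, timestamp),
    }


def verify_request(method, path, secret, headers, now=None) -> bool:
    """Hypothetical server-side check (assumed, not the real implementation)."""
    current = time.time() if now is None else now
    try:
        sent_at = int(headers["X-API-Timestamp"])
    except (KeyError, ValueError):
        return False
    # Reject requests whose timestamp is outside the tolerance window.
    if abs(current - sent_at) > TIMESTAMP_TOLERANCE_SECONDS:
        return False
    expected = sign_request(method, path, secret, headers["X-API-Timestamp"])
    # compare_digest runs in time independent of where the strings differ.
    return hmac.compare_digest(expected, headers.get("X-API-Signature", ""))
```

Passing `now` explicitly keeps the sketch deterministic in tests; a real client would omit it.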
 
107
 
108
+ ---
109
 
110
  ## Rate Limits
111
 
 
 
112
  | Endpoint | Limit | Window |
113
  |----------|-------|--------|
114
  | `/ask` | 30 requests | 60 seconds |
115
  | `/upload` | 2 requests | 60 seconds |
116
  | All other endpoints | 50 requests | 60 seconds |
 
117
 
118
+ When rate limited, responses include a `Retry-After: 60` header.
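A client can turn that header into a wait time. The sketch below is an illustration only: `retry_delay_seconds` is a hypothetical helper, and falling back to 60 seconds when the header is missing or malformed is an assumption based on the 60-second window in the table.

```python
def retry_delay_seconds(status_code: int, headers: dict, default: float = 60.0) -> float:
    """Return how long to wait before retrying; 0 when not rate limited."""
    if status_code != 429:
        return 0.0
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(lowered.get("retry-after")))
    except (TypeError, ValueError):
        # Missing or non-numeric header: fall back to the assumed default.
        return default
```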
119
+
120
+ ---
 
 
121
 
 
 
122
  ## Core Endpoints
123
 
124
  ### POST /ask
125
 
126
+ Query documents using semantic retrieval + LLM synthesis.
127
+
128
+ > [!IMPORTANT]
129
+ > **Two query modes exist:**
130
+ > 1. **Standard Mode**: Query pre-indexed university materials using `subject` + `unit`
131
+ > 2. **Custom Upload Mode**: Query user-uploaded PDFs using `cluster` (returned from `/upload`)
132
+
133
+ #### Request Schema
134
+
135
+ | Field | Type | Required | Constraints | Description |
136
+ |-------|------|----------|-------------|-------------|
137
+ | `query` | string | **Yes** | 1-1000 chars | Natural language question |
138
+ | `subject` | string | Conditional | 1-100 chars, alphanumeric + `_-` | Subject collection (e.g., `evs`, `physics`) |
139
+ | `unit` | integer | Conditional | 1-4 | Unit number within the subject |
140
+ | `cluster` | string | Conditional | max 100 chars | Temp cluster from `/upload` response |
141
+ | `context_limit` | integer | No | 1-20, default 5 | Number of context chunks |
142
+
143
+ > [!WARNING]
144
+ > **Mutual Exclusivity**: Either provide `cluster` OR provide BOTH `subject` AND `unit`. Never mix them.
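The mutual-exclusivity rule and the schema constraints can be checked before a request is sent. This is a sketch under stated assumptions: `validate_ask_payload` is a hypothetical client-side helper, the error strings are invented for illustration, and requiring a non-empty `cluster` is an assumption (the table only states a 100-character maximum).

```python
import re

SUBJECT_RE = r"[A-Za-z0-9_-]{1,100}"  # 1-100 chars, alphanumeric plus _ and -


def _is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_ask_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    errors = []
    query = payload.get("query")
    if not isinstance(query, str) or not 1 <= len(query) <= 1000:
        errors.append("query must be a string of 1-1000 characters")

    has_cluster = payload.get("cluster") is not None
    has_subject = payload.get("subject") is not None
    has_unit = payload.get("unit") is not None

    if has_cluster and (has_subject or has_unit):
        errors.append("cluster cannot be combined with subject/unit")
    elif has_cluster:
        cluster = payload["cluster"]
        if not isinstance(cluster, str) or not 1 <= len(cluster) <= 100:
            errors.append("cluster must be a string of at most 100 characters")
    elif has_subject and has_unit:
        subject, unit = payload["subject"], payload["unit"]
        if not isinstance(subject, str) or not re.fullmatch(SUBJECT_RE, subject):
            errors.append("subject must be 1-100 alphanumeric, '_' or '-' characters")
        if not _is_int(unit) or not 1 <= unit <= 4:
            errors.append("unit must be an integer from 1 to 4")
    else:
        errors.append("provide either cluster, or both subject and unit")

    limit = payload.get("context_limit", 5)
    if not _is_int(limit) or not 1 <= limit <= 20:
        errors.append("context_limit must be an integer from 1 to 20")
    return errors
```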
145
+
146
+ #### Example 1: Standard Mode (Pre-indexed Materials)
147
 
 
148
  ```http
149
  POST /ask HTTP/1.1
150
  Content-Type: application/json
 
 
 
151
 
152
  {
153
  "query": "What are the different types of ecosystems?",
154
  "subject": "evs",
155
+ "unit": 2,
156
  "context_limit": 5
157
  }
158
  ```
159
 
160
+ #### Example 2: Custom Upload Mode (User PDF)
161
+
162
+ ```http
163
+ POST /ask HTTP/1.1
164
+ Content-Type: application/json
165
+
166
+ {
167
+ "query": "Summarize the main findings",
168
+ "cluster": "temp_a1b2c3d4e5f6g7h8i9j0k1l2"
169
+ }
170
+ ```
171
+
172
+ #### Response (200 OK)
173
 
 
174
  ```json
175
  {
176
+ "answer": "Ecosystems are classified into terrestrial and aquatic...",
177
  "sources": [
178
+ "evs_chapter3.pdf: Slide 12",
179
+ "evs_chapter3.pdf: Slide 15"
180
  ],
181
+ "collection": "askbookie_evs_unit-2",
182
  "request_id": "a1b2c3d4e5f6g7h8"
183
  }
184
  ```
185
+
186
+ | Field | Description |
187
+ |-------|-------------|
188
+ | `answer` | LLM-generated response (Markdown formatted, LaTeX supported) |
189
+ | `sources` | List of source references: `"filename: Slide N"` |
190
+ | `collection` | The Qdrant collection queried |
191
+ | `request_id` | Unique identifier for debugging |
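Since `sources` is now a list of `"filename: Slide N"` strings rather than objects, a client that wants the filename and slide number separately has to split them. A minimal sketch follows; `parse_source` is a hypothetical helper, and returning `None` for a non-numeric slide covers the `'Unknown'` fallback visible in the `src/rag.py` hunk.

```python
def parse_source(reference: str):
    """Split 'filename: Slide N' into (filename, slide_number or None)."""
    # rpartition splits on the last occurrence, so filenames containing
    # ': Slide ' earlier in the string are still handled.
    filename, separator, slide = reference.rpartition(": Slide ")
    if not separator:
        return reference, None
    return filename, int(slide) if slide.isdigit() else None
```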
192
+
193
  ---
194
+
195
  ### POST /upload
196
 
197
+ Upload a PDF document for custom RAG queries. Processing is **asynchronous**.
198
+
199
+ #### Request
200
 
 
201
  ```http
202
  POST /upload HTTP/1.1
203
  Content-Type: multipart/form-data
 
 
 
204
 
205
  file: [binary PDF data]
 
206
  ```
207
 
208
+ | Field | Type | Required | Constraints |
209
+ |-------|------|----------|-------------|
210
+ | `file` | binary | **Yes** | PDF only, max 10MB, must start with `%PDF` magic bytes |
211
+
212
+ #### Response (200 OK)
213
 
 
214
  ```json
215
  {
216
  "job_id": "a1b2c3d4e5f6g7h8i9j0k1l2",
217
  "status": "queued",
218
+ "filename": "my_notes.pdf",
219
+ "size": 2457600,
220
+ "temp_cluster": "temp_a1b2c3d4e5f6g7h8i9j0k1l2"
221
  }
222
  ```
223
 
224
+ > [!IMPORTANT]
225
+ > **Critical fields for frontend:**
226
+ > - **`job_id`**: Use this to poll `/jobs/{job_id}` for processing status
227
+ > - **`temp_cluster`**: **SAVE THIS!** Use it in `/ask` requests to query this PDF
228
+
229
+ #### Processing Pipeline
230
+
231
+ 1. **Validation**: MIME type, magic bytes, size check
232
+ 2. **Chunking**: Split by page boundaries with context overlap
233
+ 3. **Embedding**: Vectorize using `gte-modernbert-base`
234
+ 4. **Storage**: Upsert to Qdrant under `temp_cluster` collection
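The validation step can be mirrored on the client to fail fast before uploading. This is a sketch only: `check_pdf_upload` and its error strings are hypothetical, the MIME value `application/pdf` comes from the removed validation list, and reading "10MB" as 10 × 1024 × 1024 bytes is an assumption.

```python
from typing import Optional

MAX_PDF_BYTES = 10 * 1024 * 1024  # "max 10MB"; binary megabytes assumed
PDF_MAGIC = b"%PDF"


def check_pdf_upload(data: bytes, content_type: str) -> Optional[str]:
    """Return an error message, or None when the file passes all three checks."""
    if content_type != "application/pdf":
        return "unsupported MIME type"
    if not data.startswith(PDF_MAGIC):
        return "missing %PDF magic bytes"
    if len(data) > MAX_PDF_BYTES:
        return "file exceeds 10MB"
    return None
```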
235
+
236
+ #### Job Status Values
237
+
238
+ | Status | Description |
239
+ |--------|-------------|
240
+ | `queued` | Accepted, awaiting processing |
241
+ | `processing` | Currently being chunked/embedded |
242
+ | `done` | Ready for queries |
243
+ | `failed` | Check `error` field for details |
244
+
245
  ---
246
 
247
  ### GET /jobs/{job_id}
248
 
249
+ Poll the status of a PDF processing job.
250
 
 
 
 
 
 
 
 
 
 
251
  ```json
252
  {
253
  "job_id": "a1b2c3d4e5f6g7h8i9j0k1l2",
 
 
 
 
254
  "status": "done",
255
+ "temp_cluster": "temp_a1b2c3d4e5f6g7h8i9j0k1l2",
256
+ "filename": "my_notes.pdf",
257
+ "error": null
258
  }
259
  ```
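Polling this endpoint until the job reaches a terminal state could look like the sketch below. `wait_for_job` is a hypothetical helper: `fetch_status` stands in for whatever signed `GET /jobs/{job_id}` call the client makes, the 2-second interval follows the "every 2-3 seconds" guidance in the integration guide, and the 300-second timeout is an assumption.

```python
import time
from typing import Callable

TERMINAL_STATES = {"done", "failed"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status until status is 'done' or 'failed', then return the job."""
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        if job.get("status") in TERMINAL_STATES:
            return job
        if clock() >= deadline:
            raise TimeoutError("job did not reach a terminal state in time")
        sleep(interval)
```

The caller still has to inspect `status` and `error` on the returned job, since `failed` is also a terminal state.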
260
 
 
 
 
 
 
 
261
  ---
262
 
263
  ### GET /jobs
264
 
265
+ List all jobs for the authenticated API key.
266
 
267
+ ---
268
+
269
+ ## Frontend Integration Guide
270
+
271
+ > [!IMPORTANT]
272
+ > This section provides implementation guidance for frontend developers.
273
+
274
+ ### Chat Session State Model
275
+
276
+ ```typescript
277
+ interface ChatSession {
278
+ // User selection (standard mode)
279
+ subject: string | null; // e.g., "evs", "physics"
280
+ unit: number | null; // 1-4
281
+
282
+ // Custom upload (custom mode)
283
+ tempCluster: string | null; // From /upload response
284
+ uploadJobId: string | null; // For status polling
285
+
286
+ // Mode lock
287
+ isCustomMode: boolean; // Once PDF uploaded, lock to custom mode
288
+ }
289
  ```
290
 
291
+ ### Flow 1: Standard Query (Pre-indexed Materials)
292
+
293
+ ```
294
+ ┌─────────────────────────────────────────────────────────┐
295
+ │ User selects Subject: [EVS ▼] and Unit: [2 ▼] │
296
+ │ ───────────────────────────────────────────────────── │
297
+ │ User types: "What are ecosystem types?" │
298
+ │ │
299
+ │ → POST /ask { query, subject: "evs", unit: 2 } │
300
+ │ ← Response with answer + sources │
301
+ └─────────────────────────────────────────────────────────┘
302
+ ```
303
+
304
+ ### Flow 2: Custom PDF Upload
305
+
306
+ ```
307
+ ┌─────────────────────────────────────────────────────────┐
308
+ │ Step 1: User uploads PDF │
309
+ │ ───────────────────────────────────────────────────── │
310
+ │ → POST /upload (multipart/form-data) │
311
+ │ ← { job_id, temp_cluster, status: "queued" } │
312
+ │ │
313
+ │ ⚠️ SAVE: temp_cluster = "temp_abc123..." │
314
+ │ ⚠️ LOCK: subject/unit dropdowns (disable them) │
315
+ └─────────────────────────────────────────────────────────┘
316
+
317
+ ┌─────────────────────────────────────────────────────────┐
318
+ │ Step 2: Poll for completion │
319
+ │ ───────────────────────────────────────────────────── │
320
+ │ Loop every 2-3 seconds: │
321
+ │ → GET /jobs/{job_id} │
322
+ │ ← { status: "processing" | "done" | "failed" } │
323
+ │ │
324
+ │ When status === "done": Enable chat input │
325
+ │ When status === "failed": Show error, unlock dropdowns │
326
+ └─────────────────────────────────────────────────────────┘
327
+
328
+ ┌─────────────────────────────────────────────────────────┐
329
+ │ Step 3: Query the uploaded PDF │
330
+ │ ───────────────────────────────────────────────────── │
331
+ │ User types: "Summarize the main points" │
332
+ │ │
333
+ │ → POST /ask { query, cluster: "temp_abc123..." } │
334
+ │ ← Response with answer + sources from their PDF │
335
+ │ │
336
+ │ ⚠️ Keep using the same temp_cluster for all queries │
337
+ │ in this chat session │
338
+ └─────────────────────────────────────────────────────────┘
339
+ ```
340
+
341
+ ### UI State Logic
342
+
343
+ ```typescript
344
+ // When user uploads a PDF
345
+ async function handlePdfUpload(file: File) {
346
+ const formData = new FormData();
347
+ formData.append('file', file);
348
+
349
+ const response = await fetch('/upload', {
350
+ method: 'POST',
351
+ headers: await generateAuthHeaders('POST', '/upload', keyId, secret), // async; keyId and secret are assumed to be in scope
352
+ body: formData
353
+ });
354
+ const data = await response.json();
355
+
356
+ // CRITICAL: Store these values in session state
357
+ session.tempCluster = data.temp_cluster; // ← SAVE THIS
358
+ session.uploadJobId = data.job_id;
359
+ session.isCustomMode = true; // ← LOCK MODE
360
+
361
+ // Disable subject/unit dropdowns in UI
362
+ disableSubjectUnitSelectors();
363
+
364
+ // Start polling
365
+ pollJobStatus(data.job_id);
366
+ }
367
+
368
+ // When sending a query
369
+ async function sendQuery(query: string) {
370
+ let payload;
371
+
372
+ if (session.isCustomMode && session.tempCluster) {
373
+ // Custom mode: use cluster
374
+ payload = {
375
+ query: query,
376
+ cluster: session.tempCluster // ← USE STORED VALUE
377
+ };
378
+ } else {
379
+ // Standard mode: use subject + unit
380
+ payload = {
381
+ query: query,
382
+ subject: session.subject,
383
+ unit: session.unit
384
+ };
385
+ }
386
+
387
+ const response = await fetch('/ask', {
388
+ method: 'POST',
389
+ headers: {
390
+ 'Content-Type': 'application/json',
391
+ ...(await generateAuthHeaders('POST', '/ask', keyId, secret)) // async; keyId and secret are assumed to be in scope
392
  },
393
+ body: JSON.stringify(payload)
394
+ });
395
+
396
+ return await response.json();
 
 
 
 
 
 
 
397
  }
398
  ```
399
 
400
+ ### Subject/Unit Locking Rules
401
+
402
+ | Scenario | Subject Dropdown | Unit Dropdown | Upload Button |
403
+ |----------|-----------------|---------------|---------------|
404
+ | Fresh chat session | ✅ Enabled | ✅ Enabled | ✅ Enabled |
405
+ | After selecting subject/unit | ✅ Enabled (can change) | ✅ Enabled | ✅ Enabled |
406
+ | After uploading PDF | ❌ **Disabled** | ❌ **Disabled** | ❌ Disabled |
407
+ | After upload fails | ✅ Re-enabled | ✅ Re-enabled | ✅ Re-enabled |
408
+ | New chat started | ✅ Enabled (reset) | ✅ Enabled | ✅ Enabled |
409
+
410
+ > [!CAUTION]
411
+ > Once a user uploads a PDF in a chat session, **ALL subsequent queries in that session MUST use the `cluster` parameter**, not `subject`/`unit`. The `temp_cluster` is tied to their uploaded document.
412
+
413
+ ### Available Subjects & Units
414
+
415
+ | Subject | Units Available | Collection Pattern |
416
+ |---------|-----------------|-------------------|
417
+ | `evs` | 1, 2, 3, 4 | `askbookie_evs_unit-{N}` |
418
+ | `physics` | 1, 2, 3, 4 | `askbookie_physics_unit-{N}` |
419
+ | *other subjects* | 1-4 | `askbookie_{subject}_unit-{N}` |
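The collection pattern in the table can be expressed as a small function, which is handy for checking the `collection` field returned by `/ask`. This is a sketch: `collection_name` is a hypothetical helper, and its validation rules are copied from the `/ask` request schema rather than from the server code.

```python
import re


def collection_name(subject: str, unit: int) -> str:
    """Build 'askbookie_{subject}_unit-{N}' for a standard-mode query."""
    if not re.fullmatch(r"[A-Za-z0-9_-]{1,100}", subject):
        raise ValueError("subject must be 1-100 alphanumeric, '_' or '-' characters")
    # Exclude bool explicitly, since True == 1 would otherwise pass.
    if isinstance(unit, bool) or not isinstance(unit, int) or not 1 <= unit <= 4:
        raise ValueError("unit must be an integer from 1 to 4")
    return f"askbookie_{subject}_unit-{unit}"
```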
420
+
421
+ ### Answer Formatting
422
+
423
+ Answers are returned in **Markdown** with **LaTeX** support:
424
+ - Inline math: `$E = mc^2$`
425
+ - Block math: `$$\int_0^1 x^2 dx$$`
426
+
427
+ Use a Markdown renderer with KaTeX/MathJax integration.
428
+
429
+ ---
430
+
431
  ## System Endpoints
432
 
433
  ### GET /health
434
 
435
+ Service health check. **No authentication required.**
436
 
 
437
  ```json
438
  {
439
  "status": "healthy",
440
  "uptime_hours": 48.5,
 
 
 
 
441
  "current_model": {
442
+ "model_id": 1,
443
  "name": "Gemini-3-flash",
444
  "description": "Gemini Primary API Key"
 
 
 
 
 
 
 
 
 
 
 
 
445
  }
446
  }
447
  ```
 
 
448
 
449
+ ### GET /
450
 
451
+ Returns dashboard HTML or service metadata.
 
 
 
 
 
 
452
 
453
+ ---
454
 
455
  ## Admin Endpoints
456
 
457
+ > [!NOTE]
458
+ > All admin endpoints require the `admin` API key.
 
459
 
460
+ ### GET /history
 
 
 
 
 
 
461
 
462
+ Paginated query history across all users.
 
 
 
463
 
464
  ### GET /admin/keys
465
 
466
+ List all API keys with status.
467
 
468
  ### POST /admin/keys/{key_id}/enable
469
 
470
+ Re-enable a disabled key.
471
 
 
 
 
 
 
 
 
 
472
  ### POST /admin/keys/{key_id}/disable
473
 
474
+ Disable an API key (cannot disable `admin`).
475
 
 
 
 
 
 
 
 
 
476
  ### GET /admin/models/current
477
 
478
+ Get current active model.
479
+
480
+ ### POST /admin/models/switch
481
 
 
482
  ```json
483
+ { "model_id": 2 }
 
 
 
 
484
  ```
 
 
485
 
486
+ Switch to a different LLM backend (1-5).
487
 
488
+ ---
 
 
 
 
 
 
489
 
490
+ ## Error Handling
491
+
492
+ All errors return:
493
+ ```json
494
+ { "detail": "Error description" }
495
  ```
496
 
497
+ | Code | Meaning |
498
+ |------|---------|
499
+ | 400 | Bad Request - Missing/invalid parameters |
500
+ | 401 | Unauthorized - Invalid signature or expired key |
501
+ | 403 | Forbidden - Admin endpoint accessed with non-admin key |
502
+ | 404 | Not Found - Job doesn't exist or wrong owner |
503
+ | 413 | Payload Too Large - PDF > 10MB or JSON > 16KB |
504
+ | 429 | Rate Limited - See `Retry-After` header |
505
+ | 500 | Internal Error - RAG pipeline failure |
506
+
507
+ ### Special 429 Cases
508
+
509
+ | Detail Message | Cause | Frontend Action |
510
+ |----------------|-------|-----------------|
511
+ | `"Rate limit exceeded"` | Too many requests | Wait 60s, show countdown |
512
+ | `"Too many concurrent uploads"` | 3+ uploads in progress | Wait for pending jobs |
513
+ | `"LLM quota exhausted"` | Model API limit hit | Retry in 1hr or notify user |
514
+ | `"Too many failed attempts"` | Auth lockout | Wait 5 minutes |
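The table above can be encoded as a lookup so that a client reacts consistently to each 429 variant. This sketch assumes the server sends exactly these `detail` strings, which the diff does not confirm; `wait_hint` is a hypothetical helper. The wait values are the ones in the table (60 s, 5 minutes, 1 hour).

```python
from typing import Optional

# Suggested waits in seconds, taken from the "Frontend Action" column.
WAIT_SECONDS_BY_DETAIL = {
    "Rate limit exceeded": 60,
    "Too many failed attempts": 5 * 60,
    "LLM quota exhausted": 60 * 60,
}


def wait_hint(detail: str) -> Optional[int]:
    """Return a suggested wait in seconds, or None when no fixed wait applies.

    "Too many concurrent uploads" has no fixed wait: the client should wait
    for its pending jobs to finish instead, so it maps to None here.
    """
    return WAIT_SECONDS_BY_DETAIL.get(detail)
```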
515
+
516
+ ---
517
 
518
+ ## Quick Reference: /ask Request Bodies
519
+
520
+ **Standard Mode:**
521
  ```json
522
  {
523
+ "query": "Your question here",
524
+ "subject": "evs",
525
+ "unit": 2
 
 
 
 
526
  }
527
  ```
 
 
 
 
528
 
529
+ **Custom Upload Mode:**
530
  ```json
531
  {
532
+ "query": "Your question here",
533
+ "cluster": "temp_a1b2c3d4e5f6g7h8i9j0k1l2"
534
  }
535
  ```
 
 
 
 
 
 
 
 
 
 
 
 
536
 
537
+ **❌ Invalid (mixing modes):**
538
+ ```json
539
+ {
540
+ "query": "question",
541
+ "subject": "evs",
542
+ "cluster": "temp_..."
 
 
 
 
543
  }
544
  ```
pyproject.toml CHANGED
@@ -5,6 +5,7 @@ description = "API for AskBookie - a question answering system powered by LangCh
5
  readme = "README.md"
6
  requires-python = ">=3.10,<3.14"
7
  dependencies = [
 
8
  "fastapi>=0.128.0",
9
  "g4f>=6.8.3",
10
  "langchain-community>=0.4.1",
 
5
  readme = "README.md"
6
  requires-python = ">=3.10,<3.14"
7
  dependencies = [
8
+ "certifi>=2025.11.12",
9
  "fastapi>=0.128.0",
10
  "g4f>=6.8.3",
11
  "langchain-community>=0.4.1",
src/database.py CHANGED
@@ -1,6 +1,8 @@
1
  import os
2
  import time
3
  import logging
 
 
4
  from datetime import datetime, timezone
5
  from typing import Optional, List
6
 
@@ -23,9 +25,10 @@ def get_database():
23
  try:
24
  _client = MongoClient(
25
  MONGODB_URI,
26
- serverSelectionTimeoutMS=5000,
27
  tls=True,
28
- tlsAllowInvalidCertificates=True
 
29
  )
30
  _client.admin.command('ping')
31
  _db = _client.askbookie
 
1
  import os
2
  import time
3
  import logging
4
+ import ssl
5
+ import certifi
6
  from datetime import datetime, timezone
7
  from typing import Optional, List
8
 
 
25
  try:
26
  _client = MongoClient(
27
  MONGODB_URI,
28
+ serverSelectionTimeoutMS=10000,
29
  tls=True,
30
+ tlsAllowInvalidCertificates=True,
31
+ tlsCAFile=certifi.where()
32
  )
33
  _client.admin.command('ping')
34
  _db = _client.askbookie
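Review note: the new call adds `tlsCAFile=certifi.where()` but keeps `tlsAllowInvalidCertificates=True`. As far as PyMongo's TLS options are documented, the latter lets the handshake continue even when verification fails, so the certifi bundle likely has little effect while that flag stays set; the added `import ssl` also appears unused in this hunk. For comparison, the standard-library sketch below shows what a strictly verifying context looks like. `strict_tls_context` is a hypothetical helper and is not part of this commit.

```python
import ssl
from typing import Optional


def strict_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client context that verifies certificates and hostnames.

    ca_file may point at a CA bundle (for example the path returned by
    certifi.where()); None falls back to the system trust store.
    """
    context = ssl.create_default_context(cafile=ca_file)
    # create_default_context already enables both settings; they are set
    # again here only to make the intent explicit.
    context.check_hostname = True
    context.verify_mode = ssl.CERT_REQUIRED
    return context
```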
src/rag.py CHANGED
@@ -225,6 +225,37 @@ class RAGService:
225
 
226
  return {"answer": answer, "sources": sources}
227
 
228
 
229
  def process_pdf(file_path: str, original_filename: str, subject: str, status_callback=None, temp_cluster_id: str = None) -> str:
230
  qdrant_url = os.getenv("QDRANT_CLUSTER_URL")
 
225
 
226
  return {"answer": answer, "sources": sources}
227
 
+     def ask_collection(self, query_text: str, collection_name: str):
+         """Query a specific collection by name (for custom uploads/temp clusters)."""
+         max_retries = 3
+         last_error = None
+
+         for attempt in range(max_retries):
+             try:
+                 client = QdrantClient(url=self.qdrant_url, api_key=self.qdrant_key, timeout=120)
+                 vectorstore = QdrantVectorStore(client=client, collection_name=collection_name, embedding=self.embeddings)
+                 results = vectorstore.similarity_search_with_score(query_text, k=5)
+                 break
+             except Exception as e:
+                 last_error = e
+                 if attempt < max_retries - 1:
+                     time.sleep(2 ** attempt)
+                     continue
+                 raise last_error
+
+         top_results = results[:5]
+         context_text = "\n\n---\n\n".join([doc.page_content for doc, _ in top_results])
+
+         full_prompt = PROMPT_TEMPLATE.format(context=context_text, question=query_text)
+         answer = call_llm(full_prompt)
+
+         sources = [
+             f"{doc.metadata.get('source', 'Unknown')}: Slide {doc.metadata.get('slide_number', 'Unknown')}"
+             for doc, _ in top_results
+         ]
+
+         return {"answer": answer, "sources": sources}
+
259
 
260
  def process_pdf(file_path: str, original_filename: str, subject: str, status_callback=None, temp_cluster_id: str = None) -> str:
261
  qdrant_url = os.getenv("QDRANT_CLUSTER_URL")
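The retry loop added in `ask_collection` follows a common exponential-backoff pattern: up to three attempts, sleeping `2 ** attempt` seconds between failures. A standalone sketch of that pattern is shown below; `call_with_retries` is a hypothetical helper and not part of the commit, written so the backoff can be exercised without Qdrant.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on any exception with 1s, 2s, 4s... pauses."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    last_error = None
    for attempt in range(max_retries):
        try:
            return operation()
        except Exception as exc:
            last_error = exc
            # Sleep only when another attempt will follow.
            if attempt < max_retries - 1:
                sleep(2 ** attempt)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

Injecting `sleep` keeps the delays out of unit tests while leaving the schedule itself unchanged.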
uv.lock CHANGED
@@ -153,6 +153,7 @@ name = "askbookie-api"
153
  version = "0.1.0"
154
  source = { virtual = "." }
155
  dependencies = [
 
156
  { name = "fastapi" },
157
  { name = "g4f" },
158
  { name = "langchain-community" },
@@ -173,6 +174,7 @@ dependencies = [
173
 
174
  [package.metadata]
175
  requires-dist = [
 
176
  { name = "fastapi", specifier = ">=0.128.0" },
177
  { name = "g4f", specifier = ">=6.8.3" },
178
  { name = "langchain-community", specifier = ">=0.4.1" },
 
153
  version = "0.1.0"
154
  source = { virtual = "." }
155
  dependencies = [
156
+ { name = "certifi" },
157
  { name = "fastapi" },
158
  { name = "g4f" },
159
  { name = "langchain-community" },
 
174
 
175
  [package.metadata]
176
  requires-dist = [
177
+ { name = "certifi", specifier = ">=2025.11.12" },
178
  { name = "fastapi", specifier = ">=0.128.0" },
179
  { name = "g4f", specifier = ">=6.8.3" },
180
  { name = "langchain-community", specifier = ">=0.4.1" },