agentic-rag / docs /Backend Testing Guide.md
vksepm
updated docs
992873f
|
Raw
History Blame Contribute Delete
21.7 kB
# Backend API Testing Guide
End-to-end validation of the Agentic-RAG backend using [HTTPie](https://httpie.io/cli) (`http` command).
For ad-hoc **Azure OpenAI** and **TruLens** troubleshooting scripts run inside the Docker backend, see [test-scripts-troubleshooting.md](./test-scripts-troubleshooting.md).
---
## Prerequisites
### Install HTTPie
```bash
# macOS
brew install httpie
# Linux / WSL
pip install httpie
# Windows (PowerShell)
winget install httpie.httpie
```
Verify: `http --version` → should print `3.x.x`.
### Base URL
All examples use the backend running at `http://localhost:8000`.
Set a shell variable for convenience:
```bash
BASE=http://localhost:8000/api/v1
```
### Test accounts (pre-seeded)
| Email | Password | Role |
| ----------------------------- | --------------- | ---------- |
| `researcher@example.com` | `researcher123` | researcher |
| `test_researcher@example.com` | `wrong123` | researcher |
> **Researcher** role has access to 55 documents / 5 000 chunks in the vector store.
---
## 1. Health Check
```bash
http GET $BASE/health
```
**Expected**
```http
HTTP/1.1 200 OK
Content-Type: application/json
{
"status": "ok"
}
```
---
## 2. Authentication
### 2.1 Register — success (201)
```bash
http POST $BASE/auth/register \
email="newuser@example.com" \
password="secret123" \
display_name="New User" \
role="researcher"
```
**Expected**
```http
HTTP/1.1 201 Created
Content-Type: application/json
{
"id": "<uuid>",
"email": "newuser@example.com",
"display_name": "New User",
"is_active": true,
"created_at": "2026-03-20T15:00:00Z"
}
```
### 2.2 Register — duplicate email (400)
```bash
http POST $BASE/auth/register \
email="researcher@example.com" \
password="anything"
```
**Expected**
```http
HTTP/1.1 400 Bad Request
{
"detail": "Email already registered"
}
```
### 2.3 Register — invalid email format (422)
```bash
http POST $BASE/auth/register \
email="not-an-email" \
password="secret123"
```
**Expected**
```http
HTTP/1.1 422 Unprocessable Entity
{
"detail": [
{
"type": "value_error",
"loc": ["body", "email"],
"msg": "value is not a valid email address"
}
]
}
```
---
### 2.4 Login — correct credentials (200)
> Login uses `application/x-www-form-urlencoded` (OAuth2 password flow), so pass `-f`.
```bash
http -f POST $BASE/auth/login \
username="researcher@example.com" \
password="researcher123"
```
**Expected**
```http
HTTP/1.1 200 OK
{
"access_token": "eyJhbGci...",
"token_type": "bearer"
}
```
**Capture token for subsequent requests**
```bash
TOKEN=$(http -f POST $BASE/auth/login \
username="researcher@example.com" \
password="researcher123" \
| python -c "import sys,json; print(json.load(sys.stdin)['access_token'])")
echo $TOKEN
```
### 2.5 Login — wrong password (401)
```bash
http -f POST $BASE/auth/login \
username="researcher@example.com" \
password="wrongpassword"
```
**Expected**
```http
HTTP/1.1 401 Unauthorized
{
"detail": "Incorrect email or password"
}
```
### 2.6 Login — unknown user (401)
```bash
http -f POST $BASE/auth/login \
username="nobody@example.com" \
password="anything"
```
**Expected** — same 401 response (no user enumeration).
---
### 2.7 Get Current User — valid token (200)
```bash
http GET $BASE/auth/me \
"Authorization: Bearer $TOKEN"
```
**Expected**
```http
HTTP/1.1 200 OK
{
"id": "<uuid>",
"email": "researcher@example.com",
"display_name": "Test Researcher",
"is_active": true,
"created_at": "..."
}
```
### 2.8 Get Current User — no token (401)
```bash
http GET $BASE/auth/me
```
**Expected**
```http
HTTP/1.1 401 Unauthorized
{
"detail": "Not authenticated"
}
```
### 2.9 Get Current User — malformed token (401)
```bash
http GET $BASE/auth/me \
"Authorization: Bearer not.a.valid.jwt"
```
**Expected**
```http
HTTP/1.1 401 Unauthorized
{
"detail": "Could not validate credentials"
}
```
---
## 3. Query
All query endpoints require a valid JWT. Run the token capture from §2.4 first.
### 3.1 Submit Query — success (200)
```bash
http POST $BASE/query \
"Authorization: Bearer $TOKEN" \
query="What are the key differences between BERT and GPT architectures?"
```
**Expected** (~60–120 s — the agent embeds, searches pgvector, reranks with CrossEncoder, then calls the LLM)
```http
HTTP/1.1 200 OK
{
"id": "<uuid>",
"query": "What are the key differences between BERT and GPT architectures?",
"answer": "BERT is an encoder-only Transformer ... GPT is decoder-only ...",
"citations": [
{
"index": 1,
"title": "Attention Is All You Need",
"source_url": "https://arxiv.org/abs/1706.03762",
"full_citation": "Vaswani et al. arXiv:1706.03762, 2017."
}
],
"chart_data": null,
"model_provider": "openai",
"agent_steps": 5,
"created_at": "..."
}
```
**Validation checklist**
- [ ] `answer` is non-empty and references the query topic
- [ ] `citations` list has ≥ 1 entry with valid `source_url` starting with `https://arxiv.org/`
- [ ] `agent_steps` is 1–5
- [ ] `model_provider` matches `MODEL_PROVIDER` in `.env`
---
### 3.2 Submit Query — numerical data triggers chart_data
Ask a question whose answer contains benchmark numbers so the agent populates `chart_data`:
```bash
http POST $BASE/query \
"Authorization: Bearer $TOKEN" \
query="Compare the accuracy scores of LoRA vs full fine-tuning on common NLP benchmarks"
```
**Expected**`chart_data` is a Plotly spec (not `null`):
```json
{
"chart_data": {
"data": [
{
"type": "bar",
"x": ["MNLI", "SST-2", "MRPC"],
"y": [91.7, 96.2, 90.1]
}
],
"layout": {
"title": "LoRA vs Full Fine-Tuning Benchmark Scores"
}
}
}
```
---
### 3.3 Submit Query — empty query string (422)
```bash
http POST $BASE/query \
"Authorization: Bearer $TOKEN" \
query=" "
```
**Expected**
```http
HTTP/1.1 422 Unprocessable Entity
{
"detail": "Query must not be empty"
}
```
### 3.4 Submit Query — missing `query` field (422)
```bash
http POST $BASE/query \
"Authorization: Bearer $TOKEN" \
Content-Type:application/json \
<<< '{}'
```
**Expected**
```http
HTTP/1.1 422 Unprocessable Entity
{
"detail": [
{
"type": "missing",
"loc": ["body", "query"],
"msg": "Field required"
}
]
}
```
### 3.5 Submit Query — no authentication (401)
```bash
http POST $BASE/query \
query="What is attention mechanism?"
```
**Expected**
```http
HTTP/1.1 401 Unauthorized
{
"detail": "Not authenticated"
}
```
---
## 4. Query History
### 4.1 List history — returns entries in reverse-chronological order (200)
```bash
http GET $BASE/query/history \
"Authorization: Bearer $TOKEN"
```
**Expected**
```http
HTTP/1.1 200 OK
[
{
"id": "<uuid>",
"query_text": "What are the key differences between BERT and GPT architectures?",
"response_text": "BERT is an encoder-only ...",
"model_provider": "openai",
"agent_steps": 5,
"created_at": "..."
}
]
```
### 4.2 Pagination — limit and offset
```bash
# First page: 2 items
http GET "$BASE/query/history?limit=2&offset=0" \
"Authorization: Bearer $TOKEN"
# Second page: next 2
http GET "$BASE/query/history?limit=2&offset=2" \
"Authorization: Bearer $TOKEN"
```
**Validation checklist**
- [ ] `limit` controls list length (≤ N items returned)
- [ ] `offset` skips earlier entries
- [ ] Items are ordered newest-first
### 4.3 New user sees empty history
```bash
# Register a fresh user
http POST $BASE/auth/register \
email="freshuser@example.com" \
password="pass1234"
# Login
FRESH_TOKEN=$(http -f POST $BASE/auth/login \
username="freshuser@example.com" \
password="pass1234" \
| python -c "import sys,json; print(json.load(sys.stdin)['access_token'])")
# History should be empty
http GET $BASE/query/history \
"Authorization: Bearer $FRESH_TOKEN"
```
**Expected**`[]` (empty array, HTTP 200).
---
## 5. On-Demand Visualization
The visualization endpoint is driven by `include_visualization: true` on the POST body. The frontend polls every 2 s for up to **180 s** (3 minutes); these tests verify the full flow manually.
### 5.1 Submit query with visualization enabled
```bash
http POST $BASE/query \
"Authorization: Bearer $TOKEN" \
query="Compare the accuracy scores of LoRA vs full fine-tuning on common NLP benchmarks" \
include_visualization:=true
```
**Expected** — same response shape as §3.1; returns immediately (text response is NOT delayed by viz):
```http
HTTP/1.1 200 OK
{
"id": "<uuid>",
"query": "...",
"answer": "...",
"citations": [...],
"chart_data": null,
"model_provider": "openai",
"agent_steps": 5,
"created_at": "..."
}
```
**Capture the query ID:**
```bash
QUERY_ID=$(http POST $BASE/query \
"Authorization: Bearer $TOKEN" \
query="Compare LoRA vs full fine-tuning accuracy" \
include_visualization:=true \
| python -c "import sys,json; print(json.load(sys.stdin)['id'])")
echo $QUERY_ID
```
---
### 5.2 Poll visualization — pending
Immediately after submitting (before the viz agent finishes):
```bash
http GET $BASE/query/$QUERY_ID/visualization \
"Authorization: Bearer $TOKEN"
```
**Expected:**
```http
HTTP/1.1 200 OK
{
"status": "pending",
"chart_data": null,
"error": null
}
```
---
### 5.3 Poll visualization — complete
After waiting ~10–60 seconds for the VizCodeAgent to finish (complex charts like sunbursts take longer):
```bash
http GET $BASE/query/$QUERY_ID/visualization \
"Authorization: Bearer $TOKEN"
```
**Expected:**
```http
HTTP/1.1 200 OK
{
"status": "complete",
"chart_data": {
"data": [
{
"type": "bar",
"x": ["MNLI", "SST-2", "MRPC"],
"y": [91.7, 96.2, 90.1],
"name": "LoRA"
},
{
"type": "bar",
"x": ["MNLI", "SST-2", "MRPC"],
"y": [92.1, 96.8, 90.9],
"name": "Full Fine-Tuning"
}
],
"layout": {
"title": "LoRA vs Full Fine-Tuning Accuracy",
"barmode": "group"
}
},
"error": null
}
```
**Validation checklist:**
- [ ] `status` is `"complete"`
- [ ] `chart_data` is a valid Plotly spec with `data` (array) and `layout` (object) keys
- [ ] `data[*].type` is a recognised Plotly chart type (`bar`, `scatter`, `line`, etc.)
---
### 5.4 Poll visualization — query submitted without viz flag returns 404
```bash
# Submit without include_visualization
PLAIN_ID=$(http POST $BASE/query \
"Authorization: Bearer $TOKEN" \
query="What is BERT?" \
| python -c "import sys,json; print(json.load(sys.stdin)['id'])")
http GET $BASE/query/$PLAIN_ID/visualization \
"Authorization: Bearer $TOKEN"
```
**Expected:**
```http
HTTP/1.1 404 Not Found
{
"detail": "Visualization not found — not requested, not yet started, or expired"
}
```
---
### 5.5 Poll visualization — no authentication (401)
```bash
http GET $BASE/query/$QUERY_ID/visualization
```
**Expected:**
```http
HTTP/1.1 401 Unauthorized
{
"detail": "Not authenticated"
}
```
---
### 5.6 Submit query with viz flag — factual query with no data (NO_CHART)
Some queries produce an answer without numerical data. The viz agent should return `NO_CHART`:
```bash
http POST $BASE/query \
"Authorization: Bearer $TOKEN" \
query="What is the intuition behind the attention mechanism?" \
include_visualization:=true
```
After the viz agent completes (~5–15 s), poll:
```bash
http GET $BASE/query/$QUERY_ID/visualization \
"Authorization: Bearer $TOKEN"
```
**Expected** — status complete but no chart:
```json
{
"status": "complete",
"chart_data": null,
"error": null
}
```
---
## 6. Settings API
The settings endpoints let authenticated users read and update application configuration at runtime. Changes are written to the `.env` file inside the backend container and take effect immediately via `get_settings.cache_clear()` — no container restart required.
> Sensitive keys (API keys) are **partially masked** in all GET responses (`sk-ab****`). Any value sent to PUT that ends with `****` is treated as a no-op sentinel — the existing key is preserved.
### 6.1 Get current settings (200)
```bash
http GET $BASE/settings \
"Authorization: Bearer $TOKEN"
```
**Expected**
```http
HTTP/1.1 200 OK
{
"model_provider": "openai",
"openai_api_key": "sk-p****",
"openai_model": "gpt-4o",
"azure_openai_api_key": "",
"azure_openai_endpoint": "",
"azure_openai_deployment": "gpt-4.1-mini-2025-04-14",
"azure_openai_api_version": "2025-04-14",
"google_api_key": "",
"gemini_model": "gemini/gemini-flash-lite-latest",
"trulens_provider": "openai",
"trulens_strategy": "async",
"trulens_sample_rate": 1,
"trulens_feedback_timeout": 180.0,
"viz_model_provider": "openai",
"viz_model_name": "gpt-4o-mini",
"viz_azure_deployment": "",
"viz_azure_api_version": ""
}
```
**Validation checklist**
- [ ] API keys that are set show as `"<prefix>****"` (never the full value)
- [ ] API keys that are not set show as `""` (empty string)
- [ ] `model_provider` matches `MODEL_PROVIDER` in `.env`
---
### 6.2 Update provider and model (200)
```bash
http PUT $BASE/settings \
"Authorization: Bearer $TOKEN" \
model_provider="gemini" \
google_api_key="AIzaSy..." \
gemini_model="gemini/gemini-2.0-flash"
```
**Expected** — returns the updated settings with the key masked:
```http
HTTP/1.1 200 OK
{
"model_provider": "gemini",
"google_api_key": "AIza****",
...
}
```
**Validation checklist**
- [ ] `model_provider` reflects the new value
- [ ] `google_api_key` is now masked (not empty)
- [ ] Immediately submit a query to confirm the new provider is active (no restart needed)
---
### 6.3 Masked-key sentinel — existing key is preserved
Send back the masked placeholder unchanged; the backend should not overwrite the key:
```bash
# 1. Capture the current masked value
MASKED=$(http GET $BASE/settings "Authorization: Bearer $TOKEN" \
| python -c "import sys,json; print(json.load(sys.stdin)['openai_api_key'])")
# 2. PUT with the masked value — key should be unchanged
http PUT $BASE/settings \
"Authorization: Bearer $TOKEN" \
openai_api_key="$MASKED" \
openai_model="gpt-4o-mini"
```
**Expected** — 200 OK, `openai_api_key` still masked (not cleared), `openai_model` updated.
---
### 6.4 Get settings — no authentication (401)
```bash
http GET $BASE/settings
```
**Expected**
```http
HTTP/1.1 401 Unauthorized
{
"detail": "Not authenticated"
}
```
---
## 7. RBAC Verification
The vector search is filtered at the SQL level by the user's roles. A `guest` user with no role-document mappings should get "No relevant documents found" from the retriever.
### 6.1 Register a guest user (default role)
```bash
http POST $BASE/auth/register \
email="guest@example.com" \
password="guest123"
# role omitted → defaults to "guest"
```
### 6.2 Login as guest
```bash
GUEST_TOKEN=$(http -f POST $BASE/auth/login \
username="guest@example.com" \
password="guest123" \
| python -c "import sys,json; print(json.load(sys.stdin)['access_token'])")
```
### 6.3 Query as guest — expects no documents
```bash
http POST $BASE/query \
"Authorization: Bearer $GUEST_TOKEN" \
query="What is BERT?"
```
**Expected** — agent returns an answer noting no documents were found (HTTP 200, but answer states retriever returned empty context). The `citations` list will be empty (`[]`).
---
## 8. Error Summary Table
| # | Method | Endpoint | Scenario | Expected HTTP |
| --- | ------ | --------------------------------- | -------------------------------- | ------------- |
| 1 | GET | `/health` | normal | 200 |
| 2 | POST | `/auth/register` | new user | 201 |
| 3 | POST | `/auth/register` | duplicate email | 400 |
| 4 | POST | `/auth/register` | invalid email | 422 |
| 5 | POST | `/auth/login` | correct credentials | 200 |
| 6 | POST | `/auth/login` | wrong password | 401 |
| 7 | GET | `/auth/me` | valid token | 200 |
| 8 | GET | `/auth/me` | no token | 401 |
| 9 | GET | `/auth/me` | malformed token | 401 |
| 10 | POST | `/query` | valid query | 200 |
| 11 | POST | `/query` | empty string | 422 |
| 12 | POST | `/query` | missing field | 422 |
| 13 | POST | `/query` | no auth | 401 |
| 14 | GET | `/query/history` | with auth | 200 |
| 15 | GET | `/query/history` | no auth | 401 |
| 16 | POST | `/query` | `include_visualization: true` | 200 (immediate text response) |
| 17 | GET | `/query/{id}/visualization` | pending (viz in progress) | 200 `{status:"pending"}` |
| 18 | GET | `/query/{id}/visualization` | complete | 200 `{status:"complete", chart_data:{...}}` |
| 19 | GET | `/query/{id}/visualization` | viz not requested (no flag) | 404 |
| 20 | GET | `/query/{id}/visualization` | no auth | 401 |
| 21 | GET | `/settings` | authenticated | 200 |
| 22 | GET | `/settings` | no auth | 401 |
| 23 | PUT | `/settings` | valid payload | 200 |
| 24 | PUT | `/settings` | masked-key sentinel | 200 (key unchanged) |
| 25 | PUT | `/settings` | no auth | 401 |
---
## 9. Verbose Mode & Inspecting Headers
Add `-v` to see full request/response headers, useful for debugging CORS or auth issues:
```bash
http -v GET $BASE/auth/me \
"Authorization: Bearer $TOKEN"
```
Check only response headers (no body):
```bash
http -h POST $BASE/query \
"Authorization: Bearer $TOKEN" \
query="test"
```
---
## 10. Interactive API Docs
FastAPI serves two interactive UIs automatically:
| UI | URL |
| ---------------- | ---------------------------------- |
| Swagger UI | http://localhost:8000/docs |
| ReDoc | http://localhost:8000/redoc |
| Raw OpenAPI JSON | http://localhost:8000/openapi.json |
The saved snapshot is at `docs/openapi.json`.
---
## 11. TruLens Evaluation Verification
After running a query, TruLens scores are persisted asynchronously to the `evaluation_results` table. Check them directly:
```bash
docker exec agentic-rag-db-1 psql -U postgres -d agentic_rag -c "
SELECT
ql.query_text,
er.relevance_score,
er.groundedness_score,
er.answer_relevance_score,
er.created_at
FROM evaluation_results er
JOIN query_logs ql ON ql.id = er.query_log_id
ORDER BY er.created_at DESC
LIMIT 5;"
```
**Target scores** (from SLA):
| Metric | Target |
| ----------------- | ------ |
| Context Relevance | > 0.85 |
| Groundedness | > 0.90 |
| Answer Relevance | > 0.85 |
> Scores are written in a **background thread** after the HTTP response. The worker waits up to **`TRULENS_FEEDBACK_TIMEOUT`** seconds (default **180**) for TruLens to finish all three judge calls via `retrieve_feedback_results` before persisting to `evaluation_results`. Slow LLM proxies may need a higher timeout in `.env`.
---
## 12. Performance Targets
The pipeline SLA from the spec:
| Stage | Target |
| ----------------------------------- | -------- |
| Full pipeline (end-to-end) | < 6 s |
| LLM generation | < 3 s |
| CrossEncoder reranking (20 pairs) | < 500 ms |
| pgvector HNSW search (100k vectors) | < 50 ms |
> **Note:** The CrossEncoder model (`cross-encoder/ms-marco-MiniLM-L-6-v2`, ~85 MB) is loaded once at first request and cached as a process singleton. First-query latency on cold start (~30 s on CPU) is expected; subsequent queries meet the SLA.
Measure end-to-end latency with HTTPie's `--print=h` and the response `Date` header, or use `time`:
```bash
time http POST $BASE/query \
"Authorization: Bearer $TOKEN" \
query="What is retrieval-augmented generation?"
```