	# Backend API Testing Guide

	End-to-end validation of the Agentic-RAG backend using [HTTPie](https://httpie.io/cli) (`http` command).

	For ad-hoc Azure OpenAI and TruLens troubleshooting scripts run inside the Docker backend, see [test-scripts-troubleshooting.md](./test-scripts-troubleshooting.md).

	---

	## Prerequisites

	### Install HTTPie

	```bash
	# macOS
	brew install httpie

	# Linux / WSL
	pip install httpie

	# Windows (PowerShell)
	winget install httpie.httpie
	```

	Verify: `http --version` → should print `3.x.x`.

	### Base URL

	All examples use the backend running at `http://localhost:8000`.
	Set a shell variable for convenience:

	```bash
	BASE=http://localhost:8000/api/v1
	```

	### Test accounts (pre-seeded)

	| Email | Password | Role |
	| ----------------------------- | --------------- | ---------- |
	| `researcher@example.com` | `researcher123` | researcher |
	| `test_researcher@example.com` | `wrong123` | researcher |

	> Researcher role has access to 55 documents / 5 000 chunks in the vector store.

	---

	## 1. Health Check

	```bash
	http GET $BASE/health
	```

	Expected

	```http
	HTTP/1.1 200 OK
	Content-Type: application/json

	{
	"status": "ok"
	}
	```

	---

	## 2. Authentication

	### 2.1 Register — success (201)

	```bash
	http POST $BASE/auth/register \
	email="newuser@example.com" \
	password="secret123" \
	display_name="New User" \
	role="researcher"
	```

	Expected

	```http
	HTTP/1.1 201 Created
	Content-Type: application/json

	{
	"id": "<uuid>",
	"email": "newuser@example.com",
	"display_name": "New User",
	"is_active": true,
	"created_at": "2026-03-20T15:00:00Z"
	}
	```

	### 2.2 Register — duplicate email (400)

	```bash
	http POST $BASE/auth/register \
	email="researcher@example.com" \
	password="anything"
	```

	Expected

	```http
	HTTP/1.1 400 Bad Request

	{
	"detail": "Email already registered"
	}
	```

	### 2.3 Register — invalid email format (422)

	```bash
	http POST $BASE/auth/register \
	email="not-an-email" \
	password="secret123"
	```

	Expected

	```http
	HTTP/1.1 422 Unprocessable Entity

	{
	"detail": [
	{
	"type": "value_error",
	"loc": ["body", "email"],
	"msg": "value is not a valid email address"
	}
	]
	}
	```

	---

	### 2.4 Login — correct credentials (200)

	> Login uses `application/x-www-form-urlencoded` (OAuth2 password flow), so pass `-f`.

	```bash
	http -f POST $BASE/auth/login \
	username="researcher@example.com" \
	password="researcher123"
	```

	Expected

	```http
	HTTP/1.1 200 OK

	{
	"access_token": "eyJhbGci...",
	"token_type": "bearer"
	}
	```

	Capture token for subsequent requests

	```bash
	TOKEN=$(http -f POST $BASE/auth/login \
	username="researcher@example.com" \
	password="researcher123" \
	| python -c "import sys,json; print(json.load(sys.stdin)['access_token'])")
	echo $TOKEN
	```

	### 2.5 Login — wrong password (401)

	```bash
	http -f POST $BASE/auth/login \
	username="researcher@example.com" \
	password="wrongpassword"
	```

	Expected

	```http
	HTTP/1.1 401 Unauthorized

	{
	"detail": "Incorrect email or password"
	}
	```

	### 2.6 Login — unknown user (401)

	```bash
	http -f POST $BASE/auth/login \
	username="nobody@example.com" \
	password="anything"
	```

	Expected — same 401 response (no user enumeration).

	---

	### 2.7 Get Current User — valid token (200)

	```bash
	http GET $BASE/auth/me \
	"Authorization: Bearer $TOKEN"
	```

	Expected

	```http
	HTTP/1.1 200 OK

	{
	"id": "<uuid>",
	"email": "researcher@example.com",
	"display_name": "Test Researcher",
	"is_active": true,
	"created_at": "..."
	}
	```

	### 2.8 Get Current User — no token (401)

	```bash
	http GET $BASE/auth/me
	```

	Expected

	```http
	HTTP/1.1 401 Unauthorized

	{
	"detail": "Not authenticated"
	}
	```

	### 2.9 Get Current User — malformed token (401)

	```bash
	http GET $BASE/auth/me \
	"Authorization: Bearer not.a.valid.jwt"
	```

	Expected

	```http
	HTTP/1.1 401 Unauthorized

	{
	"detail": "Could not validate credentials"
	}
	```
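
When a token is rejected, it can help to look at what it actually contains. The sketch below decodes the payload segment of a JWT without verifying the signature, using only the standard library. It assumes the backend issues standard three-segment JWTs (the `eyJhbGci...` prefix in §2.4 suggests this); the `sub` claim in the usage note is a hypothetical example, not something this guide confirms.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature.

    Raises ValueError if the token is not three dot-separated segments
    whose middle segment is base64url-encoded JSON.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected 3 segments, got {len(parts)}")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    try:
        claims = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError as exc:  # covers bad base64, bad UTF-8 and bad JSON
        raise ValueError("payload is not base64url-encoded JSON") from exc
    if not isinstance(claims, dict):
        raise ValueError("payload is not a JSON object")
    return claims
```

For example, `decode_jwt_payload("not.a.valid.jwt")` raises `ValueError` because the string has four segments, which is one reason the request in §2.9 cannot succeed. Decoding is for inspection only; it says nothing about whether the signature is valid.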

	---

	## 3. Query

	All query endpoints require a valid JWT. Run the token capture from §2.4 first.

	### 3.1 Submit Query — success (200)

	```bash
	http POST $BASE/query \
	"Authorization: Bearer $TOKEN" \
	query="What are the key differences between BERT and GPT architectures?"
	```

	Expected (~60–120 s — the agent embeds, searches pgvector, reranks with CrossEncoder, then calls the LLM)

	```http
	HTTP/1.1 200 OK

	{
	"id": "<uuid>",
	"query": "What are the key differences between BERT and GPT architectures?",
	"answer": "BERT is an encoder-only Transformer ... GPT is decoder-only ...",
	"citations": [
	{
	"index": 1,
	"title": "Attention Is All You Need",
	"source_url": "https://arxiv.org/abs/1706.03762",
	"full_citation": "Vaswani et al. arXiv:1706.03762, 2017."
	}
	],
	"chart_data": null,
	"model_provider": "openai",
	"agent_steps": 5,
	"created_at": "..."
	}
	```

	Validation checklist

	- [ ] `answer` is non-empty and references the query topic
	- [ ] `citations` list has ≥ 1 entry with valid `source_url` starting with `https://arxiv.org/`
	- [ ] `agent_steps` is 1–5
	- [ ] `model_provider` matches `MODEL_PROVIDER` in `.env`

	---

	### 3.2 Submit Query — numerical data triggers chart_data

	Ask a question whose answer contains benchmark numbers so the agent populates `chart_data`:

	```bash
	http POST $BASE/query \
	"Authorization: Bearer $TOKEN" \
	query="Compare the accuracy scores of LoRA vs full fine-tuning on common NLP benchmarks"
	```

	Expected — `chart_data` is a Plotly spec (not `null`):

	```json
	{
	"chart_data": {
	"data": [
	{
	"type": "bar",
	"x": ["MNLI", "SST-2", "MRPC"],
	"y": [91.7, 96.2, 90.1]
	}
	],
	"layout": {
	"title": "LoRA vs Full Fine-Tuning Benchmark Scores"
	}
	}
	}
	```

	---

	### 3.3 Submit Query — empty query string (422)

	```bash
	http POST $BASE/query \
	"Authorization: Bearer $TOKEN" \
	query=" "
	```

	Expected

	```http
	HTTP/1.1 422 Unprocessable Entity

	{
	"detail": "Query must not be empty"
	}
	```

	### 3.4 Submit Query — missing `query` field (422)

	```bash
	http POST $BASE/query \
	"Authorization: Bearer $TOKEN" \
	Content-Type:application/json \
	<<< '{}'
	```

	Expected

	```http
	HTTP/1.1 422 Unprocessable Entity

	{
	"detail": [
	{
	"type": "missing",
	"loc": ["body", "query"],
	"msg": "Field required"
	}
	]
	}
	```
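
The two 422 responses above use different shapes for `detail`: a plain string in §3.3 and a list of error objects in §3.4. A small sketch like the one below normalises both into readable lines; `summarize_422` is a hypothetical helper, and it assumes only the `loc` and `msg` keys shown in the examples.

```python
def summarize_422(body: dict) -> list[str]:
    """Flatten a 422 response body into one message per error."""
    detail = body.get("detail")
    if isinstance(detail, str):
        return [detail]
    lines = []
    for err in detail or []:
        # "loc" is a path such as ["body", "query"]; join it for display.
        loc = ".".join(str(part) for part in err.get("loc", []))
        lines.append(f"{loc}: {err.get('msg', '')}")
    return lines
```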

	### 3.5 Submit Query — no authentication (401)

	```bash
	http POST $BASE/query \
	query="What is attention mechanism?"
	```

	Expected

	```http
	HTTP/1.1 401 Unauthorized

	{
	"detail": "Not authenticated"
	}
	```

	---

	## 4. Query History

	### 4.1 List history — returns entries in reverse-chronological order (200)

	```bash
	http GET $BASE/query/history \
	"Authorization: Bearer $TOKEN"
	```

	Expected

	```http
	HTTP/1.1 200 OK

	[
	{
	"id": "<uuid>",
	"query_text": "What are the key differences between BERT and GPT architectures?",
	"response_text": "BERT is an encoder-only ...",
	"model_provider": "openai",
	"agent_steps": 5,
	"created_at": "..."
	}
	]
	```

	### 4.2 Pagination — limit and offset

	```bash
	# First page: 2 items
	http GET "$BASE/query/history?limit=2&offset=0" \
	"Authorization: Bearer $TOKEN"

	# Second page: next 2
	http GET "$BASE/query/history?limit=2&offset=2" \
	"Authorization: Bearer $TOKEN"
	```

	Validation checklist

	- [ ] `limit` controls list length (≤ N items returned)
	- [ ] `offset` skips earlier entries
	- [ ] Items are ordered newest-first
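
These three checks can be run against two consecutive pages once both responses are parsed. The sketch below is one way to do it, assuming `created_at` is an ISO 8601 timestamp like the one shown in §2.1; `check_pages` is a hypothetical helper name.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    # Rewrite a trailing "Z" so older Python versions can parse it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def check_pages(first: list[dict], second: list[dict], limit: int) -> list[str]:
    """Check page size, non-overlap and newest-first order across two pages."""
    failures = []
    for name, page in (("first", first), ("second", second)):
        if len(page) > limit:
            failures.append(f"{name} page has {len(page)} items, limit is {limit}")
    seen = {item["id"] for item in first}
    if any(item["id"] in seen for item in second):
        failures.append("offset did not skip the first page")
    stamps = [parse_ts(item["created_at"]) for item in first + second]
    if any(earlier < later for earlier, later in zip(stamps, stamps[1:])):
        failures.append("items are not newest-first")
    return failures
```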

	### 4.3 New user sees empty history

	```bash
	# Register a fresh user
	http POST $BASE/auth/register \
	email="freshuser@example.com" \
	password="pass1234"

	# Login
	FRESH_TOKEN=$(http -f POST $BASE/auth/login \
	username="freshuser@example.com" \
	password="pass1234" \
	| python -c "import sys,json; print(json.load(sys.stdin)['access_token'])")

	# History should be empty
	http GET $BASE/query/history \
	"Authorization: Bearer $FRESH_TOKEN"
	```

	Expected — `[]` (empty array, HTTP 200).

	---

	## 5. On-Demand Visualization

	The visualization endpoint is driven by `include_visualization: true` on the POST body. The frontend polls every 2 s for up to 180 s (3 minutes); these tests verify the full flow manually.

	### 5.1 Submit query with visualization enabled

	```bash
	http POST $BASE/query \
	"Authorization: Bearer $TOKEN" \
	query="Compare the accuracy scores of LoRA vs full fine-tuning on common NLP benchmarks" \
	include_visualization:=true
	```

	Expected: same response shape as §3.1. The text answer is returned as soon as it is ready and is NOT delayed by chart generation, so `chart_data` is still `null` here:

	```http
	HTTP/1.1 200 OK

	{
	"id": "<uuid>",
	"query": "...",
	"answer": "...",
	"citations": [...],
	"chart_data": null,
	"model_provider": "openai",
	"agent_steps": 5,
	"created_at": "..."
	}
	```

	Capture the query ID for the polling steps that follow (this submits a new query):

	```bash
	QUERY_ID=$(http POST $BASE/query \
	"Authorization: Bearer $TOKEN" \
	query="Compare LoRA vs full fine-tuning accuracy" \
	include_visualization:=true \
	| python -c "import sys,json; print(json.load(sys.stdin)['id'])")
	echo $QUERY_ID
	```

	---

	### 5.2 Poll visualization — pending

	Immediately after submitting (before the viz agent finishes):

	```bash
	http GET $BASE/query/$QUERY_ID/visualization \
	"Authorization: Bearer $TOKEN"
	```

	Expected:

	```http
	HTTP/1.1 200 OK

	{
	"status": "pending",
	"chart_data": null,
	"error": null
	}
	```
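
The polling schedule described at the top of this section (every 2 s, for up to 180 s) can be reproduced in a script. This is a minimal sketch, not the frontend's actual code: `fetch` is a hypothetical zero-argument callable that performs the GET and returns the parsed JSON body, and `sleep` and `clock` are injectable only so the loop can be exercised without waiting.

```python
import time
from typing import Callable


def poll_visualization(
    fetch: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 180.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch() until the body is no longer pending or the timeout passes."""
    deadline = clock() + timeout
    while True:
        body = fetch()
        # Anything other than "pending" (complete, an error body) ends the loop.
        if body.get("status") != "pending":
            return body
        if clock() + interval > deadline:
            raise TimeoutError(f"still pending after {timeout} s")
        sleep(interval)
```

The function returns the first non-pending body as-is, so the caller decides how to treat a completed chart, a `null` chart, or an error payload.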

	---

	### 5.3 Poll visualization — complete

	After waiting ~10–60 seconds for the VizCodeAgent to finish (complex charts like sunbursts take longer):

	```bash
	http GET $BASE/query/$QUERY_ID/visualization \
	"Authorization: Bearer $TOKEN"
	```

	Expected:

	```http
	HTTP/1.1 200 OK

	{
	"status": "complete",
	"chart_data": {
	"data": [
	{
	"type": "bar",
	"x": ["MNLI", "SST-2", "MRPC"],
	"y": [91.7, 96.2, 90.1],
	"name": "LoRA"
	},
	{
	"type": "bar",
	"x": ["MNLI", "SST-2", "MRPC"],
	"y": [92.1, 96.8, 90.9],
	"name": "Full Fine-Tuning"
	}
	],
	"layout": {
	"title": "LoRA vs Full Fine-Tuning Accuracy",
	"barmode": "group"
	}
	},
	"error": null
	}
	```

	Validation checklist:

	- [ ] `status` is `"complete"`
	- [ ] `chart_data` is a valid Plotly spec with `data` (array) and `layout` (object) keys
	- [ ] `data[*].type` is a recognised Plotly trace type (`bar`, `scatter`, `pie`, etc.; line charts are `scatter` traces with `mode: "lines"`)
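
The structural checks can be automated with a few lines of Python. The sketch below is illustrative: `KNOWN_TRACE_TYPES` is a deliberately partial list of Plotly trace type names chosen for this example, so extend it to cover the chart types your queries produce.

```python
# Partial list of Plotly trace types; an assumption for this sketch.
KNOWN_TRACE_TYPES = {"bar", "scatter", "pie", "histogram", "heatmap", "box", "sunburst"}


def check_chart_data(chart) -> list[str]:
    """Return structural problems found in a Plotly figure spec."""
    if not isinstance(chart, dict):
        return ["chart_data is not an object"]
    failures = []
    data = chart.get("data")
    if not isinstance(data, list):
        failures.append("data is not an array")
        data = []
    if not isinstance(chart.get("layout"), dict):
        failures.append("layout is not an object")
    for i, trace in enumerate(data):
        trace_type = trace.get("type") if isinstance(trace, dict) else None
        if trace_type not in KNOWN_TRACE_TYPES:
            failures.append(f"data[{i}].type is not recognised: {trace_type!r}")
    return failures
```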

	---

	### 5.4 Poll visualization — query submitted without viz flag returns 404

	```bash
	# Submit without include_visualization
	PLAIN_ID=$(http POST $BASE/query \
	"Authorization: Bearer $TOKEN" \
	query="What is BERT?" \
	| python -c "import sys,json; print(json.load(sys.stdin)['id'])")

	http GET $BASE/query/$PLAIN_ID/visualization \
	"Authorization: Bearer $TOKEN"
	```

	Expected:

	```http
	HTTP/1.1 404 Not Found

	{
	"detail": "Visualization not found — not requested, not yet started, or expired"
	}
	```

	---

	### 5.5 Poll visualization — no authentication (401)

	```bash
	http GET $BASE/query/$QUERY_ID/visualization
	```

	Expected:

	```http
	HTTP/1.1 401 Unauthorized

	{
	"detail": "Not authenticated"
	}
	```

	---

	### 5.6 Submit query with viz flag — factual query with no data (NO_CHART)

	Some queries produce an answer without numerical data. The viz agent should return `NO_CHART`:

	```bash
	NO_CHART_ID=$(http POST $BASE/query \
	"Authorization: Bearer $TOKEN" \
	query="What is the intuition behind the attention mechanism?" \
	include_visualization:=true \
	| python -c "import sys,json; print(json.load(sys.stdin)['id'])")
	```

	After the viz agent completes (~5–15 s), poll using the ID captured above:

	```bash
	http GET $BASE/query/$NO_CHART_ID/visualization \
	"Authorization: Bearer $TOKEN"
	```

	Expected — status complete but no chart:

	```json
	{
	"status": "complete",
	"chart_data": null,
	"error": null
	}
	```

	---

	## 6. Settings API

	The settings endpoints let authenticated users read and update application configuration at runtime. Changes are written to the `.env` file inside the backend container and take effect immediately via `get_settings.cache_clear()` — no container restart required.

	> Sensitive keys (API keys) are partially masked in all GET responses (for example `sk-p****`). Any value sent to PUT that ends with `**` is treated as a no-op sentinel, so the existing key is preserved.
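
The sentinel rule can be illustrated with a short sketch. This is not the backend's implementation, only a model of the documented behaviour; `is_masked` and `apply_update` are hypothetical names.

```python
def is_masked(value: str) -> bool:
    """A value ending in '**' is a masked placeholder, not a real key."""
    return value.endswith("**")


def apply_update(current: dict, update: dict) -> dict:
    """Merge a PUT payload into the stored settings, skipping sentinels."""
    merged = dict(current)
    for key, value in update.items():
        if isinstance(value, str) and is_masked(value):
            continue  # masked placeholder sent back unchanged: keep stored value
        merged[key] = value
    return merged
```

Under this model, sending `openai_api_key="sk-p****"` together with a new `openai_model` changes only the model, which is the outcome §6.3 checks for.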

	### 6.1 Get current settings (200)

	```bash
	http GET $BASE/settings \
	"Authorization: Bearer $TOKEN"
	```

	Expected

	```http
	HTTP/1.1 200 OK

	{
	"model_provider": "openai",
	"openai_api_key": "sk-p****",
	"openai_model": "gpt-4o",
	"azure_openai_api_key": "",
	"azure_openai_endpoint": "",
	"azure_openai_deployment": "gpt-4.1-mini-2025-04-14",
	"azure_openai_api_version": "2025-04-14",
	"google_api_key": "",
	"gemini_model": "gemini/gemini-flash-lite-latest",
	"trulens_provider": "openai",
	"trulens_strategy": "async",
	"trulens_sample_rate": 1,
	"trulens_feedback_timeout": 180.0,
	"viz_model_provider": "openai",
	"viz_model_name": "gpt-4o-mini",
	"viz_azure_deployment": "",
	"viz_azure_api_version": ""
	}
	```

	Validation checklist

	- [ ] API keys that are set show as `"<prefix>****"` (never the full value)
	- [ ] API keys that are not set show as `""` (empty string)
	- [ ] `model_provider` matches `MODEL_PROVIDER` in `.env`

	---

	### 6.2 Update provider and model (200)

	```bash
	http PUT $BASE/settings \
	"Authorization: Bearer $TOKEN" \
	model_provider="gemini" \
	google_api_key="AIzaSy..." \
	gemini_model="gemini/gemini-2.0-flash"
	```

	Expected — returns the updated settings with the key masked:

	```http
	HTTP/1.1 200 OK

	{
	"model_provider": "gemini",
	"google_api_key": "AIza****",
	...
	}
	```

	Validation checklist

	- [ ] `model_provider` reflects the new value
	- [ ] `google_api_key` is now masked (not empty)
	- [ ] Immediately submit a query to confirm the new provider is active (no restart needed)

	---

	### 6.3 Masked-key sentinel — existing key is preserved

	Send back the masked placeholder unchanged; the backend should not overwrite the key:

	```bash
	# 1. Capture the current masked value
	MASKED=$(http GET $BASE/settings "Authorization: Bearer $TOKEN" \
	| python -c "import sys,json; print(json.load(sys.stdin)['openai_api_key'])")

	# 2. PUT with the masked value — key should be unchanged
	http PUT $BASE/settings \
	"Authorization: Bearer $TOKEN" \
	openai_api_key="$MASKED" \
	openai_model="gpt-4o-mini"
	```

	Expected — 200 OK, `openai_api_key` still masked (not cleared), `openai_model` updated.

	---

	### 6.4 Get settings — no authentication (401)

	```bash
	http GET $BASE/settings
	```

	Expected

	```http
	HTTP/1.1 401 Unauthorized

	{
	"detail": "Not authenticated"
	}
	```

	---

	## 7. RBAC Verification

	The vector search is filtered at the SQL level by the user's roles. A `guest` user with no role-document mappings should get "No relevant documents found" from the retriever.

	### 7.1 Register a guest user (default role)

	```bash
	http POST $BASE/auth/register \
	email="guest@example.com" \
	password="guest123"
	# role omitted → defaults to "guest"
	```

	### 7.2 Login as guest

	```bash
	GUEST_TOKEN=$(http -f POST $BASE/auth/login \
	username="guest@example.com" \
	password="guest123" \
	| python -c "import sys,json; print(json.load(sys.stdin)['access_token'])")
	```

	### 7.3 Query as guest (expects no documents)

	```bash
	http POST $BASE/query \
	"Authorization: Bearer $GUEST_TOKEN" \
	query="What is BERT?"
	```

	Expected — agent returns an answer noting no documents were found (HTTP 200, but answer states retriever returned empty context). The `citations` list will be empty (`[]`).

	---

	## 8. Error Summary Table

	| # | Method | Endpoint | Scenario | Expected HTTP |
	| --- | ------ | --------------------------------- | -------------------------------- | ------------- |
	| 1 | GET | `/health` | normal | 200 |
	| 2 | POST | `/auth/register` | new user | 201 |
	| 3 | POST | `/auth/register` | duplicate email | 400 |
	| 4 | POST | `/auth/register` | invalid email | 422 |
	| 5 | POST | `/auth/login` | correct credentials | 200 |
	| 6 | POST | `/auth/login` | wrong password | 401 |
	| 7 | GET | `/auth/me` | valid token | 200 |
	| 8 | GET | `/auth/me` | no token | 401 |
	| 9 | GET | `/auth/me` | malformed token | 401 |
	| 10 | POST | `/query` | valid query | 200 |
	| 11 | POST | `/query` | empty string | 422 |
	| 12 | POST | `/query` | missing field | 422 |
	| 13 | POST | `/query` | no auth | 401 |
	| 14 | GET | `/query/history` | with auth | 200 |
	| 15 | GET | `/query/history` | no auth | 401 |
	| 16 | POST | `/query` | `include_visualization: true` | 200 (immediate text response) |
	| 17 | GET | `/query/{id}/visualization` | pending (viz in progress) | 200 `{status:"pending"}` |
	| 18 | GET | `/query/{id}/visualization` | complete | 200 `{status:"complete", chart_data:{...}}` |
	| 19 | GET | `/query/{id}/visualization` | viz not requested (no flag) | 404 |
	| 20 | GET | `/query/{id}/visualization` | no auth | 401 |
	| 21 | GET | `/settings` | authenticated | 200 |
	| 22 | GET | `/settings` | no auth | 401 |
	| 23 | PUT | `/settings` | valid payload | 200 |
	| 24 | PUT | `/settings` | masked-key sentinel | 200 (key unchanged) |
	| 25 | PUT | `/settings` | no auth | 401 |
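
A table like this lends itself to a table-driven smoke test. The sketch below encodes a few of the rows that need neither a token nor a request body; `send` is a hypothetical callable that takes a method and a path and returns the HTTP status code, so any HTTP client can be plugged in.

```python
from typing import Callable, NamedTuple


class Case(NamedTuple):
    method: str
    path: str
    scenario: str
    expected_status: int


# A subset of the summary table: rows 1, 8, 15 and 22.
CASES = [
    Case("GET", "/health", "normal", 200),
    Case("GET", "/auth/me", "no token", 401),
    Case("GET", "/query/history", "no auth", 401),
    Case("GET", "/settings", "no auth", 401),
]


def run_cases(cases: list[Case], send: Callable[[str, str], int]) -> list[str]:
    """Run each case through send() and describe every status mismatch."""
    failures = []
    for case in cases:
        status = send(case.method, case.path)
        if status != case.expected_status:
            failures.append(
                f"{case.method} {case.path} ({case.scenario}): "
                f"expected {case.expected_status}, got {status}"
            )
    return failures
```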

	---

	## 9. Verbose Mode & Inspecting Headers

	Add `-v` to see full request/response headers, useful for debugging CORS or auth issues:

	```bash
	http -v GET $BASE/auth/me \
	"Authorization: Bearer $TOKEN"
	```

	Check only response headers (no body):

	```bash
	http -h POST $BASE/query \
	"Authorization: Bearer $TOKEN" \
	query="test"
	```

	---

	## 10. Interactive API Docs

	FastAPI serves two interactive UIs automatically:

	| UI | URL |
	| ---------------- | ---------------------------------- |
	| Swagger UI | http://localhost:8000/docs |
	| ReDoc | http://localhost:8000/redoc |
	| Raw OpenAPI JSON | http://localhost:8000/openapi.json |

	The saved snapshot is at `docs/openapi.json`.

	---

	## 11. TruLens Evaluation Verification

	After running a query, TruLens scores are persisted asynchronously to the `evaluation_results` table. Check them directly:

	```bash
	docker exec agentic-rag-db-1 psql -U postgres -d agentic_rag -c "
	SELECT
	ql.query_text,
	er.relevance_score,
	er.groundedness_score,
	er.answer_relevance_score,
	er.created_at
	FROM evaluation_results er
	JOIN query_logs ql ON ql.id = er.query_log_id
	ORDER BY er.created_at DESC
	LIMIT 5;"
	```

	Target scores (from SLA):

	| Metric | Target |
	| ----------------- | ------ |
	| Context Relevance | > 0.85 |
	| Groundedness | > 0.90 |
	| Answer Relevance | > 0.85 |

	> Scores are written in a background thread after the HTTP response. The worker waits up to `TRULENS_FEEDBACK_TIMEOUT` seconds (default 180) for TruLens to finish all three judge calls via `retrieve_feedback_results` before persisting to `evaluation_results`. Slow LLM proxies may need a higher timeout in `.env`.
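
Once a row has been fetched, the targets can be checked programmatically. The sketch below assumes the row is a dict keyed by the column names from the SQL query above, and that `relevance_score` holds the Context Relevance metric; that mapping is an assumption, not something this guide states.

```python
# Targets from the table above, keyed by assumed column name.
TARGETS = {
    "relevance_score": 0.85,
    "groundedness_score": 0.90,
    "answer_relevance_score": 0.85,
}


def below_target(row: dict) -> dict:
    """Return {column: score} for scores that are missing or not above target."""
    return {
        column: row.get(column)
        for column, target in TARGETS.items()
        if row.get(column) is None or row[column] <= target
    }
```

A score of `None` is reported as failing, which also catches rows read before the background worker has written its results.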

	---

	## 12. Performance Targets

	The pipeline SLA from the spec:

	| Stage | Target |
	| ----------------------------------- | -------- |
	| Full pipeline (end-to-end) | < 6 s |
	| LLM generation | < 3 s |
	| CrossEncoder reranking (20 pairs) | < 500 ms |
	| pgvector HNSW search (100k vectors) | < 50 ms |

	> Note: The CrossEncoder model (`cross-encoder/ms-marco-MiniLM-L-6-v2`, ~85 MB) is loaded once at first request and cached as a process singleton. First-query latency on cold start (~30 s on CPU) is expected; subsequent queries meet the SLA.

	Measure end-to-end latency with the shell's `time`. The response `Date` header (visible with HTTPie's `--print=h`) has only one-second resolution, so treat it as a rough cross-check:

	```bash
	time http POST $BASE/query \
	"Authorization: Bearer $TOKEN" \
	query="What is retrieval-augmented generation?"
	```
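
For repeated measurements, the same timing can be done in Python. This is a minimal sketch: `timed` wraps any zero-argument callable (for example, one that submits the query with your HTTP client of choice), and `meets_sla` compares the result with the 6 s end-to-end target from the table above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> tuple[T, float]:
    """Run fn() and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def meets_sla(elapsed_s: float, limit_s: float = 6.0) -> bool:
    """True when the elapsed time is strictly under the limit."""
    return elapsed_s < limit_s
```

Discard the first measurement after a cold start, since the CrossEncoder load time is excluded from the SLA.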