
Backend API Testing Guide

End-to-end validation of the Agentic-RAG backend using HTTPie (http command).

For ad-hoc Azure OpenAI and TruLens troubleshooting scripts run inside the Docker backend, see test-scripts-troubleshooting.md.


Prerequisites

Install HTTPie

# macOS
brew install httpie

# Linux / WSL
pip install httpie

# Windows (PowerShell)
winget install httpie.httpie

Verify: http --version should print 3.x.x.

Base URL

All examples use the backend running at http://localhost:8000. Set a shell variable for convenience:

BASE=http://localhost:8000/api/v1

Test accounts (pre-seeded)

| Email | Password | Role |
| --- | --- | --- |
| researcher@example.com | researcher123 | researcher |
| test_researcher@example.com | wrong123 | researcher |

Researcher role has access to 55 documents / 5 000 chunks in the vector store.


1. Health Check

http GET $BASE/health

Expected

HTTP/1.1 200 OK
Content-Type: application/json

{
    "status": "ok"
}

2. Authentication

2.1 Register: success (201)

http POST $BASE/auth/register \
    email="newuser@example.com" \
    password="secret123" \
    display_name="New User" \
    role="researcher"

Expected

HTTP/1.1 201 Created
Content-Type: application/json

{
    "id": "<uuid>",
    "email": "newuser@example.com",
    "display_name": "New User",
    "is_active": true,
    "created_at": "2026-03-20T15:00:00Z"
}

2.2 Register: duplicate email (400)

http POST $BASE/auth/register \
    email="researcher@example.com" \
    password="anything"

Expected

HTTP/1.1 400 Bad Request

{
    "detail": "Email already registered"
}

2.3 Register: invalid email format (422)

http POST $BASE/auth/register \
    email="not-an-email" \
    password="secret123"

Expected

HTTP/1.1 422 Unprocessable Entity

{
    "detail": [
        {
            "type": "value_error",
            "loc": ["body", "email"],
            "msg": "value is not a valid email address"
        }
    ]
}

2.4 Login: correct credentials (200)

Login uses application/x-www-form-urlencoded (OAuth2 password flow), so pass -f.

http -f POST $BASE/auth/login \
    username="researcher@example.com" \
    password="researcher123"

Expected

HTTP/1.1 200 OK

{
    "access_token": "eyJhbGci...",
    "token_type": "bearer"
}

Capture token for subsequent requests

TOKEN=$(http -f POST $BASE/auth/login \
    username="researcher@example.com" \
    password="researcher123" \
    | python -c "import sys,json; print(json.load(sys.stdin)['access_token'])")
echo $TOKEN
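
The inline python -c one-liner above raises a bare KeyError when the login fails, because a failed login returns a detail field instead of a token. As a minimal sketch, a slightly more defensive extraction could look like the following. The function name extract_access_token is hypothetical and not part of the backend; it only assumes the response shapes shown in 2.4 and 2.5.

```python
import json


def extract_access_token(login_body: str) -> str:
    """Return the bearer token from a login response body, or raise ValueError."""
    try:
        payload = json.loads(login_body)
    except json.JSONDecodeError as exc:
        raise ValueError(f"login response is not JSON: {exc}") from exc

    # A failed login carries {"detail": ...} instead of a token.
    if not isinstance(payload, dict) or "access_token" not in payload:
        detail = payload.get("detail") if isinstance(payload, dict) else payload
        raise ValueError(f"login failed: {detail}")

    # The guide expects token_type "bearer"; anything else is suspicious.
    token_type = str(payload.get("token_type", ""))
    if token_type.lower() != "bearer":
        raise ValueError(f"unexpected token_type: {token_type!r}")

    return payload["access_token"]
```

Saved to a script, this could replace the one-liner by reading sys.stdin.read() and printing the result, so that a wrong password produces a readable message rather than a traceback.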

2.5 Login: wrong password (401)

http -f POST $BASE/auth/login \
    username="researcher@example.com" \
    password="wrongpassword"

Expected

HTTP/1.1 401 Unauthorized

{
    "detail": "Incorrect email or password"
}

2.6 Login: unknown user (401)

http -f POST $BASE/auth/login \
    username="nobody@example.com" \
    password="anything"

Expected: the same 401 response as in §2.5, so the API does not reveal whether an email is registered (no user enumeration).


2.7 Get Current User: valid token (200)

http GET $BASE/auth/me \
    "Authorization: Bearer $TOKEN"

Expected

HTTP/1.1 200 OK

{
    "id": "<uuid>",
    "email": "researcher@example.com",
    "display_name": "Test Researcher",
    "is_active": true,
    "created_at": "..."
}

2.8 Get Current User: no token (401)

http GET $BASE/auth/me

Expected

HTTP/1.1 401 Unauthorized

{
    "detail": "Not authenticated"
}

2.9 Get Current User: malformed token (401)

http GET $BASE/auth/me \
    "Authorization: Bearer not.a.valid.jwt"

Expected

HTTP/1.1 401 Unauthorized

{
    "detail": "Could not validate credentials"
}

3. Query

All query endpoints require a valid JWT. Run the token capture from §2.4 first.

3.1 Submit Query: success (200)

http POST $BASE/query \
    "Authorization: Bearer $TOKEN" \
    query="What are the key differences between BERT and GPT architectures?"

Expected (~60–120 s: the agent embeds the query, searches pgvector, reranks with CrossEncoder, then calls the LLM)

HTTP/1.1 200 OK

{
    "id": "<uuid>",
    "query": "What are the key differences between BERT and GPT architectures?",
    "answer": "BERT is an encoder-only Transformer ... GPT is decoder-only ...",
    "citations": [
        {
            "index": 1,
            "title": "Attention Is All You Need",
            "source_url": "https://arxiv.org/abs/1706.03762",
            "full_citation": "Vaswani et al. arXiv:1706.03762, 2017."
        }
    ],
    "chart_data": null,
    "model_provider": "openai",
    "agent_steps": 5,
    "created_at": "..."
}

Validation checklist

  • answer is non-empty and references the query topic
  • citations list has ≥ 1 entry with valid source_url starting with https://arxiv.org/
  • agent_steps is 1–5
  • model_provider matches MODEL_PROVIDER in .env
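
The mechanical parts of this checklist can be automated. The sketch below is one possible way to do it, assuming the response shape shown in 3.1; check_query_response is a hypothetical helper, and whether the answer actually references the query topic still needs a human read.

```python
def check_query_response(body: dict, expected_provider: str) -> list:
    """Return a list of checklist violations; an empty list means all checks pass."""
    problems = []

    # answer must be non-empty (topical relevance is not checked here)
    if not str(body.get("answer") or "").strip():
        problems.append("answer is empty")

    # at least one citation must point at arXiv
    citations = body.get("citations") or []
    has_arxiv = any(
        str(c.get("source_url", "")).startswith("https://arxiv.org/")
        for c in citations
    )
    if not has_arxiv:
        problems.append("no citation with an https://arxiv.org/ source_url")

    # agent_steps must be an integer between 1 and 5
    steps = body.get("agent_steps")
    if not isinstance(steps, int) or not 1 <= steps <= 5:
        problems.append(f"agent_steps out of range: {steps!r}")

    # model_provider must match MODEL_PROVIDER from .env (passed in by the caller)
    if body.get("model_provider") != expected_provider:
        problems.append(f"model_provider is {body.get('model_provider')!r}")

    return problems
```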

3.2 Submit Query: numerical data triggers chart_data

Ask a question whose answer contains benchmark numbers so the agent populates chart_data:

http POST $BASE/query \
    "Authorization: Bearer $TOKEN" \
    query="Compare the accuracy scores of LoRA vs full fine-tuning on common NLP benchmarks"

Expected: chart_data is a Plotly spec (not null):

{
    "chart_data": {
        "data": [
            {
                "type": "bar",
                "x": ["MNLI", "SST-2", "MRPC"],
                "y": [91.7, 96.2, 90.1]
            }
        ],
        "layout": {
            "title": "LoRA vs Full Fine-Tuning Benchmark Scores"
        }
    }
}

3.3 Submit Query: empty query string (422)

http POST $BASE/query \
    "Authorization: Bearer $TOKEN" \
    query="   "

Expected

HTTP/1.1 422 Unprocessable Entity

{
    "detail": "Query must not be empty"
}

3.4 Submit Query: missing query field (422)

http POST $BASE/query \
    "Authorization: Bearer $TOKEN" \
    Content-Type:application/json \
    <<< '{}'

Expected

HTTP/1.1 422 Unprocessable Entity

{
    "detail": [
        {
            "type": "missing",
            "loc": ["body", "query"],
            "msg": "Field required"
        }
    ]
}
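
The two 422 responses in 3.3 and 3.4 use different shapes for detail: a plain string in one case and a list of error objects in the other. A test script that prints failures may want to treat both uniformly. The following is a minimal sketch of that normalisation; summarize_detail is a hypothetical name, and the only shapes it assumes are the two shown in this guide.

```python
def summarize_detail(detail) -> list:
    """Flatten a 'detail' value into a list of human-readable messages."""
    # Shape 1: a plain string, e.g. "Query must not be empty"
    if isinstance(detail, str):
        return [detail]

    # Shape 2: a list of error objects with "loc" and "msg" keys
    messages = []
    for err in detail:
        location = ".".join(str(part) for part in err.get("loc", []))
        messages.append(f"{location}: {err.get('msg', '')}")
    return messages
```

For the missing-field response in 3.4 this yields a single entry, "body.query: Field required".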

3.5 Submit Query: no authentication (401)

http POST $BASE/query \
    query="What is attention mechanism?"

Expected

HTTP/1.1 401 Unauthorized

{
    "detail": "Not authenticated"
}

4. Query History

4.1 List history: returns entries in reverse-chronological order (200)

http GET $BASE/query/history \
    "Authorization: Bearer $TOKEN"

Expected

HTTP/1.1 200 OK

[
    {
        "id": "<uuid>",
        "query_text": "What are the key differences between BERT and GPT architectures?",
        "response_text": "BERT is an encoder-only ...",
        "model_provider": "openai",
        "agent_steps": 5,
        "created_at": "..."
    }
]

4.2 Pagination: limit and offset

# First page: 2 items
http GET "$BASE/query/history?limit=2&offset=0" \
    "Authorization: Bearer $TOKEN"

# Second page: next 2
http GET "$BASE/query/history?limit=2&offset=2" \
    "Authorization: Bearer $TOKEN"

Validation checklist

  • limit controls list length (≤ N items returned)
  • offset skips earlier entries
  • Items are ordered newest-first
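
These pagination checks can be expressed as a small script over the two parsed pages. This is a sketch under two assumptions that the guide does not confirm: created_at is an ISO 8601 timestamp (as in the register example in 2.1), and each history item carries a unique id. The helper names are hypothetical.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def check_page(items: list, limit: int) -> list:
    """Return violations of the 'limit' and 'newest-first' rules for one page."""
    problems = []
    if len(items) > limit:
        problems.append(f"got {len(items)} items, limit was {limit}")
    stamps = [parse_timestamp(item["created_at"]) for item in items]
    if stamps != sorted(stamps, reverse=True):
        problems.append("items are not ordered newest-first")
    return problems


def pages_are_disjoint(first: list, second: list) -> bool:
    """offset should skip earlier entries, so no id may appear on both pages."""
    return not ({i["id"] for i in first} & {i["id"] for i in second})
```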

4.3 New user sees an empty history

# Register a fresh user
http POST $BASE/auth/register \
    email="freshuser@example.com" \
    password="pass1234"

# Login
FRESH_TOKEN=$(http -f POST $BASE/auth/login \
    username="freshuser@example.com" \
    password="pass1234" \
    | python -c "import sys,json; print(json.load(sys.stdin)['access_token'])")

# History should be empty
http GET $BASE/query/history \
    "Authorization: Bearer $FRESH_TOKEN"

Expected: [] (empty array, HTTP 200).


5. On-Demand Visualization

The visualization endpoint is driven by include_visualization: true on the POST body. The frontend polls every 2 s for up to 180 s (3 minutes); these tests verify the full flow manually.
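
The polling behaviour described above (every 2 s, giving up after 180 s) can be reproduced in a test script. The following is a sketch rather than the frontend's actual code: fetch is a hypothetical zero-argument callable that performs the GET on the visualization endpoint and returns the parsed JSON body, and sleep and clock are injectable only so the loop can be exercised without waiting.

```python
import time


def poll_visualization(fetch, interval=2.0, timeout=180.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until status is no longer 'pending' or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        # Any status other than "pending" ends the loop
        if result.get("status") != "pending":
            return result
        # Stop if another wait would run past the deadline
        if clock() + interval > deadline:
            raise TimeoutError(f"visualization still pending after {timeout} s")
        sleep(interval)
```

With the defaults, a visualization that never completes raises TimeoutError after roughly three minutes, matching the frontend limit quoted above.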

5.1 Submit query with visualization enabled

http POST $BASE/query \
    "Authorization: Bearer $TOKEN" \
    query="Compare the accuracy scores of LoRA vs full fine-tuning on common NLP benchmarks" \
    include_visualization:=true

Expected: same response shape as §3.1. The text response is not delayed by visualization; chart_data is null at this point:

HTTP/1.1 200 OK

{
    "id": "<uuid>",
    "query": "...",
    "answer": "...",
    "citations": [...],
    "chart_data": null,
    "model_provider": "openai",
    "agent_steps": 5,
    "created_at": "..."
}

Capture the query ID:

QUERY_ID=$(http POST $BASE/query \
    "Authorization: Bearer $TOKEN" \
    query="Compare LoRA vs full fine-tuning accuracy" \
    include_visualization:=true \
    | python -c "import sys,json; print(json.load(sys.stdin)['id'])")
echo $QUERY_ID

5.2 Poll visualization: pending

Immediately after submitting (before the viz agent finishes):

http GET $BASE/query/$QUERY_ID/visualization \
    "Authorization: Bearer $TOKEN"

Expected:

HTTP/1.1 200 OK

{
    "status": "pending",
    "chart_data": null,
    "error": null
}

5.3 Poll visualization: complete

After waiting ~10–60 seconds for the VizCodeAgent to finish (complex charts like sunbursts take longer):

http GET $BASE/query/$QUERY_ID/visualization \
    "Authorization: Bearer $TOKEN"

Expected:

HTTP/1.1 200 OK

{
    "status": "complete",
    "chart_data": {
        "data": [
            {
                "type": "bar",
                "x": ["MNLI", "SST-2", "MRPC"],
                "y": [91.7, 96.2, 90.1],
                "name": "LoRA"
            },
            {
                "type": "bar",
                "x": ["MNLI", "SST-2", "MRPC"],
                "y": [92.1, 96.8, 90.9],
                "name": "Full Fine-Tuning"
            }
        ],
        "layout": {
            "title": "LoRA vs Full Fine-Tuning Accuracy",
            "barmode": "group"
        }
    },
    "error": null
}

Validation checklist:

  • status is "complete"
  • chart_data is a valid Plotly spec with data (array) and layout (object) keys
  • data[*].type is a recognised Plotly trace type (bar, scatter, pie, etc.; line charts are scatter traces)
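
A structural check of chart_data along the lines of this checklist might look like the sketch below. It is deliberately shallow: KNOWN_TRACE_TYPES is a small illustrative subset of Plotly's trace types, not the full list, and check_chart_data is a hypothetical helper rather than anything the backend provides.

```python
# Illustrative subset only; Plotly supports many more trace types.
KNOWN_TRACE_TYPES = {"bar", "scatter", "pie", "histogram", "box", "heatmap", "sunburst"}


def check_chart_data(chart_data) -> list:
    """Return structural problems found in a Plotly-style figure spec."""
    if not isinstance(chart_data, dict):
        return ["chart_data is not an object"]

    problems = []
    data = chart_data.get("data")
    if not isinstance(data, list):
        problems.append("data is missing or not an array")
        data = []
    if not isinstance(chart_data.get("layout"), dict):
        problems.append("layout is missing or not an object")

    # Every trace needs a type from the known set
    for index, trace in enumerate(data):
        trace_type = trace.get("type") if isinstance(trace, dict) else None
        if trace_type not in KNOWN_TRACE_TYPES:
            problems.append(f"data[{index}].type is {trace_type!r}")
    return problems
```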

5.4 Poll visualization: query submitted without viz flag returns 404

# Submit without include_visualization
PLAIN_ID=$(http POST $BASE/query \
    "Authorization: Bearer $TOKEN" \
    query="What is BERT?" \
    | python -c "import sys,json; print(json.load(sys.stdin)['id'])")

http GET $BASE/query/$PLAIN_ID/visualization \
    "Authorization: Bearer $TOKEN"

Expected:

HTTP/1.1 404 Not Found

{
    "detail": "Visualization not found — not requested, not yet started, or expired"
}

5.5 Poll visualization: no authentication (401)

http GET $BASE/query/$QUERY_ID/visualization

Expected:

HTTP/1.1 401 Unauthorized

{
    "detail": "Not authenticated"
}

5.6 Submit query with viz flag: factual query with no data (NO_CHART)

Some queries produce an answer without numerical data. The viz agent should return NO_CHART:

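# Capture the ID of this query so the poll below targets it, not the query from 5.1
NO_CHART_ID=$(http POST $BASE/query \
    "Authorization: Bearer $TOKEN" \
    query="What is the intuition behind the attention mechanism?" \
    include_visualization:=true \
    | python -c "import sys,json; print(json.load(sys.stdin)['id'])")

After the viz agent completes (~5–15 s), poll using the captured ID:

http GET $BASE/query/$NO_CHART_ID/visualization \
    "Authorization: Bearer $TOKEN"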

Expected: status is complete but there is no chart:

{
    "status": "complete",
    "chart_data": null,
    "error": null
}

6. Settings API

The settings endpoints let authenticated users read and update application configuration at runtime. Changes are written to the .env file inside the backend container and take effect immediately via get_settings.cache_clear(); no container restart is required.
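
The cache_clear() call suggests that get_settings is wrapped in functools.lru_cache, although this guide does not show the backend code. The sketch below illustrates the general pattern under that assumption. To stay self-contained it reads an environment variable instead of parsing the .env file, and update_provider is a hypothetical stand-in for the PUT handler.

```python
import os
from functools import lru_cache


@lru_cache(maxsize=1)
def get_settings() -> dict:
    """Build the settings once; later calls return the cached object."""
    return {"model_provider": os.environ.get("MODEL_PROVIDER", "openai")}


def update_provider(value: str) -> None:
    """Persist the new value, then drop the cache so the next read sees it."""
    os.environ["MODEL_PROVIDER"] = value
    get_settings.cache_clear()
```

Without the cache_clear() call, get_settings() would keep returning the value read at first use, which is why a restart would otherwise be needed.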

Sensitive keys (API keys) are partially masked in all GET responses (sk-ab****). Any value sent to PUT that ends with **** is treated as a no-op sentinel: the existing key is preserved.
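
The sentinel rule can be pictured as a merge step that skips masked values. This is an illustration of the behaviour described above, not the backend's implementation; is_masked and merge_settings are hypothetical names.

```python
MASK_SUFFIX = "****"


def is_masked(value) -> bool:
    """A value ending in **** is the masked placeholder echoed back by a client."""
    return isinstance(value, str) and value.endswith(MASK_SUFFIX)


def merge_settings(current: dict, update: dict) -> dict:
    """Apply an update, leaving any field whose new value is masked untouched."""
    merged = dict(current)
    for key, value in update.items():
        if is_masked(value):
            continue  # no-op sentinel: keep the stored key
        merged[key] = value
    return merged
```

This lets a settings form round-trip the masked value from GET straight into PUT without wiping the real key.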

6.1 Get current settings (200)

http GET $BASE/settings \
    "Authorization: Bearer $TOKEN"

Expected

HTTP/1.1 200 OK

{
    "model_provider": "openai",
    "openai_api_key": "sk-p****",
    "openai_model": "gpt-4o",
    "azure_openai_api_key": "",
    "azure_openai_endpoint": "",
    "azure_openai_deployment": "gpt-4.1-mini-2025-04-14",
    "azure_openai_api_version": "2025-04-14",
    "google_api_key": "",
    "gemini_model": "gemini/gemini-flash-lite-latest",
    "trulens_provider": "openai",
    "trulens_strategy": "async",
    "trulens_sample_rate": 1,
    "trulens_feedback_timeout": 180.0,
    "viz_model_provider": "openai",
    "viz_model_name": "gpt-4o-mini",
    "viz_azure_deployment": "",
    "viz_azure_api_version": ""
}

Validation checklist

  • API keys that are set show as "<prefix>****" (never the full value)
  • API keys that are not set show as "" (empty string)
  • model_provider matches MODEL_PROVIDER in .env

6.2 Update provider and model (200)

http PUT $BASE/settings \
    "Authorization: Bearer $TOKEN" \
    model_provider="gemini" \
    google_api_key="AIzaSy..." \
    gemini_model="gemini/gemini-2.0-flash"

Expected: the updated settings, with the key masked:

HTTP/1.1 200 OK

{
    "model_provider": "gemini",
    "google_api_key": "AIza****",
    ...
}

Validation checklist

  • model_provider reflects the new value
  • google_api_key is now masked (not empty)
  • Immediately submit a query to confirm the new provider is active (no restart needed)

6.3 Masked-key sentinel: existing key is preserved

Send back the masked placeholder unchanged; the backend should not overwrite the key:

# 1. Capture the current masked value
MASKED=$(http GET $BASE/settings "Authorization: Bearer $TOKEN" \
    | python -c "import sys,json; print(json.load(sys.stdin)['openai_api_key'])")

# 2. PUT with the masked value: the stored key should be unchanged
http PUT $BASE/settings \
    "Authorization: Bearer $TOKEN" \
    openai_api_key="$MASKED" \
    openai_model="gpt-4o-mini"

Expected: 200 OK, openai_api_key still masked (not cleared), openai_model updated.


6.4 Get settings: no authentication (401)

http GET $BASE/settings

Expected

HTTP/1.1 401 Unauthorized

{
    "detail": "Not authenticated"
}

7. RBAC Verification

The vector search is filtered at the SQL level by the user's roles. A guest user with no role-document mappings should get "No relevant documents found" from the retriever.
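
To make the idea of role filtering at the SQL level concrete, here is a self-contained sketch. Everything in it is an assumption for illustration: the documents and role_documents tables are hypothetical, SQLite stands in for PostgreSQL, and the vector similarity part of the real query is omitted entirely. It only shows that a role with no mappings gets zero rows back from the join.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one document mapped to 'researcher'."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE role_documents (
            role TEXT NOT NULL,
            document_id INTEGER NOT NULL REFERENCES documents(id)
        );
        INSERT INTO documents VALUES (1, 'Attention Is All You Need');
        INSERT INTO role_documents VALUES ('researcher', 1);
        """
    )
    return conn


def accessible_titles(conn: sqlite3.Connection, roles: list) -> list:
    """Return titles visible to any of the given roles, filtered in SQL."""
    if not roles:
        return []
    placeholders = ",".join("?" for _ in roles)
    rows = conn.execute(
        "SELECT DISTINCT d.title FROM documents d "
        "JOIN role_documents rd ON rd.document_id = d.id "
        f"WHERE rd.role IN ({placeholders}) ORDER BY d.title",
        list(roles),
    ).fetchall()
    return [row[0] for row in rows]
```

In this toy schema, accessible_titles(conn, ["guest"]) returns an empty list, which corresponds to the empty retriever context the guest test below expects.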

7.1 Register a guest user (default role)

http POST $BASE/auth/register \
    email="guest@example.com" \
    password="guest123"
    # role omitted, so it defaults to "guest"

7.2 Login as guest

GUEST_TOKEN=$(http -f POST $BASE/auth/login \
    username="guest@example.com" \
    password="guest123" \
    | python -c "import sys,json; print(json.load(sys.stdin)['access_token'])")

7.3 Query as guest: expects no documents

http POST $BASE/query \
    "Authorization: Bearer $GUEST_TOKEN" \
    query="What is BERT?"

Expected: HTTP 200, with an answer stating that the retriever found no relevant documents. The citations list will be empty ([]).


8. Error Summary Table

| # | Method | Endpoint | Scenario | Expected HTTP |
| --- | --- | --- | --- | --- |
| 1 | GET | /health | normal | 200 |
| 2 | POST | /auth/register | new user | 201 |
| 3 | POST | /auth/register | duplicate email | 400 |
| 4 | POST | /auth/register | invalid email | 422 |
| 5 | POST | /auth/login | correct credentials | 200 |
| 6 | POST | /auth/login | wrong password | 401 |
| 7 | GET | /auth/me | valid token | 200 |
| 8 | GET | /auth/me | no token | 401 |
| 9 | GET | /auth/me | malformed token | 401 |
| 10 | POST | /query | valid query | 200 |
| 11 | POST | /query | empty string | 422 |
| 12 | POST | /query | missing field | 422 |
| 13 | POST | /query | no auth | 401 |
| 14 | GET | /query/history | with auth | 200 |
| 15 | GET | /query/history | no auth | 401 |
| 16 | POST | /query | include_visualization: true | 200 (immediate text response) |
| 17 | GET | /query/{id}/visualization | pending (viz in progress) | 200 {status:"pending"} |
| 18 | GET | /query/{id}/visualization | complete | 200 {status:"complete", chart_data:{...}} |
| 19 | GET | /query/{id}/visualization | viz not requested (no flag) | 404 |
| 20 | GET | /query/{id}/visualization | no auth | 401 |
| 21 | GET | /settings | authenticated | 200 |
| 22 | GET | /settings | no auth | 401 |
| 23 | PUT | /settings | valid payload | 200 |
| 24 | PUT | /settings | masked-key sentinel | 200 (key unchanged) |
| 25 | PUT | /settings | no auth | 401 |
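
A table like this lends itself to a data-driven check. The sketch below shows one way to structure it: CASES copies three rows from the table, and send is a hypothetical callable that performs the request for a given row and returns the HTTP status code. How send builds the request (payloads, tokens) is left to the caller.

```python
# (method, endpoint, scenario, expected status): a few rows from the table above
CASES = [
    ("GET", "/health", "normal", 200),
    ("GET", "/auth/me", "no token", 401),
    ("POST", "/query", "empty string", 422),
]


def run_checks(cases, send) -> list:
    """Run every case through send() and collect mismatches as messages."""
    failures = []
    for method, path, scenario, expected in cases:
        actual = send(method, path, scenario)
        if actual != expected:
            failures.append(
                f"{method} {path} ({scenario}): expected {expected}, got {actual}"
            )
    return failures
```

An empty return value means every listed scenario produced the expected status.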

9. Verbose Mode & Inspecting Headers

Add -v to see full request/response headers, useful for debugging CORS or auth issues:

http -v GET $BASE/auth/me \
    "Authorization: Bearer $TOKEN"

Check only response headers (no body):

http -h POST $BASE/query \
    "Authorization: Bearer $TOKEN" \
    query="test"

10. Interactive API Docs

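FastAPI serves two interactive UIs automatically. Assuming the app keeps FastAPI's default routes, these are Swagger UI at http://localhost:8000/docs and ReDoc at http://localhost:8000/redoc.

A saved snapshot of the OpenAPI schema is at docs/openapi.json.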


11. TruLens Evaluation Verification

After running a query, TruLens scores are persisted asynchronously to the evaluation_results table. Check them directly:

docker exec agentic-rag-db-1 psql -U postgres -d agentic_rag -c "
SELECT
    ql.query_text,
    er.relevance_score,
    er.groundedness_score,
    er.answer_relevance_score,
    er.created_at
FROM evaluation_results er
JOIN query_logs ql ON ql.id = er.query_log_id
ORDER BY er.created_at DESC
LIMIT 5;"

Target scores (from SLA):

| Metric | Target |
| --- | --- |
| Context Relevance | > 0.85 |
| Groundedness | > 0.90 |
| Answer Relevance | > 0.85 |

Scores are written in a background thread after the HTTP response. The worker waits up to TRULENS_FEEDBACK_TIMEOUT seconds (default 180) for TruLens to finish all three judge calls via retrieve_feedback_results before persisting to evaluation_results. Slow LLM proxies may need a higher timeout in .env.
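
Rows returned by the query above can be compared against the targets with a few lines of code. This sketch assumes that relevance_score is the column holding Context Relevance, which the guide implies but does not state, and that a score that has not been written yet arrives as None. below_target is a hypothetical helper.

```python
# Column name -> SLA target; the mapping of relevance_score to
# "Context Relevance" is an assumption.
TARGETS = {
    "relevance_score": 0.85,
    "groundedness_score": 0.90,
    "answer_relevance_score": 0.85,
}


def below_target(row: dict) -> dict:
    """Return the columns that are missing or do not exceed their target."""
    misses = {}
    for column, target in TARGETS.items():
        score = row.get(column)
        # Targets are strict (> target), so equality counts as a miss
        if score is None or score <= target:
            misses[column] = score
    return misses
```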


12. Performance Targets

The pipeline SLA from the spec:

| Stage | Target |
| --- | --- |
| Full pipeline (end-to-end) | < 6 s |
| LLM generation | < 3 s |
| CrossEncoder reranking (20 pairs) | < 500 ms |
| pgvector HNSW search (100k vectors) | < 50 ms |

Note: The CrossEncoder model (cross-encoder/ms-marco-MiniLM-L-6-v2, 85 MB) is loaded once at first request and cached as a process singleton. Expect higher latency on the first query after a cold start (30 s on CPU); subsequent queries are expected to meet the SLA.

Measure end-to-end latency with HTTPie's --print=h and the response Date header, or use time:

time http POST $BASE/query \
    "Authorization: Bearer $TOKEN" \
    query="What is retrieval-augmented generation?"
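
The shell time builtin measures the whole HTTPie process, including interpreter start-up. For repeated measurements inside a test script, a small wrapper around time.perf_counter gives tighter numbers. This is a sketch: timed and within_sla are hypothetical helpers, and the 6 s default comes from the end-to-end target in the table above.

```python
import time


def timed(call, *args, **kwargs):
    """Run call(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = call(*args, **kwargs)
    return result, time.perf_counter() - start


def within_sla(elapsed_s: float, target_s: float = 6.0) -> bool:
    """The SLA is strict: the pipeline must finish in under target_s seconds."""
    return elapsed_s < target_s
```

Wrapping the function that submits the query in timed(...) and passing the elapsed value to within_sla gives a pass or fail per request; discard the first, cold-start measurement before judging the SLA.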