Nomearod Claude Opus 4.7 (1M context) committed on
Commit 0e96cb9 · 1 Parent(s): fcfd067

docs(readme): correct test count 444 → 443

Reconcile README test-count claim with actual `pytest --collect-only`
output (443 tests). Updates the four occurrences in the badge line,
production-engineering bullet, Testing section, and the comparison
table footer row.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
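
The reconciliation described above could be scripted roughly as follows. This is a minimal sketch, not tooling from this repository: the helper names `collected_count` and `stale_claims` are hypothetical, and it assumes the summary line of `pytest --collect-only -q` has the form `443 tests collected in ...` with no deselected tests. It only catches claims phrased as `N tests` or `N deterministic tests`; the comparison table row mixes counts from several projects and would still need a manual check.

```python
import re


def collected_count(collect_output: str) -> int:
    """Extract N from a 'N tests collected' summary line (assumed format)."""
    match = re.search(r"(\d+) tests? collected", collect_output)
    if match is None:
        raise ValueError("no 'N tests collected' summary line found")
    return int(match.group(1))


def stale_claims(readme_text: str, actual: int) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs whose test-count claim differs from actual."""
    pattern = re.compile(r"\b(\d+) (?:deterministic )?tests\b")
    stale = []
    for lineno, line in enumerate(readme_text.splitlines(), start=1):
        for m in pattern.finditer(line):
            if int(m.group(1)) != actual:
                stale.append((lineno, line.strip()))
                break  # report each offending line once
    return stale


if __name__ == "__main__":
    # Hypothetical inputs; in practice these would come from running
    # `pytest --collect-only -q` and reading README.md.
    sample_output = "443 tests collected in 0.52s"
    sample_readme = (
        "`444 tests` · `3 providers`\n"
        "make test # 443 deterministic tests, no API keys needed\n"
    )
    actual = collected_count(sample_output)
    for lineno, line in stale_claims(sample_readme, actual):
        print(f"README.md:{lineno}: expected {actual} -> {line}")
```

Running a check like this in CI would flag the README whenever a test is added or removed without updating the count.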

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -6,7 +6,7 @@
 
  Agentic knowledge retrieval system with evaluation benchmark. Custom orchestration pipeline + LangChain baseline, evaluated on matched golden datasets across 3 providers (OpenAI, Anthropic, self-hosted vLLM on Modal) and two corpora (FastAPI + Kubernetes). Zero hallucinated citations on all API provider configurations. The separate self-hosted Mistral-7B benchmark is included to show the practical model-size floor where agentic retrieval starts to break down.
 
- `444 tests` · `3 providers` · `2 corpora` · `LangChain comparison` · `K8s + Terraform` · `CI`
+ `443 tests` · `3 providers` · `2 corpora` · `LangChain comparison` · `K8s + Terraform` · `CI`
 
  ## Benchmark Results
 
@@ -240,7 +240,7 @@ security:
  - **MLOps:** Provider comparison benchmark (API vs self-hosted, real measured data)
  - **Security — detection & redaction**: Two-tier prompt injection detection (heuristic regex + DeBERTa classifier), PII redaction on retrieved context, output validation gate (PII leakage, URL hallucination, blocklist)
  - **Security — audit & compliance**: Append-only JSONL audit trail, HMAC-SHA256 IP hashing (GDPR-aligned), log rotation, config-driven security with Literal-constrained enums
- - **Production engineering**: FastAPI, Docker, CI/CD, structured logging, rate limiting, SSE streaming, conversation sessions, 444 deterministic tests with mock providers
+ - **Production engineering**: FastAPI, Docker, CI/CD, structured logging, rate limiting, SSE streaming, conversation sessions, 443 deterministic tests with mock providers
 
  <details><summary>API Reference</summary>
 
@@ -302,7 +302,7 @@ The golden dataset contains 27 hand-crafted FastAPI questions (19 retrieval · 3
  ## Testing
 
  ```bash
- make test # 444 deterministic tests, no API keys needed
+ make test # 443 deterministic tests, no API keys needed
  make lint # ruff + mypy
  ```
 
@@ -325,4 +325,4 @@ See [DECISIONS.md](DECISIONS.md) for rationale on building from primitives, RRF
  | **PII redaction** | None | None | Regex + optional NER |
  | **Output validation** | None | None | PII leakage + URL + blocklist |
  | **Audit logging** | None | None | JSONL, HMAC-hashed IPs |
- | Tests | 97 | 205 | 288 |
+ | Tests | 97 | 205 | 443 |