docs: sharpen zero-hallucination claim, explain Mistral-7B row
Reframe the headline metric to "on all API provider configurations"
and add a sentence explaining the self-hosted Mistral-7B benchmark
as a deliberate model-size floor for agentic retrieval.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
README.md
CHANGED
@@ -2,7 +2,7 @@
-Agentic knowledge retrieval system with evaluation benchmark. Custom orchestration pipeline + LangChain baseline, evaluated on the same 27-question golden dataset across 3 providers (OpenAI, Anthropic, self-hosted vLLM on Modal). Zero hallucinated citations
+Agentic knowledge retrieval system with evaluation benchmark. Custom orchestration pipeline + LangChain baseline, evaluated on the same 27-question golden dataset across 3 providers (OpenAI, Anthropic, self-hosted vLLM on Modal). Zero hallucinated citations on all API provider configurations. The separate self-hosted Mistral-7B benchmark is included to show the practical model-size floor where agentic retrieval starts to break down.
 `288 tests` · `3 providers` · `LangChain comparison` · `K8s + Terraform` · `CI`