Benchmark Dashboard

LIVE COMPARISON
FinanceBench · SEC 10-K Filings
Avg Token Reduction: 31.9% (GraphRAG vs Basic RAG)
LLM-as-Judge Pass: 70% (GraphRAG accuracy)
BERTScore F1: 0.822 GraphRAG (Basic RAG: 0.821)
Dataset Size: 2.1M tokens · 11,646 chunks
Token Usage Per Query (chart: LLM Only, Basic RAG, GraphRAG)
Latency in seconds (chart: LLM Only, Basic RAG, GraphRAG)
Token Reduction % per Query (chart)
Accuracy Comparison (chart)
Per-Query Benchmark Results
Question | LLM Tokens | RAG Tokens | GraphRAG Tokens | Reduction | Judge
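The dashboard does not state how the Reduction column or the 31.9% average is computed. A minimal sketch, assuming the reduction is measured relative to Basic RAG tokens and that the headline figure is the mean of the per-query values; the function names and sample numbers are hypothetical:

```python
from statistics import mean


def token_reduction_pct(rag_tokens: int, graphrag_tokens: int) -> float:
    """Percent of Basic RAG tokens saved by GraphRAG for one query."""
    if rag_tokens <= 0:
        raise ValueError("rag_tokens must be positive")
    return 100.0 * (rag_tokens - graphrag_tokens) / rag_tokens


def average_reduction_pct(rows: list[tuple[int, int]]) -> float:
    """Mean of the per-query reductions (not the reduction of the totals)."""
    return mean(token_reduction_pct(rag, graph) for rag, graph in rows)


# Made-up (rag_tokens, graphrag_tokens) pairs, for illustration only.
# These are NOT benchmark results.
example_rows = [(2000, 1500), (1000, 500)]
```

Averaging per-query percentages weights every question equally; dividing total saved tokens by total Basic RAG tokens would weight long queries more, so the two conventions can give different headline numbers.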

Live Query — Compare All 3 Pipelines

API: POST /query/all · Backend: http://localhost:8000
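A call to this endpoint might look like the sketch below. Only the method, path, and backend URL come from the dashboard; the JSON body with a `question` field and the JSON response are assumptions, since the request and response schemas are not shown here.

```python
import json
import urllib.request

BACKEND_URL = "http://localhost:8000"  # backend address shown on the dashboard


def build_query_request(question: str, base_url: str = BACKEND_URL) -> urllib.request.Request:
    """Build the POST /query/all request without sending it."""
    # ASSUMPTION: the endpoint accepts a JSON body with a "question" field.
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/query/all",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_all(question: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the response.

    ASSUMPTION: the backend replies with a JSON object; its fields are
    not documented on the dashboard.
    """
    request = build_query_request(question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or adjust once the real schema is known.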
LLM Only: Tokens, Latency
Basic RAG: Tokens, Latency
GraphRAG: Tokens, Reduction
System Architecture — 3 Pipeline Comparison
LLM Only
1. User query input
2. Direct to Groq llama3-70b
3. No retrieval: pure parametric memory
4. Answer + token count
Basic RAG
1. Embed query via sentence-transformers
2. ChromaDB top-5 similarity search
3. Build context prompt (~2000 tokens)
4. Groq → answer + token count
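The retrieval and prompt-building steps of the Basic RAG pipeline can be sketched without external dependencies. This is an illustration only: a toy bag-of-words vector stands in for the sentence-transformers embedding, an in-memory list stands in for ChromaDB, and the prompt template is an assumption. The Groq call is left out.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a sentence-transformers embedding: word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 5) -> list[str]:
    """Stand-in for the vector-store search: rank chunks by cosine score."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Concatenate retrieved chunks into one context prompt (hypothetical template)."""
    context = "\n\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
```

With five chunks in the context, the prompt size grows with `k`, which is consistent with the roughly 2000-token context the dashboard reports for this pipeline.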
GraphRAG
1. Extract entities from query
2. Graph traversal: score chunks by entity match
3. Retrieve top-1 chunk (~400 tokens)
4. Groq → answer + token count
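The entity-matching retrieval can be sketched as a lookup over a bipartite entity-to-chunk graph. The dashboard does not describe how entities are extracted or how the graph is stored, so everything below is an assumption: entities are found by naive substring matching against known entity names, and a chunk's score is the number of query entities linked to it.

```python
from collections import defaultdict
from typing import Optional


def build_entity_index(chunk_entities: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert chunk -> entities into entity -> chunks (a bipartite graph)."""
    index: dict[str, set[str]] = defaultdict(set)
    for chunk_id, entities in chunk_entities.items():
        for entity in entities:
            index[entity.lower()].add(chunk_id)
    return index


def extract_entities(query: str, known_entities: set[str]) -> set[str]:
    """Naive extraction: keep known entities that occur in the query text."""
    text = query.lower()
    return {entity for entity in known_entities if entity in text}


def top_chunk(query: str, index: dict[str, set[str]]) -> Optional[str]:
    """Return the id of the chunk matching the most query entities, if any."""
    scores: dict[str, int] = defaultdict(int)
    for entity in extract_entities(query, set(index)):
        for chunk_id in index[entity]:
            scores[chunk_id] += 1
    if not scores:
        return None
    # Highest score wins; ties break on chunk id so the result is deterministic.
    return max(sorted(scores), key=lambda chunk_id: scores[chunk_id])
```

Returning a single chunk rather than five is what keeps the prompt near the ~400 tokens listed above; the trade-off is that a wrong top-1 match leaves the model with no fallback context.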