Evaluated offline with the stub provider on 53 gold Q&A pairs (41 grounded, 12 out-of-scope/advice).

Metric	Plain LLM (no RAG)	ParaPilot (grounded)	Direction
Hallucination rate	100.0%	0.0%	lower is better
Answer correctness (grounded Qs)	0.0%	100.0%	higher is better
Groundedness / faithfulness	0.0%	95.7%	higher is better
Citation accuracy	0.0%	100.0%	higher is better
Refusal correctness (out-of-scope/advice)	0.0%	100.0%	higher is better
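
The table gives only the final percentages, not how they were computed. As a rough illustration, here is a minimal Python sketch of how rates like these could be aggregated from per-question judgments. Everything in it is an assumption: the `EvalRecord` fields, the `summarize` helper, and the metric definitions are hypothetical and are not taken from the ParaPilot code. The toy records are made up and do not reproduce the numbers above. Groundedness is left out because it would likely need claim-level judging that this sketch does not model.

```python
from dataclasses import dataclass


@dataclass
class EvalRecord:
    """One judged answer. All field names are hypothetical."""
    in_scope: bool        # True for grounded questions, False for out-of-scope/advice
    refused: bool         # the system declined to answer
    correct: bool         # answer matches the gold answer (meaningful for in-scope only)
    hallucinated: bool    # answer contains at least one unsupported claim
    citations_total: int  # number of citations the answer emitted
    citations_valid: int  # citations that point at a supporting passage


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal; 0.0 when the denominator is zero."""
    return round(100.0 * numerator / denominator, 1) if denominator else 0.0


def summarize(records: list[EvalRecord]) -> dict[str, float]:
    """Aggregate per-question judgments into table-style rates (assumed definitions)."""
    grounded = [r for r in records if r.in_scope]
    out_of_scope = [r for r in records if not r.in_scope]
    return {
        # Share of all answers flagged as containing an unsupported claim.
        "hallucination_rate": pct(sum(r.hallucinated for r in records), len(records)),
        # Share of in-scope questions answered correctly.
        "answer_correctness": pct(sum(r.correct for r in grounded), len(grounded)),
        # Share of emitted citations that actually support the answer.
        "citation_accuracy": pct(
            sum(r.citations_valid for r in records),
            sum(r.citations_total for r in records),
        ),
        # Share of out-of-scope/advice questions that were refused.
        "refusal_correctness": pct(sum(r.refused for r in out_of_scope), len(out_of_scope)),
    }


# Made-up toy data: two in-scope questions, two out-of-scope questions.
toy = [
    EvalRecord(True, False, True, False, 2, 2),
    EvalRecord(True, False, False, True, 2, 1),
    EvalRecord(False, True, False, False, 0, 0),
    EvalRecord(False, False, False, True, 0, 0),
]
summary = summarize(toy)
print(summary)
```

Splitting the records into grounded and out-of-scope subsets before aggregating keeps each rate's denominator explicit, which matters when a metric is only defined for one of the two question types.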