Commit History

Fix sentinel edge cases: hallucination combo guard + UI formatting
8d335e4

mbochniak01 Claude Sonnet 4.6 committed on

Replace ad-hoc refusal regexes with NOT IN DOCUMENTS sentinel
0ad5e39

mbochniak01 Claude Sonnet 4.6 committed on

Fix refusal detector missing 'not able to' and 'outside scope' phrases
7ee90da

mbochniak01 Claude Sonnet 4.6 committed on

Replace HHEM with sentence-level NLI, add claim decomposition and drift detection
ffbf46f

mbochniak01 Claude Sonnet 4.6 committed on

Fix compat, bugs, and types; expand retail KB
e181667

mbochniak01 Claude Sonnet 4.6 committed on

Switch faithfulness to text_pair encoding, promote score logging to INFO
29f3273

mbochniak01 Claude Sonnet 4.6 committed on

Fix Vectara label check and input format
5935cf6

mbochniak01 Claude Sonnet 4.6 committed on

Load T5-small tokenizer for Vectara HHEM v2
14d263b

mbochniak01 Claude Sonnet 4.6 committed on

Use T5Tokenizer directly for Vectara HHEM v2
69c362c

mbochniak01 Claude Sonnet 4.6 committed on

Fix Vectara pipeline: explicitly load tokenizer before pipeline init
86cfc1b

mbochniak01 Claude Sonnet 4.6 committed on

Load Vectara model via transformers pipeline, not CrossEncoder
a42a9e0

mbochniak01 Claude Sonnet 4.6 committed on

Add trust_remote_code=True for Vectara hallucination model
cbb4147

mbochniak01 Claude Sonnet 4.6 committed on

Switch faithfulness grader to Vectara hallucination evaluation model
eb90c62

mbochniak01 Claude Sonnet 4.6 committed on

Faithfulness: mean sentence scoring, strip chunk title prefix, lower threshold to 0.35
cd30e2d

below-threshold committed on

Fix faithfulness: score per chunk, take max entailment
7b3dadd

below-threshold committed on

Replace Anthropic with free-tier stack
ebb06ed

below-threshold committed on

Add full RAG evaluation pipeline with L1 metrics and UI
ebe934f

mbochniak01 committed on