Switch faithfulness to text_pair encoding, promote score logging to INFO 29f3273 mbochniak01 Claude Sonnet 4.6 committed 1 day ago
Add telemetry layer: in-memory counters + HF Dataset persistence c79d967 mbochniak01 Claude Sonnet 4.6 committed 1 day ago
Fix Vectara label check and input format 5935cf6 mbochniak01 Claude Sonnet 4.6 committed 3 days ago
Add /refresh-cache endpoint, bi-encoder comparison, eval results, Ollama/Prometheus notes e77a2f2 mbochniak01 Claude Sonnet 4.6 committed 3 days ago
Add joke short-circuit and reset attempt handler 27156ca mbochniak01 Claude Sonnet 4.6 committed 3 days ago
Load T5-small tokenizer for Vectara HHEM v2 14d263b mbochniak01 Claude Sonnet 4.6 committed 4 days ago
Add sentencepiece dependency for T5Tokenizer 7a72ab0 mbochniak01 Claude Sonnet 4.6 committed 4 days ago
Use T5Tokenizer directly for Vectara HHEM v2 69c362c mbochniak01 Claude Sonnet 4.6 committed 4 days ago
Fix Vectara pipeline: explicitly load tokenizer before pipeline init 86cfc1b mbochniak01 Claude Sonnet 4.6 committed 4 days ago
Pin transformers<4.46.0 to fix Vectara HHEM v2 compatibility b2eeefb mbochniak01 Claude Sonnet 4.6 committed 4 days ago
Load Vectara model via transformers pipeline, not CrossEncoder a42a9e0 mbochniak01 Claude Sonnet 4.6 committed 4 days ago
Speed up Docker builds: .dockerignore + merge model download layers 1ea03d4 mbochniak01 Claude Sonnet 4.6 committed 4 days ago
Add trust_remote_code=True for Vectara hallucination model cbb4147 mbochniak01 Claude Sonnet 4.6 committed 4 days ago
Switch faithfulness grader to Vectara hallucination evaluation model eb90c62 mbochniak01 Claude Sonnet 4.6 committed 4 days ago
Exclude pinned glossary doc from faithfulness grading context cd174a4 below-threshold committed 4 days ago
Replace enforce_terminology with pinned glossary doc in RAG context 76be5a0 below-threshold committed 4 days ago
Add enforce_terminology: deterministic post-processing corrective gate 54a5940 below-threshold committed 4 days ago
Faithfulness: mean sentence scoring, strip chunk title prefix, lower threshold to 0.35 cd30e2d below-threshold committed 4 days ago
Load all KB formats merged: drop CSV directly, no conversion needed 2a47292 below-threshold committed 7 days ago
Add Kaggle drug CSV → features.yaml conversion script 8103f33 below-threshold committed 7 days ago
UX: welcome message with example questions, specific failure details in verdict f7a25db below-threshold committed 7 days ago
Fix faithfulness: score per chunk, take max entailment 7b3dadd below-threshold committed 7 days ago
Update ARCHITECTURE.md and README.md to reflect client library and test suite 1f6dac5 mbochniak01 committed 7 days ago
Add typed client library, unit + integration tests, mypy, ruff, NOTES.md 10aced5 mbochniak01 committed 7 days ago
Add Makefile, HTML eval report generator, gitignore for reports 3c949de mbochniak01 committed 7 days ago