Commit History

Switch faithfulness to text_pair encoding, promote score logging to INFO
29f3273

mbochniak01 Claude Sonnet 4.6 committed on

Add telemetry layer: in-memory counters + HF Dataset persistence
c79d967

mbochniak01 Claude Sonnet 4.6 committed on

Fix Vectara label check and input format
5935cf6

mbochniak01 Claude Sonnet 4.6 committed on

Add /refresh-cache endpoint, bi-encoder comparison, eval results, Ollama/Prometheus notes
e77a2f2

mbochniak01 Claude Sonnet 4.6 committed on

Add joke short-circuit and reset attempt handler
27156ca

mbochniak01 Claude Sonnet 4.6 committed on

Load T5-small tokenizer for Vectara HHEM v2
14d263b

mbochniak01 Claude Sonnet 4.6 committed on

Add sentencepiece dependency for T5Tokenizer
7a72ab0

mbochniak01 Claude Sonnet 4.6 committed on

Use T5Tokenizer directly for Vectara HHEM v2
69c362c

mbochniak01 Claude Sonnet 4.6 committed on

Fix Vectara pipeline: explicitly load tokenizer before pipeline init
86cfc1b

mbochniak01 Claude Sonnet 4.6 committed on

Pin transformers<4.46.0 to fix Vectara HHEM v2 compatibility
b2eeefb

mbochniak01 Claude Sonnet 4.6 committed on

Load Vectara model via transformers pipeline, not CrossEncoder
a42a9e0

mbochniak01 Claude Sonnet 4.6 committed on

Speed up Docker builds: .dockerignore + merge model download layers
1ea03d4

mbochniak01 Claude Sonnet 4.6 committed on

Add trust_remote_code=True for Vectara hallucination model
cbb4147

mbochniak01 Claude Sonnet 4.6 committed on

Switch faithfulness grader to Vectara hallucination evaluation model
eb90c62

mbochniak01 Claude Sonnet 4.6 committed on

Exclude pinned glossary doc from faithfulness grading context
cd174a4

below-threshold committed on

Replace enforce_terminology with pinned glossary doc in RAG context
76be5a0

below-threshold committed on

Add enforce_terminology: deterministic post-processing corrective gate
54a5940

below-threshold committed on

Faithfulness: mean sentence scoring, strip chunk title prefix, lower threshold to 0.35
cd30e2d

below-threshold committed on

Pre-build KB indexes at startup, not on first query
aef9f0f

below-threshold committed on

Load all KB formats merged - drop CSV directly, no conversion needed
2a47292

below-threshold committed on

Add 15 drug profiles to pharma KB from Kaggle dataset
2a3badd

below-threshold committed on

Add Kaggle drug CSV → features.yaml conversion script
8103f33

below-threshold committed on

UX: welcome message with example questions, specific failure details in verdict
f7a25db

below-threshold committed on

Inject client terminology into system prompt
99649f6

below-threshold committed on

Fix faithfulness: score per chunk, take max entailment
7b3dadd

below-threshold committed on

Switch generation model to Llama-3-8B-Instruct
6e6032f

below-threshold committed on

Replace Anthropic with free-tier stack
ebb06ed

below-threshold committed on

Add .env and .DS_Store to gitignore
7ae3ff4

mbochniak01 committed on

Update NOTES.md
8cdbafd

mbochniak01 committed on

Update ARCHITECTURE.md and README.md to reflect client library and test suite
1f6dac5

mbochniak01 committed on

Add typed client library, unit + integration tests, mypy, ruff, NOTES.md
10aced5

mbochniak01 committed on

Add Makefile, HTML eval report generator, gitignore for reports
3c949de

mbochniak01 committed on

Add L2 batch evaluator and architecture documentation
f748b3d

mbochniak01 committed on

Add full RAG evaluation pipeline with L1 metrics and UI
ebe934f

mbochniak01 committed on

initial commit
b917936

mbochniak01 committed on