Commit History
Diversity probe sweep + V1->V4 narrative README + HF Spaces entry + DV/incomplete fixes 97e19ad
V4.3: prompt-injection audit, input length cap, per-layer ablation, unguarded baseline, limitations restored 847587d
V4.2 part 3: sycophancy guard, F-1 flag decay, incomplete-message handler d808a62
V4.2 part 1: fix broken ISSS URL, ship URL audit + Karthik data brief 8fdff5c
V4: streaming, controlled paraphrasing, support plan, voice, sweeps 655c300
Add Eval B safety supplement 433900d
Ingest Core dataset and harden router policy f046303
Prepare Core dataset intake and resource registry e143b4a
Polish peer helper and scope handling ea1618f
Add Core safety metadata and eval summaries d50d1e1
Implement EmpathRAG Core hybrid router b2f5c42
Add Karthik eval harness and safety patches a246513
Start V2.5 support navigator hardening 79a6369
Checkpoint V2 curated support navigator 15594c0
Add curated corpus integration scaffold fadd796
Start v2 safety hardening 81deeef
Clean repo: fix README with verified metrics, pin requirements, update gitignore, remove log from tracking 404da58
Mukul Rayana committed on
Add conversation memory, fix Gradio stack, improve SYSTEM_PROMPT, log human eval turns 15920d0
Mukul Rayana committed on
Fix Condition C ablation - pure FAISS order, no safety score bias - mean=0.50 4e44b55
Mukul Rayana committed on
Add ablation Condition C eval - mean alignment 0.40 vs D=0.88 d64bbe6
Mukul Rayana committed on
fix: MistralJudge schema detection for truths/claims/verdicts - matches DeepEval 3-call pattern (Day 15) ce15608
Mukul Rayana committed on
fix: MistralJudge uses from_json_schema grammar - correct structure for claim extraction and verification (Day 15) 5405e38
Mukul Rayana committed on
fix: GBNF grammar-constrained JSON sampling in MistralJudge - eliminates DeepEval JSON parse errors (Day 15) c9853ac
Mukul Rayana committed on
fix: MistralJudge JSON extraction for DeepEval faithfulness, commit Wilcoxon results (Day 15) 2668471
Mukul Rayana committed on
fix: wilcoxon sys.path for condition_a import, zero_method=pratt for binary scores (Day 15) 7fc5654
Mukul Rayana committed on
fix: Wilcoxon uses stub guardrail - tests retrieval quality on all 50 prompts not 13 (Day 15) 68b0d74
Mukul Rayana committed on
feat: DeepEval FaithfulnessMetric with Mistral judge, async_mode=False (replaces RAGAS, Day 15) d02b074
Mukul Rayana committed on
fix: guardrail dual-import path, bertscore key names, ragas reuse pipeline.llm (Day 14) 9bce0e0
Mukul Rayana committed on
eval: save adversarial results at both t=0.50 and t=0.85, reset production threshold to 0.50 (Day 14) 9f77f5f
Mukul Rayana committed on
fix: threshold sweep - calibrate DeBERTa guardrail threshold, patch eval imports (Day 14) d5f8958
Mukul Rayana committed on
feat: eval scripts - BM25 baseline, adversarial probes, BERTScore, RAGAS, Wilcoxon (Day 14) 78fc1e6
Mukul Rayana committed on
feat: Gradio demo app, DeBERTa Colab notebook, updated smoke test results (Day 13) d471138
Mukul Rayana committed on
Add eval suite: test prompts, adversarial probes, BERTScore references (Day 12) 5c84477
Mukul Rayana committed on
Add pipeline orchestrator + smoke test - 4/5 emotion predictions correct (Day 12) 8b1f355
Mukul Rayana committed on
Day 1: data pipeline, session tracker, query router, adversarial probes, Colab training notebooks bc3ba9e
Mukul Rayana committed on