tune(rag): calibrate relevance filter for raw logit scores 0110ecb remdms Claude Opus 4.6 committed on Apr 1
feat(eval): add benchmark script to collect Gemini filter ground truth e623f4f remdms Claude Sonnet 4.6 committed on Apr 1