Implement validation pipeline fixes (P1-P7) and experimental track system 28f1212 bshepp committed 4 days ago
MedGemma validation: 50-case MedQA run, TGI endpoint config, prompt improvements 1f36481 bshepp committed 5 days ago
docs: full accuracy audit, add validation framework to all docs, fix test_e2e.py, create TODO.md 9dea0ad bshepp committed 6 days ago