Commit History

Sync: compliance mapping, anti-gaming, 55 tests, mandatory stdout format, pivoting+compliance weights
c1a5935
verified

anshumanatrey committed on

Update: three-tier reasoning benchmark, real LLM scores, industry stats, pivoting score
a92d3db
verified

anshumanatrey committed on

Upload folder using huggingface_hub
4057030
verified

anshumanatrey committed on

Upload folder using huggingface_hub
97aee49
verified

anshumanatrey committed on

Upload folder using huggingface_hub
a44a9e6
verified

anshumanatrey committed on

Upload folder using huggingface_hub
02aecab
verified

anshumanatrey committed on

Upload folder using huggingface_hub
09711a8
verified

anshumanatrey committed on

Upload folder using huggingface_hub
2b85191
verified

anshumanatrey committed on