nl2sql-copilot / benchmarks / evaluate_spider.py

Commit History

feat(benchmarks): align Spider eval with config-driven Pipeline and native Safety; log per-stage trace; add CSV summary
ed681b1

Melika Kheirieh committed on

feat(benchmarks): align Spider eval with config-driven Pipeline and native Safety; log per-stage trace; add CSV summary
598536c

Melika Kheirieh committed on

fix(types): resolve mypy errors and make pytest pass
eee3f75

Melika Kheirieh committed on

style: format code with ruff
105e019

Melika Kheirieh committed on

style: format code with ruff
c1bc4eb

Melika Kheirieh committed on

Add more advanced metrics
5eeca35

Melika Kheirieh committed on

Add first benchmark
e207f41

Melika Kheirieh committed on