nl2sql-copilot / benchmarks / plot_results.py

Commit History

refactor(core): trace schema upgrade, verifier/executor sync, benchmark plot polish
e3e0ac5

Melika Kheirieh committed on

feat(bench): gold-aware EM/SM/ExecAcc + p50/p95; write per-stage means; richer plots
296a94d

Melika Kheirieh committed on

feat(bench): auto-detect latest run and plot per-stage latency + metrics summary
db1d448

Melika Kheirieh committed on

chore(factory): safely load .env via dotenv (with fallback under CI)
b21cd69

Melika Kheirieh committed on