docs: commit eval summary; clarify critic as LLM-assisted-judge; fix test imports 7624a2f MukulRay committed on Apr 21