Spaces: Nomearod/agentbench
Branch main: agentbench/agent_bench/evaluation/datasets (61.4 kB)
4 contributors
History: 8 commits
Nomearod
feat(goldens): add source_snippets to 8 FastAPI calibration items
a48afb9
10 days ago
calibration_v1.json
Safe
3.03 kB
feat(calibration): 30-item stratified calibration_v1 sample
10 days ago
k8s_golden.json
Safe
31.8 kB
feat(eval): Week 1 step 5 - 25-question K8s golden dataset + grounded_refusal fix
about 1 month ago
k8s_golden_pilot.json
Safe
8.05 kB
feat: K8s pilot corpus - 8 pages + config entry + JSON rewrite
about 1 month ago
tech_docs_golden.json
Safe
18.5 kB
feat(goldens): add source_snippets to 8 FastAPI calibration items
10 days ago