Spaces: Nomearod/agentbench
Branch: main
Path: agentbench/agent_bench/evaluation/datasets
61.4 kB
4 contributors
History: 8 commits
Nomearod
feat(goldens): add source_snippets to 8 FastAPI calibration items
a48afb9
2 months ago
calibration_v1.json
3.03 kB
feat(calibration): 30-item stratified calibration_v1 sample
2 months ago
k8s_golden.json
31.8 kB
feat(eval): Week 1 step 5 - 25-question K8s golden dataset + grounded_refusal fix
3 months ago
k8s_golden_pilot.json
8.05 kB
feat: K8s pilot corpus - 8 pages + config entry + JSON rewrite
3 months ago
tech_docs_golden.json
18.5 kB
feat(goldens): add source_snippets to 8 FastAPI calibration items
2 months ago