Spaces: Nomearod/agentbench
agentbench/docs/plans (branch: main, 780 kB)
4 contributors
History:
8 commits
Nomearod
docs(plans): judge-layer v1 implementation plan - 12 phases, ~50 tasks
171022a
about 2 months ago
| File | Size | Last commit | Updated |
| --- | --- | --- | --- |
| 2026-03-24-day1-repo-provider.md | 31.8 kB | style: fix ruff lint - import sorting, line length | 3 months ago |
| 2026-03-24-v2-implementation-plan.md | 11.8 kB | style: fix ruff lint - import sorting, line length | 3 months ago |
| 2026-03-25-v2-revised-design.md | 18.8 kB | style: fix ruff lint - import sorting, line length | 3 months ago |
| 2026-03-27-langchain-baseline.md | 40.3 kB | style: fix ruff lint - import sorting, line length | 3 months ago |
| 2026-03-30-infra-sprint-design.md | 23.8 kB | style: fix ruff lint - import sorting, line length | 3 months ago |
| 2026-03-30-infra-sprint-implementation.md | 54.3 kB | style: fix ruff lint - import sorting, line length | 3 months ago |
| 2026-03-31-security-hardening-design.md | 14.5 kB | style: fix ruff lint - import sorting, line length | 3 months ago |
| 2026-03-31-security-hardening-implementation.md | 70.9 kB | style: fix ruff lint - import sorting, line length | 3 months ago |
| 2026-04-10-showcase-ui-design.md | 17.4 kB | style: fix ruff lint - import sorting, line length | 3 months ago |
| 2026-04-10-sse-stage-events-implementation.md | 54.4 kB | style: fix ruff lint - import sorting, line length | 3 months ago |
| 2026-04-12-multi-corpus-refactor-design.md | 20.9 kB | docs: multi-corpus refactor design | 3 months ago |
| 2026-04-12-multi-corpus-refactor-implementation.md | 52.4 kB | docs: multi-corpus refactor implementation plan | 3 months ago |
| 2026-04-15-owasp-llm-top-10-mapping-design.md | 39 kB | docs(plan): Part A design self-review fixes (LLM02 consistency, anti-padding template, paired-review gate) | 2 months ago |
| 2026-04-15-owasp-llm-top-10-mapping-implementation.md | 58.7 kB | docs(plan): add Part A OWASP mapping implementation plan | 2 months ago |
| 2026-05-04-judge-layer-v1-design.md | 46.5 kB | docs(plans): judge-layer v1 design - supersede continuous-scale judges with discrete-anchored 2-judge jury + κ-validated calibration | about 2 months ago |
| 2026-05-04-judge-layer-v1-implementation.md | 225 kB | docs(plans): judge-layer v1 implementation plan - 12 phases, ~50 tasks | about 2 months ago |