agentbench / agent_bench /core /provider.py

Commit History

fix(judges,calibration,harness): three Codex adversarial-review findings
226b6f4

Nomearod Claude Opus 4.7 (1M context) commited on

chore(eval): pin gpt-4o-mini snapshot + wire fastapi golden_dataset + pre-commit tolerances
5c1f49f

Nomearod Claude Opus 4.6 (1M context) commited on

feat: infrastructure sprint β€” vLLM/Modal, Helm, Terraform (#8)
a9d4375

Jane Yeung Claude Opus 4.6 (1M context) commited on

feat: Anthropic Haiku benchmark + README with provider comparison
ade4c8b

Nomearod Claude Opus 4.6 (1M context) commited on

feat: implement Anthropic Claude provider
077b821

Nomearod Claude Opus 4.6 (1M context) commited on

feat: add streaming responses via SSE endpoint
7c40db3

Nomearod Claude Opus 4.6 (1M context) commited on

feat: add provider retry with backoff and API rate limiting
871820a

Nomearod Claude Opus 4.6 (1M context) commited on

fix: handle OpenAI rate limit / quota errors as 503 instead of 500
2a4cb78

Nomearod Claude Opus 4.6 (1M context) commited on

fix: address Day 1 audit findings
e9173a5

Nomearod Claude Opus 4.6 (1M context) commited on

feat: Day 1 β€” repo scaffolding, provider abstraction, config, tests
ef5d585

Nomearod Claude Opus 4.6 (1M context) commited on