docs: defer HF Space rename — outstanding applications reference current URL
Revert the HF URL update from commit 086ad86 (4 references in
README.md Live Demo section). The tagline reframe and rename
decision itself stand — keep repo name agent-bench, reframe via
tagline — but the associated HF Space rename from
Nomearod/agentbench to Nomearod/agent-bench cannot happen this
week. Job applications submitted the preceding week reference the
current URL (nomearod-agentbench.hf.space), and HF Spaces does not
redirect renamed Spaces; renaming now would break the inbound links
on every live application. The rename can land cleanly in about a
week, once the application reference window expires.
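
To illustrate why the rename changes every inbound link, here is a minimal sketch of how the direct Space URL appears to follow from the Space ID. The normalization rule (lowercase, non-alphanumeric runs collapsed to hyphens) is an assumption inferred from the pair named in this commit (`Nomearod/agentbench` and `nomearod-agentbench.hf.space`), not a documented Hugging Face API, and `space_subdomain_url` is a hypothetical helper.

```python
import re


def space_subdomain_url(space_id: str) -> str:
    """Approximate the direct URL for a Space ID of the form 'owner/name'.

    Assumption: the subdomain is the lowercased 'owner-name' with any run of
    non-alphanumeric characters collapsed to a single hyphen.
    """
    owner, name = space_id.split("/", 1)
    slug = re.sub(r"[^a-z0-9]+", "-", f"{owner}-{name}".lower()).strip("-")
    return f"https://{slug}.hf.space"


current_url = space_subdomain_url("Nomearod/agentbench")
renamed_url = space_subdomain_url("Nomearod/agent-bench")

# The two hosts differ, so links to the current host stop resolving
# after a rename unless something redirects them.
links_would_break = current_url != renamed_url
```

Under that assumption the current ID maps to the URL already in the README, while the renamed ID maps to a different host, which is the breakage described above.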
Update DECISIONS.md parallel-tracks item #5 closure entry to
explicitly note the deferral reason and the follow-up commit that
will switch the URL references once the rename happens. The
rename-decision itself (reframe vs. rename-to-refusal-bench) is
still closed; only the HF consistency fix is deferred.
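
The deferred follow-up commit amounts to a plain string swap across the files that reference the URL. A rough sketch, assuming the post-rename host will be `nomearod-agent-bench.hf.space` (not confirmed by this commit) and using a hypothetical helper name:

```python
OLD_HOST = "nomearod-agentbench.hf.space"   # current host, from this commit
NEW_HOST = "nomearod-agent-bench.hf.space"  # assumption: host after the rename


def swap_url_references(text: str, old: str = OLD_HOST, new: str = NEW_HOST) -> tuple[str, int]:
    """Return the text with every `old` replaced by `new`, plus the number of swaps."""
    count = text.count(old)
    return text.replace(old, new), count


# Stand-in for the README Live Demo section (4 references, as in this commit).
sample_readme = "\n".join([
    "**https://nomearod-agentbench.hf.space**",
    "curl -X POST https://nomearod-agentbench.hf.space/ask",
    "curl -X POST https://nomearod-agentbench.hf.space/ask",
    "curl https://nomearod-agentbench.hf.space/health",
])

updated_readme, swapped = swap_url_references(sample_readme)
```

Returning the count makes it easy to confirm that the follow-up touches exactly the references this commit reverted.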
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DECISIONS.md +12 -3
- README.md +4 -4
--- a/DECISIONS.md
+++ b/DECISIONS.md
@@ -1345,9 +1345,18 @@ and decision criteria before measuring.
 honest-evaluation positioning without the rename cost:
 > "A RAG benchmark built from primitives, with honest
 > evaluation of retrieval, refusal, and grounded citation."
-HF Space `Nomearod/agentbench`
-for
-
+HF Space rename (`Nomearod/agentbench` → `Nomearod/agent-bench`
+for GitHub-name consistency) is a separate, smaller follow-up
+deferred approximately one week. Reason: several job
+applications submitted the preceding week reference the current
+HF URL (`nomearod-agentbench.hf.space`); renaming the Space now
+would break those inbound links with no HF-side redirect. The
+rename absorbs cleanly once the application wave lands and the
+reference window expires. Until then the README, dashboard, and
+DECISIONS.md continue to reference the current `agentbench` URL;
+launch-adjacent work (Post #1, screenshots, cold-start measure)
+uses the current URL and will be updated in a single small
+follow-up commit when the rename happens.
 
 6. **OpenAI snapshot drift bisection.** Mar 25 → Apr 12 P@5 slide;
 the model pin at `77017db` (`gpt-4o-mini-2024-07-18`) removed
--- a/README.md
+++ b/README.md
@@ -45,21 +45,21 @@ API providers are directly comparable (same config). The self-hosted row uses `m
 
 ## Live Demo
 
-**https://nomearod-
+**https://nomearod-agentbench.hf.space** (Hugging Face Spaces — first request after idle may take ~30s for cold start)
 
 ```bash
 # In-scope question (expect answer with sources)
-curl -X POST https://nomearod-
+curl -X POST https://nomearod-agentbench.hf.space/ask \
 -H "Content-Type: application/json" \
 -d '{"question": "How do I define a path parameter in FastAPI?"}'
 
 # Out-of-scope question (expect grounded refusal)
-curl -X POST https://nomearod-
+curl -X POST https://nomearod-agentbench.hf.space/ask \
 -H "Content-Type: application/json" \
 -d '{"question": "How do I cook pasta?"}'
 
 # Health check
-curl https://nomearod-
+curl https://nomearod-agentbench.hf.space/health
 ```
 
 ## Quick Start (Local)
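
The curl checks in the README hunk can also be sketched as a small Python client. The base URL and the `/ask` and `/health` paths come from the diff; the helper names and the 60-second timeout (headroom over the roughly 30-second cold start) are assumptions made for this sketch.

```python
import json
import urllib.request

BASE_URL = "https://nomearod-agentbench.hf.space"  # current Space URL, from the diff
COLD_START_TIMEOUT_S = 60  # assumption: headroom over the ~30s cold start


def build_ask_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (without sending) the POST /ask request shown in the README."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (without sending) the GET /health request shown in the README."""
    return urllib.request.Request(f"{base_url}/health", method="GET")


def send(request: urllib.request.Request, timeout: float = COLD_START_TIMEOUT_S) -> tuple[int, str]:
    """Send a request and return (status code, response body as text)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")
```

Calling `send(build_health_request())` performs the health check, and `send(build_ask_request("How do I cook pasta?"))` exercises the out-of-scope path. Taking `base_url` as a parameter means the later URL switch needs only one constant changed.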