Nomearod Claude Opus 4.6 (1M context) committed on
Commit 5d4b3fe · 1 Parent(s): 086ad86

docs: defer HF Space rename — outstanding applications reference current URL

Revert the HF URL update from commit 086ad86 (4 references in
README.md Live Demo section). The tagline reframe and rename
decision itself stand — keep repo name agent-bench, reframe via
tagline — but the associated HF Space rename from
Nomearod/agentbench to Nomearod/agent-bench cannot happen this
week. Job applications submitted the preceding week reference the
current URL (nomearod-agentbench.hf.space), and HF Spaces does not
redirect renamed Spaces; renaming now would break the inbound links
on every live application. Rename absorbs cleanly in about a week
once the application reference window expires.

Update DECISIONS.md parallel-tracks item #5 closure entry to
explicitly note the deferral reason and the follow-up commit that
will switch the URL references once the rename happens. The
rename-decision itself (reframe vs. rename-to-refusal-bench) is
still closed; only the HF consistency fix is deferred.
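
A minimal sketch of what the deferred follow-up commit could run once the Space is actually renamed. The helper name `switch_space_url` is hypothetical, and the new host `nomearod-agent-bench.hf.space` is an assumption taken from the README lines this commit reverts; the file list mirrors the two files touched here.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repo): rewrite every reference to the
# current Space host in the given files after the rename has happened.
set -euo pipefail

switch_space_url() {
  local old_host="$1" new_host="$2"
  shift 2
  # Escape dots so sed matches the host name literally.
  local pattern="${old_host//./\\.}"
  local file tmp
  for file in "$@"; do
    tmp="$(mktemp)"
    sed "s/${pattern}/${new_host}/g" "$file" > "$tmp"
    # Write back in place so the original file mode is preserved.
    cat "$tmp" > "$file"
    rm -f "$tmp"
  done
}

# Assumed usage, from the repo root, only after the Space rename:
# switch_space_url nomearod-agentbench.hf.space nomearod-agent-bench.hf.space README.md DECISIONS.md
```

Running `git diff --stat` afterwards should show only URL lines changed, which keeps the follow-up commit as small as the message above promises.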

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Files changed (2)
  1. DECISIONS.md +12 -3
  2. README.md +4 -4
DECISIONS.md CHANGED
@@ -1345,9 +1345,18 @@ and decision criteria before measuring.
  honest-evaluation positioning without the rename cost:
  > "A RAG benchmark built from primitives, with honest
  > evaluation of retrieval, refusal, and grounded citation."
- HF Space `Nomearod/agentbench` renamed to `Nomearod/agent-bench`
- for consistency with GitHub repo name at the same time (absorbs
- the HF URL break before the first LinkedIn post links out).
+ HF Space rename (`Nomearod/agentbench` `Nomearod/agent-bench`
+ for GitHub-name consistency) is a separate, smaller follow-up
+ deferred approximately one week. Reason: several job
+ applications submitted the preceding week reference the current
+ HF URL (`nomearod-agentbench.hf.space`); renaming the Space now
+ would break those inbound links with no HF-side redirect. The
+ rename absorbs cleanly once the application wave lands and the
+ reference window expires. Until then the README, dashboard, and
+ DECISIONS.md continue to reference the current `agentbench` URL;
+ launch-adjacent work (Post #1, screenshots, cold-start measure)
+ uses the current URL and will be updated in a single small
+ follow-up commit when the rename happens.
 
  6. **OpenAI snapshot drift bisection.** Mar 25 → Apr 12 P@5 slide;
  the model pin at `77017db` (`gpt-4o-mini-2024-07-18`) removed
README.md CHANGED
@@ -45,21 +45,21 @@ API providers are directly comparable (same config). The self-hosted row uses `m
 
  ## Live Demo
 
- **https://nomearod-agent-bench.hf.space** (Hugging Face Spaces — first request after idle may take ~30s for cold start)
+ **https://nomearod-agentbench.hf.space** (Hugging Face Spaces — first request after idle may take ~30s for cold start)
 
  ```bash
  # In-scope question (expect answer with sources)
- curl -X POST https://nomearod-agent-bench.hf.space/ask \
+ curl -X POST https://nomearod-agentbench.hf.space/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I define a path parameter in FastAPI?"}'
 
  # Out-of-scope question (expect grounded refusal)
- curl -X POST https://nomearod-agent-bench.hf.space/ask \
+ curl -X POST https://nomearod-agentbench.hf.space/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I cook pasta?"}'
 
  # Health check
- curl https://nomearod-agent-bench.hf.space/health
+ curl https://nomearod-agentbench.hf.space/health
  ```
 
  ## Quick Start (Local)