
Workspace Guide ("What lives where")

This is the single orientation page for new contributors.

1) Production surface

Use these when you want real user-facing behavior:

  • Community agent/tooling

    • Card: .fast-agent/tool-cards/hf_hub_community.md
    • Backend function tool: .fast-agent/tool-cards/hf_api_tool.py
    • Focus: Hub users/orgs/discussions/collections/activity API workflows
  • Papers search agent/tooling

    • Card: .fast-agent/tool-cards/hf_paper_search.md
    • Backend function tool: .fast-agent/tool-cards/hf_papers_tool.py
    • Focus: /api/daily_papers filtering and retrieval
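The daily-papers endpoint above can be queried directly for a quick sanity check. A minimal sketch follows; the base URL comes from this guide, but the `date` query parameter and the nested `paper.title` response field are assumptions about the endpoint's shape, not taken from `hf_papers_tool.py`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://huggingface.co/api/daily_papers"


def build_url(date=None):
    # `date` (ISO string, e.g. "2024-05-01") is an assumed filter parameter.
    return f"{BASE_URL}?{urlencode({'date': date})}" if date else BASE_URL


def paper_titles(entries):
    # Assumes each response entry nests a "paper" object with a "title" field.
    return [e.get("paper", {}).get("title") for e in entries]


def fetch_daily_papers(date=None):
    # Live call; requires network access.
    with urlopen(build_url(date)) as resp:
        return json.load(resp)
```

The real tool card/backend in `.fast-agent/tool-cards/` remains the authoritative implementation; this is only for poking at the raw API.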

2) Eval inputs (challenge sets)

  • scripts/hf_hub_community_challenges.txt
  • scripts/hf_hub_community_coverage_prompts.json
  • scripts/tool_routing_challenges.txt
  • scripts/tool_routing_expected.json
  • scripts/tool_description_variants.json

These are the canonical prompt sets/configs used for reproducible scoring.
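A minimal loader sketch for these inputs, assuming the `.txt` packs hold one prompt per non-empty line; each JSON file's actual schema is defined by the script that consumes it.

```python
import json
from pathlib import Path


def load_prompt_pack(path):
    # Assumed layout: one challenge prompt per non-empty line.
    return [line.strip() for line in Path(path).read_text().splitlines() if line.strip()]


def load_eval_config(path):
    # e.g. scripts/tool_routing_expected.json; schema is script-defined.
    return json.loads(Path(path).read_text())
```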


3) Eval execution scripts

  • scripts/score_hf_hub_community_challenges.py

    • Runs + scores the community challenge pack.
  • scripts/score_hf_hub_community_coverage.py

    • Runs + scores endpoint-coverage prompts that avoid overlap with the core challenge pack.
  • scripts/score_tool_routing_confusion.py

    • Scores tool-routing quality for a single model.
  • scripts/run_tool_routing_batch.py

    • Runs the routing eval across many models + writes an aggregate summary.
  • scripts/eval_tool_description_ab.py

    • A/B tests tool-description variants across models.
  • scripts/eval_hf_hub_prompt_ab.py

    • A/B compares prompt/card variants using both challenge and coverage packs, with summary plots.
  • scripts/plot_tool_description_eval.py

    • Generates plots from A/B summary CSV.
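All of the above are plain Python entry points, so a batch of runs can be scripted. The sketch below shows one way to do that; the script paths are from this guide, but no CLI flags are shown because their argument parsing is not documented here.

```python
import subprocess
import sys


def run_eval_script(path, *args):
    """Run one eval script with the current interpreter and return its exit code.

    `args` is a hypothetical pass-through for whatever flags the script accepts.
    """
    return subprocess.run([sys.executable, path, *args]).returncode


# Example (assumes the repo checkout is the working directory):
# run_eval_script("scripts/score_hf_hub_community_challenges.py")
```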

4) Eval outputs (results)

  • Community challenge reports:

    • docs/hf_hub_community_challenge_report.md
    • docs/hf_hub_community_challenge_report.json
  • Tool routing results:

    • docs/tool_routing_eval/
  • Tool description A/B outputs:

    • docs/tool_description_eval/

5) Instructions / context docs

  • docs/hf_hub_community_challenge_pack.md
  • docs/tool_description_eval_setup.md
  • docs/tool_description_eval/tool_description_interpretation.md
  • bench.md

6) Suggested newcomer workflow

  1. Read this file + top-level README.md.
  2. Run one production query for each agent.
  3. Run one scoring script (community or routing).
  4. Inspect the generated markdown report in docs/.
  5. Only then edit tool cards or script logic.

7) Results at a glance

  • docs/RESULTS.md is the index page for all generated reports and plots.