# Workspace Guide ("What lives where")

This is the single orientation page for new contributors.
## 1) Production surface

Use these when you want real user-facing behavior:

- **Community agent/tooling**
  - Card: `.fast-agent/tool-cards/hf_hub_community.md`
  - Backend function tool: `.fast-agent/tool-cards/hf_api_tool.py`
  - Focus: Hub users/orgs/discussions/collections/activity API workflows
- **Papers search agent/tooling**
  - Card: `.fast-agent/tool-cards/hf_paper_search.md`
  - Backend function tool: `.fast-agent/tool-cards/hf_papers_tool.py`
  - Focus: `/api/daily_papers` filtering and retrieval
---
## 2) Eval inputs (challenge sets)

- `scripts/hf_hub_community_challenges.txt`
- `scripts/hf_hub_community_coverage_prompts.json`
- `scripts/tool_routing_challenges.txt`
- `scripts/tool_routing_expected.json`
- `scripts/tool_description_variants.json`

These are the canonical prompt sets/configs used for reproducible scoring.
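A hypothetical sketch of how these inputs might be consumed: it assumes the `.txt` packs are one prompt per line and that the expected-routing JSON maps each prompt to a tool name. Both assumptions are illustrative only — the scoring scripts define the real schema.

```python
# Sketch of pairing a prompt pack with an expected-routing map.
# ASSUMPTION: .txt = one prompt per line; JSON = {prompt: expected_tool}.
import io
import json

# Inline stand-ins for scripts/tool_routing_challenges.txt and
# scripts/tool_routing_expected.json (contents are made up for illustration).
challenges_txt = io.StringIO("list trending orgs\nfind papers about RLHF\n")
expected_json = (
    '{"list trending orgs": "hf_api_tool",'
    ' "find papers about RLHF": "hf_papers_tool"}'
)

prompts = [line.strip() for line in challenges_txt if line.strip()]
expected = json.loads(expected_json)

for p in prompts:
    print(p, "->", expected.get(p, "?"))
```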
---
## 3) Eval execution scripts

- `scripts/score_hf_hub_community_challenges.py`
  - Runs + scores the community challenge pack.
- `scripts/score_hf_hub_community_coverage.py`
  - Runs + scores endpoint-coverage prompts that avoid overlap with the core challenge pack.
- `scripts/score_tool_routing_confusion.py`
  - Scores tool-routing quality for a single model.
- `scripts/run_tool_routing_batch.py`
  - Runs the routing eval across many models + creates an aggregate summary.
- `scripts/eval_tool_description_ab.py`
  - A/B tests tool-description variants across models.
- `scripts/eval_hf_hub_prompt_ab.py`
  - A/B compares prompt/card variants using both the challenge and coverage packs, with summary plots.
- `scripts/plot_tool_description_eval.py`
  - Generates plots from the A/B summary CSV.
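The batch runner's fan-out can be pictured as building one single-model scoring command per model. This is a sketch only: the `--model` flag is an assumption for illustration, and `run_tool_routing_batch.py` defines the real interface.

```python
# Sketch of fanning a single-model scorer out over a model list.
# ASSUMPTION: the scorer accepts a --model flag; verify against the script.
MODELS = ["model-a", "model-b"]

def routing_cmd(model: str) -> list[str]:
    """Build the (hypothetical) command line for one model's routing eval."""
    return ["python", "scripts/score_tool_routing_confusion.py", "--model", model]

cmds = [routing_cmd(m) for m in MODELS]
# Each cmd could then be executed with subprocess.run(cmd, check=True),
# after which the batch script would aggregate the per-model results.
```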
---
## 4) Eval outputs (results)

- Community challenge reports:
  - `docs/hf_hub_community_challenge_report.md`
  - `docs/hf_hub_community_challenge_report.json`
- Tool routing results:
  - `docs/tool_routing_eval/`
- Tool description A/B outputs:
  - `docs/tool_description_eval/`

---
## 5) Instructions / context docs

- `docs/hf_hub_community_challenge_pack.md`
- `docs/tool_description_eval_setup.md`
- `docs/tool_description_eval/tool_description_interpretation.md`
- `bench.md`

---
## 6) Suggested newcomer workflow

1. Read this file + the top-level `README.md`.
2. Run one production query for each agent.
3. Run one scoring script (community or routing).
4. Inspect the generated markdown report in `docs/`.
5. Only then edit tool cards or script logic.
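Step 4 amounts to reading the report a scoring script wrote into `docs/`. A tiny sanity check for the community report path listed in section 4 (the path is taken from this guide; the fallback message is illustrative):

```python
# Skim the top of the community challenge report, if it has been generated.
from pathlib import Path

report = Path("docs/hf_hub_community_challenge_report.md")
if report.exists():
    print(report.read_text()[:500])  # first ~500 chars of the report
else:
    print("Run a scoring script first (section 3).")
```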
---

## 7) Results at a glance

- `docs/RESULTS.md` is the index page for all generated reports and plots.