# Workspace Guide ("What lives where")

This is the single orientation page for new contributors.

## 1) Production surface

Use these when you want real user-facing behavior:

- **Community agent/tooling**
  - Card: `.fast-agent/tool-cards/hf_hub_community.md`
  - Backend function tool: `.fast-agent/tool-cards/hf_api_tool.py`
  - Focus: Hub users/orgs/discussions/collections/activity API workflows
- **Papers search agent/tooling**
  - Card: `.fast-agent/tool-cards/hf_paper_search.md`
  - Backend function tool: `.fast-agent/tool-cards/hf_papers_tool.py`
  - Focus: `/api/daily_papers` filtering and retrieval

---

## 2) Eval inputs (challenge sets)

- `scripts/hf_hub_community_challenges.txt`
- `scripts/hf_hub_community_coverage_prompts.json`
- `scripts/tool_routing_challenges.txt`
- `scripts/tool_routing_expected.json`
- `scripts/tool_description_variants.json`

These are the canonical prompt sets and configs used for reproducible scoring.

---

## 3) Eval execution scripts

- `scripts/score_hf_hub_community_challenges.py`
  - Runs and scores the community challenge pack.
- `scripts/score_hf_hub_community_coverage.py`
  - Runs and scores endpoint-coverage prompts that avoid overlap with the core challenge pack.
- `scripts/score_tool_routing_confusion.py`
  - Scores tool-routing quality for a single model.
- `scripts/run_tool_routing_batch.py`
  - Runs the routing eval across many models and produces an aggregate summary.
- `scripts/eval_tool_description_ab.py`
  - A/B tests tool-description variants across models.
- `scripts/eval_hf_hub_prompt_ab.py`
  - A/B compares prompt/card variants using both the challenge and coverage packs, with summary plots.
- `scripts/plot_tool_description_eval.py`
  - Generates plots from the A/B summary CSV.
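The papers tooling in section 1 centers on filtering `/api/daily_papers` results. As a rough, hedged illustration of that kind of post-filtering (the field names `upvotes` and `publishedAt` and the helper `filter_papers` are assumptions for this sketch, not taken from the actual tool):

```python
def filter_papers(papers, min_upvotes=0, since=None):
    """Filter daily-papers-style records (hypothetical field names)."""
    kept = []
    for p in papers:
        # Drop papers below the upvote threshold.
        if p.get("upvotes", 0) < min_upvotes:
            continue
        # Drop papers published before the cutoff date (ISO strings compare lexically).
        if since and p.get("publishedAt", "") < since:
            continue
        kept.append(p)
    return kept

sample = [
    {"title": "A", "upvotes": 12, "publishedAt": "2024-05-01"},
    {"title": "B", "upvotes": 2, "publishedAt": "2024-04-01"},
]
print([p["title"] for p in filter_papers(sample, min_upvotes=5)])  # → ['A']
```

For real behavior, use the production tool card and `hf_papers_tool.py` rather than this sketch.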
---

## 4) Eval outputs (results)

- Community challenge reports:
  - `docs/hf_hub_community_challenge_report.md`
  - `docs/hf_hub_community_challenge_report.json`
- Tool routing results:
  - `docs/tool_routing_eval/`
- Tool description A/B outputs:
  - `docs/tool_description_eval/`

---

## 5) Instructions / context docs

- `docs/hf_hub_community_challenge_pack.md`
- `docs/tool_description_eval_setup.md`
- `docs/tool_description_eval/tool_description_interpretation.md`
- `bench.md`

---

## 6) Suggested newcomer workflow

1. Read this file and the top-level `README.md`.
2. Run one production query for each agent.
3. Run one scoring script (community or routing).
4. Inspect the generated markdown report in `docs/`.
5. Only then edit tool cards or script logic.

---

## 7) Results at a glance

- `docs/RESULTS.md` is the index page for all generated reports and plots.
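Step 4 of the newcomer workflow is inspecting a generated report. As a minimal sketch of tallying a JSON report like `docs/hf_hub_community_challenge_report.json` (the schema here, a `results` list of `{"passed": bool}` entries, is a hypothetical assumption; check the real file's structure first):

```python
def summarize_report(report):
    """Tally pass/fail counts from a report dict (hypothetical schema)."""
    results = report.get("results", [])
    passed = sum(1 for r in results if r.get("passed"))
    return {
        "total": len(results),
        "passed": passed,
        "pass_rate": passed / len(results) if results else 0.0,
    }

demo = {"results": [{"id": 1, "passed": True}, {"id": 2, "passed": False}]}
print(summarize_report(demo))  # → {'total': 2, 'passed': 1, 'pass_rate': 0.5}
```

The markdown report in `docs/` is the readable counterpart; a summary like this is only useful for quick scripting over the JSON output.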