evalstate HF Staff
sync: promote hf_hub_community prompt v3 + add prompt/coverage harness

AGENTS.md

Repository Orientation

This repository has two main purposes:

  1. Production-facing agents/tools for Hugging Face workflows
  2. Evaluation harnesses (prompts, runners, scoring, reports, plots)

1) Production Surface

Use these for real user-facing behavior:

  • Hub Community agent/tooling

    • Card: .fast-agent/tool-cards/hf_hub_community.md
    • Tool backend: .fast-agent/tool-cards/hf_api_tool.py
    • Focus: users/orgs/followers/discussions/collections/recent activity workflows
  • Daily Papers search agent/tooling

    • Card: .fast-agent/tool-cards/hf_paper_search.md
    • Tool backend: .fast-agent/tool-cards/hf_papers_tool.py
    • Focus: /api/daily_papers retrieval + filtering
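The retrieval + filtering step can be sketched roughly as follows. This is a minimal sketch, not the actual hf_papers_tool.py implementation: the `paper`/`title`/`summary` field names reflect the public /api/daily_papers payload, and `keyword_filter` is a hypothetical helper.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://huggingface.co/api/daily_papers"

def fetch_daily_papers(date=None, timeout=10.0):
    """Fetch the daily papers feed; `date` (YYYY-MM-DD) is optional."""
    query = f"?{urlencode({'date': date})}" if date else ""
    with urlopen(API_URL + query, timeout=timeout) as resp:
        return json.load(resp)

def keyword_filter(entries, keyword):
    """Keep entries whose title or summary mentions `keyword` (case-insensitive)."""
    kw = keyword.lower()
    hits = []
    for entry in entries:
        paper = entry.get("paper", {})
        text = f"{paper.get('title', '')} {paper.get('summary', '')}".lower()
        if kw in text:
            hits.append(entry)
    return hits
```

The production card/backend defines the real query surface; this only shows the shape of the workflow.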

2) Evaluation Inputs

Canonical challenge/config files:

  • scripts/hf_hub_community_challenges.txt
  • scripts/tool_routing_challenges.txt
  • scripts/tool_routing_expected.json
  • scripts/tool_description_variants.json
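The exact file formats live in the scripts themselves. Assuming tool_routing_challenges.txt holds one challenge per line and tool_routing_expected.json maps each challenge to an expected tool name (both assumptions, not confirmed by this doc), a quick consistency check between the two inputs could look like:

```python
from pathlib import Path

def load_challenges(path: Path) -> list[str]:
    """One non-empty challenge per line; '#'-prefixed lines skipped (assumed convention)."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [ln.strip() for ln in lines if ln.strip() and not ln.startswith("#")]

def missing_expected(challenges: list[str], expected: dict[str, str]) -> list[str]:
    """Return challenges that have no expected-tool entry in the JSON map."""
    return [c for c in challenges if c not in expected]
```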

3) Evaluation Runners / Scorers

  • scripts/score_hf_hub_community_challenges.py

    • Runs + scores the HF Hub community challenge pack
  • scripts/score_tool_routing_confusion.py

    • Scores routing/confusion quality for one model
  • scripts/run_tool_routing_batch.py

    • Batch wrapper for routing eval across multiple models
  • scripts/eval_tool_description_ab.py

    • A/B evaluation of tool description variants
  • scripts/plot_tool_description_eval.py

    • Plot/interpretation generation from summary outputs
  • scripts/run_all_evals.sh

    • Convenience orchestrator for the full evaluation flow
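The batch wrapper's pattern — run the same scorer across several models and collect one summary per model — can be sketched generically. The scorer below is a stand-in, not the actual score_tool_routing_confusion.py interface:

```python
from typing import Callable

def run_batch(models: list[str], score_model: Callable[[str], dict]) -> dict[str, dict]:
    """Run one scorer pass per model, keyed by model id."""
    return {model: score_model(model) for model in models}

def headline_metric(results: dict[str, dict], key: str = "accuracy") -> dict[str, float]:
    """Pull a single headline number per model for a top-level index."""
    return {model: summary[key] for model, summary in results.items()}
```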

4) Evaluation Outputs

  • Community challenge reports:

    • docs/hf_hub_community_challenge_report.md
    • docs/hf_hub_community_challenge_report.json
  • Routing evaluation outputs:

    • docs/tool_routing_eval/
  • Tool-description A/B outputs:

    • docs/tool_description_eval/

Top-level result index:

  • docs/RESULTS.md

5) Key Context Docs

  • README.md (quick start + layout)
  • docs/SPACE.md (workspace map)
  • docs/hf_hub_community_challenge_pack.md
  • docs/tool_description_eval_setup.md
  • docs/tool_description_eval/tool_description_interpretation.md
  • bench.md

Suggested First Steps for New Contributors

  1. Read README.md and docs/SPACE.md
  2. Run one production query for each tool card
  3. Run one eval script
  4. Open generated report(s) in docs/
  5. Only then edit cards/scripts, with that context in hand

Space Deployment / Sync (HF CLI)

This project is hosted on Hugging Face Spaces at:

  • https://huggingface.co/spaces/evalstate/hf-papers/

When publishing card/script updates, use the hf CLI (not ad-hoc manual edits) to keep deployment reproducible.

Typical flow:

  1. Authenticate:
    • hf auth login
  2. Work in the local repo and validate changes.
  3. Push updates to the Space repo with hf CLI workflows (e.g., clone/upload/commit via hf commands) targeting:
    • spaces/evalstate/hf-papers
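The push step can also be expressed via the huggingface_hub Python API, which the hf CLI wraps. A sketch of syncing the tool cards to the Space (assumes `hf auth login` has already stored a write token; the commit message is an example):

```python
from huggingface_hub import HfApi

api = HfApi()  # picks up the token stored by `hf auth login`
api.upload_folder(
    folder_path=".fast-agent/tool-cards",
    path_in_repo=".fast-agent/tool-cards",
    repo_id="evalstate/hf-papers",
    repo_type="space",  # target the Space repo, not a model/dataset repo
    commit_message="sync tool cards",
)
```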

Keep production card changes (.fast-agent/tool-cards/) and related eval/report updates in sync when publishing.