Deploy latest local hf-hub-query version
- _monty_codegen_shared.md +204 -6
- hf-hub-query.md +3 -3
- monty_api_tool_v2.py +421 -289
_monty_codegen_shared.md
CHANGED
@@ -2,6 +2,7 @@
 - No imports.
 - Helper functions are already in scope.
 - All helper calls are async: always use `await`.
+- `max_calls` is the overall external-call budget for the generated program, not a generic helper argument. You may use it inside `solve(...)` to bound loops or choose a cheaper fallback strategy, but do not pass it to helpers unless the helper signature explicitly includes `max_calls`.
 - Before sending the tool call, check that the wrapper both defines `solve(...)` and ends with `await solve(query, max_calls)`.
 - Use helper functions first. Use raw `call_api('/api/...')` only if no helper fits.
 - `call_api` must use a raw path starting with `/api/...`.
@@ -11,11 +12,13 @@
 - When the user asks for specific fields or "return only ...", return exactly that final shape from `solve(...)` instead of a larger helper envelope.
 - For bounded list/sample helpers in raw mode, prefer returning the helper envelope directly when coverage/limit metadata matters.
 - For detail lookups, prefer returning a compact dict of relevant fields rather than the full raw helper response.
+- For structured requests, prefer compact JSON objects/arrays over prose or markdown tables. Use the user's requested field names when they are explicit, use the stable key names shown below when they are not, and omit unavailable fields unless the user explicitly asked for a fixed schema with nulls.
+- For prompts that ask for both a sample/list and metadata, keep the sample compact and surface helper-owned metadata explicitly. Do not dump a very large item list just because metadata was requested.
+- If the user asks for coverage/exactness/truncation metadata, prefer helper `meta` fields such as `exact_count`, `sample_complete`, `truncated`, `count_source`, `returned`, `total`, `total_matched`, `more_available`, and applied/requested limits.
+- When the user asks for a sample plus metadata and does not specify a large sample size, default to a small sample (typically 10-20 rows) rather than returning hundreds of rows.
+- Do not ask the user for their username just because they said "my" or "me". Use current-user helper behavior first.
+- For current-user prompts, prefer helpers that support `username=None` for the authenticated user. Call `hf_whoami()` first when you need the explicit username for joins, comparisons, or output labeling.
+- Only ask a follow-up for identity if `hf_whoami()` or a current-user helper fails because authentication/current-user resolution is unavailable.
 
 ## Helper result shape
 All helpers return:
@@ -35,6 +38,38 @@ Rules:
 - `meta` contains helper-owned execution and coverage metadata. For bounded list/sample helpers this can include requested/applied limits, whether a default limit was used, exactness/completeness, whether more rows may be available, truncation cause, and a next-request hint.
 - Helpers return rich default rows. Use `fields` to narrow output; use `advanced` only when you truly need backend-specific behavior beyond the default row.
 - Exhaustive helpers such as graph/members/likes/activity can return substantially more than 100 rows when you request a larger `return_limit`; use helper `meta` (and the outer raw `meta.limit_summary`) to tell when limits were still hit.
+- For metadata-oriented prompts, read and return the helper `meta` object (or a compact subset of it) instead of inferring coverage from list length alone.
+
+## Typed graph shorthand
+Use this graph mental model instead of reconstructing relations from raw endpoints.
+
+```text
+Node types
+- U = user row {username, fullname, isPro, ...}
+- O = org row {organization, displayName, followers, members, ...}
+- R = repo row {repo_id, repo_type, author, likes, downloads, repo_url, ...}
+- A = activity row {event_type, repo_id, repo_type, timestamp, ...}
+- S = user summary {username, overview, followers?, following?, likes?, activity?}
+
+Direct edges / helpers
+- U -followers-> U  => hf_user_graph(relation="followers")
+- U -following-> U  => hf_user_graph(relation="following")
+- U -likes-> R      => hf_user_likes(username=...)
+- R -liked_by-> U   => hf_repo_likers(repo_id=..., repo_type=...)
+- O -members-> U    => hf_org_members(organization=...)
+- O -repos-> R      => hf_repo_search(author="<org>", repo_types=[...])
+- U/O -activity-> A => hf_recent_activity(feed_type=..., entity=...)
+- R -details-> R    => hf_repo_details(...)
+- U -summary-> S    => hf_user_summary(...)
+- O -overview-> O   => hf_org_overview(...)
+```
+
+Rules:
+- Prefer the helper that already matches the requested edge direction.
+- Do not reverse a relation indirectly if a direct helper exists.
+- If you already know an author/org and need repos, go straight to `hf_repo_search(author=...)`.
+- Read profile/social/link fields from `hf_user_summary(...)["item"]["overview"]`, not from graph rows.
+- Use canonical row fields in generated code: user rows use `username`, repo rows use `repo_id`/`repo_type`, activity rows use `event_type`/`repo_id`/`repo_type`/`timestamp`.
 
 ## Helper API
 ```py
@@ -95,6 +130,16 @@ await hf_user_graph(
     fields: list[str] | None = None,
 )
 
+await hf_repo_likers(
+    repo_id: str,
+    repo_type: str,  # model|dataset|space
+    return_limit: int | None = None,
+    count_only: bool = False,
+    pro_only: bool | None = None,
+    where: dict | None = None,
+    fields: list[str] | None = None,
+)
+
 await hf_user_likes(
     username: str | None = None,  # None => current authenticated user
     repo_types: list[str] | None = None,
@@ -130,6 +175,28 @@ await hf_whoami()
 await call_api(endpoint: str, params: dict | None = None, method: str = "GET", json_body: dict | None = None)
 ```
 
+## Important nested result contracts
+### `hf_user_summary(...)`
+`hf_user_summary(...)` returns the normal helper envelope. The main payload is in `summary["item"]`.
+
+```py
+summary["item"] == {
+    "username": str,
+    "overview": dict,  # profile + socials + counts
+    "followers": {"count": int | None, "sample": list[dict]} | None,
+    "following": {"count": int | None, "sample": list[dict]} | None,
+    "likes": {"count": int | None, "sample": list[dict]} | None,
+    "activity": {"count": int | None, "sample": list[dict]} | None,
+}
+```
+
+Read profile/social fields from `summary["item"]["overview"]`, commonly:
+- `websiteUrl`
+- `twitter`, `github`, `linkedin`, `bluesky`
+- `twitterHandle`, `githubHandle`, `linkedinHandle`, `blueskyHandle`
+- `followers`, `following`, `likes`
+- `isPro`
+
 ## Common repo fields
 Search/detail/trending repo rows commonly include:
 - `repo_id`
@@ -151,21 +218,96 @@ Type-specific fields may also be present by default when available, such as:
 - dataset: `description`, `paperswithcode_id`
 - space: `sdk`, `models`, `datasets`, `subdomain`
 
+## Common user overview fields
+`hf_user_summary(...)["item"]["overview"]` commonly includes:
+- `username`
+- `fullname`
+- `bio`
+- `avatarUrl`
+- `websiteUrl`
+- `twitter`
+- `github`
+- `linkedin`
+- `bluesky`
+- `twitterHandle`
+- `githubHandle`
+- `linkedinHandle`
+- `blueskyHandle`
+- `followers`
+- `following`
+- `likes`
+- `models`
+- `datasets`
+- `spaces`
+- `discussions`
+- `papers`
+- `upvotes`
+- `orgs`
+- `isPro`
+
+## Primary navigation paths
+Choose the helper based on the **subject of the question** and the **smallest helper that already contains the needed fields**.
+
+### Repo-centric questions
+- Exact repo by id (`owner/name`) → `hf_repo_details(...)`
+- Search/discovery/list/top repos → `hf_repo_search(...)`
+- True trending requests → `hf_trending(...)`
+- Repo discussions → `hf_repo_discussions(...)`
+- Users who liked a specific repo / liker filtering / liker counts → `hf_repo_likers(...)`
+
+### User-centric questions
+- Profile / overview / "tell me about user X" → `hf_user_summary(...)`
+- Followers / following / graph sampling → `hf_user_graph(...)`
+- Repos a user liked → `hf_user_likes(...)`
+- Recent actions / feed questions → `hf_recent_activity(...)`
+
+### Organization-centric questions
+- Organization details / counts → `hf_org_overview(...)`
+- Organization members → `hf_org_members(...)`
+- Organization repos → `hf_repo_search(author="<org>", repo_types=["model", "dataset", "space"])`
+
+### Relationship direction matters
+- `hf_user_likes(...)` = **user → repos**
+- `hf_repo_likers(...)` = **repo → users**
+- `hf_user_graph(...)` = **user/org → followers/following**
+
+Pick the helper that already matches the direction of the question instead of trying to reconstruct the relation indirectly.
+
+## Efficiency rules
+- Prefer **one helper call with local aggregation/filtering** over **many follow-up detail calls**.
+- If the current helper already exposes the needed field, use it directly instead of hydrating each row with another helper.
+- Use `hf_user_summary(username=...)` per user only when you truly need profile/social fields that are not already present in the current row set.
+- For overlap/comparison/ranking tasks, fetch a broad enough working set first, then compute locally in generated code.
+- Keep final displayed results compact, but do not artificially shrink intermediate helper coverage unless the user explicitly asked for a sample.
+
+### Important anti-patterns
+- For repo liker questions, do **not** call `hf_user_summary(...)` for each liker just to determine `isPro` or `type`. `hf_repo_likers(...)` already returns `username`, `fullname`, `type`, and `isPro`.
+- For follower/following questions, do **not** use `hf_user_summary(...)` to reconstruct the graph. Use `hf_user_graph(...)`.
+- For "repos by author/org" questions, do **not** search semantically first if the author is already known. Start with `hf_repo_search(author=..., ...)`.
+- For "most popular repo a user liked" questions, do **not** fetch recent likes and manually re-rank them. Use `hf_user_likes(..., sort="repoLikes" | "repoDownloads")`.
+
 ## Usage guidance
 - Use `hf_repo_search(...)` for find/search/top requests. Prefer dedicated args like `author=` over using `where` when a first-class helper argument exists.
 - `hf_repo_search(...)` defaults to `repo_type="model"` when no repo type is specified. For prompts like "what repos does <author/org> have" or "list everything published by <author/org>", search across `repo_types=["model", "dataset", "space"]` unless the user explicitly asked for one type.
 - Use `hf_repo_details(repo_type="auto", ...)` for `owner/name` detail lookups unless the type is explicit.
 - Use `hf_trending(...)` only for true trending requests.
 - `hf_trending(...)` does not accept extra filters like tag/author/task. For trending + extra filters, either ask a brief clarification or clearly label an approximation using `hf_repo_search(sort="trending_score", ...)`.
+- Use `hf_user_summary(...)` for common "tell me about user X" prompts. It returns a fixed structured object (no `fields=` projection) with overview data and optional sampled followers/following/likes/activity sections. Read profile and social-link fields such as `websiteUrl`, `twitter`, `github`, `linkedin`, and `bluesky` from `summary["item"]["overview"]`.
+- For "my/me" prompts, prefer current-user forms first: `hf_user_summary(username=None)`, `hf_user_graph(username=None, ...)`, and `hf_user_likes(username=None, ...)`. Use `hf_whoami()` when you need the resolved username explicitly.
 - Use `hf_org_overview(...)` for organization details like display name, followers, and member count.
 - Use `hf_org_members(...)` for organization member lists and counts. Member rows use `username`, `fullname`, `isPro`, and `role`; common aliases like `login`, `name`, and `is_pro` are tolerated in `fields=[...]`.
 - Use `hf_user_graph(...)` for follower/following lists, counts, and filtered graph samples. Prefer `relation=` over trying undocumented helper names.
+- Use `hf_repo_likers(...)` for "who liked this repo?" prompts. It returns liker rows for a specific model, dataset, or space; pass `repo_type` explicitly.
 - For overlap/comparison/ranking tasks over followers, org members, likes, or activity, do not use small manual `return_limit` values like 10/20/50 unless the user explicitly asked for a sample. Use the helper default or a clearly high bound for the intermediate analysis, then keep only the final displayed result compact.
+- For follower/member social-link lookups, first fetch usernames with `hf_user_graph(...)` or `hf_org_members(...)`, then fetch each user's profile/social data with `hf_user_summary(username=...)`. If the follower/member set is large and the user did not specify a cap, ask for a limit or clearly indicate that the result may be partial because each user requires additional calls.
 - Use `hf_user_likes(...)` for liked-repo prompts. Prefer helper-side filtering and ranking over model-side post-processing; for popularity requests use `sort="repoLikes"` or `sort="repoDownloads"` with a bounded `ranking_window`.
 - For prompts like "most popular repository a user liked recently", call `hf_user_likes(username=..., sort="repoLikes", ranking_window=40, return_limit=1)` directly. Do not fetch default recent likes and manually re-rank them.
 - `hf_user_likes(...)` rows include liked timestamp plus repo identifiers and popularity fields. Prefer fields like `repo_id`, `repo_type`, `repo_author`, `likes`, `downloads`, and `repo_url` when you want repo-shaped output.
 - `hf_user_graph(...)` rows use `username`, `fullname`, and `isPro`. Common aliases like `login`→`username`, `name`→`fullname`, and `is_pro`→`isPro` are tolerated when used in `fields=[...]`, but prefer the canonical names in generated code.
+- `hf_repo_likers(...)` rows use `username`, `fullname`, `type`, and `isPro`. Common aliases like `login`→`username`, `name`→`fullname`, `is_pro`→`isPro`, and `entity_type`→`type` are tolerated when used in `fields=[...]`, but prefer the canonical names in generated code.
+- `hf_repo_likers(...)` is a one-shot liker list helper, not a cursor feed. Use it for liker list/count/filter questions, not for recency/activity questions.
+- For liker count/breakdown questions, prefer `hf_repo_likers(..., count_only=True, where=...)` or a single broad `hf_repo_likers(...)` call with local aggregation. Do not hydrate each liker with `hf_user_summary(...)` just to recover `isPro` or `type`.
+- `hf_repo_likers(...)` does not use the generic exhaustive hard cap for explicit larger `return_limit` values because the Hub already returns the full liker rows in one response. The default output is still compact unless you ask for more.
 - `hf_user_graph(...)` also accepts organization names for `relation="followers"`. For organizations, follower rows use the same canonical user fields (`username`, `fullname`, `isPro`). Organization `following` is not supported by the Hub API, so do not ask `hf_user_graph(..., relation="following")` for an organization.
 - Use `hf_recent_activity(...)` for activity-feed prompts. Prefer `feed_type` + `entity` rather than raw `call_api("/api/recent-activity", ...)`.
 - `hf_recent_activity(...)` rows can be projected with `event_type`, `repo_id`, `repo_type`, and `timestamp` aliases when you want snake_case output.
@@ -226,6 +368,27 @@ return {
     "latest_likes": item["likes"]["sample"],
 }
 
+# Followers' GitHub links: fetch usernames first, then read overview socials
+followers = await hf_user_graph(
+    relation="followers",
+    return_limit=20,
+    fields=["username"],
+)
+result = []
+for row in followers["items"]:
+    uname = row.get("username")
+    if not uname:
+        continue
+    summary = await hf_user_summary(username=uname)
+    item = summary["item"] or (summary["items"][0] if summary["items"] else None)
+    if item is None:
+        continue
+    overview = item.get("overview", {})
+    github = overview.get("github")
+    if github is not None:
+        result.append({"username": uname, "github": github})
+return result
+
 # Popularity-ranked likes: helper-side shortlist enrichment + ranking
 likes = await hf_user_likes(
     username="julien-c",
@@ -250,6 +413,41 @@ return {
     },
 }
 
+# Repo likers: user-shaped liker rows for a specific repo
+likers = await hf_repo_likers(
+    repo_id="mteb/leaderboard",
+    repo_type="space",
+    where={"type": "organization"},
+    fields=["username", "type", "isPro"],
+)
+return {
+    "repo_id": "mteb/leaderboard",
+    "repo_type": "space",
+    "organization_likers": likers["items"],
+    "meta": likers["meta"],
+}
+
+# Exact liker breakdown using helper-side counting
+pro = await hf_repo_likers(
+    repo_id="openai/gpt-oss-120b",
+    repo_type="model",
+    where={"isPro": True},
+    count_only=True,
+)
+normal = await hf_repo_likers(
+    repo_id="openai/gpt-oss-120b",
+    repo_type="model",
+    where={"isPro": False},
+    count_only=True,
+)
+return {
+    "repo_id": "openai/gpt-oss-120b",
+    "repo_type": "model",
+    "pro_likers": pro["meta"]["total"],
+    "normal_likers": normal["meta"]["total"],
+    "exact_count": bool(pro["meta"].get("exact_count") and normal["meta"].get("exact_count")),
+}
+
 # Recent activity with snake_case aliases
 activity = await hf_recent_activity(
     feed_type="user",
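The helper envelope contract and the "one call + local aggregation" efficiency rule described in the diff above can be sketched offline. The envelope and rows below are mock data shaped like the documented `{ok, item, items, meta, error}` contract, not real helper output; the `pro_breakdown` function is an illustrative name, not part of the tool.

```python
# Sketch: aggregate a liker breakdown locally from one envelope-shaped
# response, instead of hydrating each row with another helper call.
# The envelope and its rows are mock data following the documented contract.

def pro_breakdown(envelope: dict) -> dict:
    """Count Pro vs non-Pro rows in a helper envelope's `items`."""
    if not envelope["ok"]:
        raise ValueError(envelope["error"])
    rows = envelope["items"]
    pro = sum(1 for r in rows if r.get("isPro"))
    return {
        "pro": pro,
        "normal": len(rows) - pro,
        # Coverage comes from helper-owned meta, not from list length.
        "complete": not envelope["meta"].get("more_available", False),
    }

mock = {
    "ok": True,
    "item": None,
    "items": [
        {"username": "alice", "fullname": "Alice", "isPro": True},
        {"username": "bob", "fullname": "Bob", "isPro": False},
        {"username": "carol", "fullname": "Carol", "isPro": True},
    ],
    "meta": {"returned": 3, "more_available": False},
    "error": None,
}

print(pro_breakdown(mock))  # {'pro': 2, 'normal': 1, 'complete': True}
```

In the real tool the same shape comes back from `hf_repo_likers(...)`, where `count_only=True` pushes this counting helper-side instead.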
hf-hub-query.md
CHANGED
@@ -4,7 +4,7 @@ name: hf_hub_query
 model: hf.openai/gpt-oss-120b:cerebras
 use_history: false
 default: true
-description: "
+description: "Active natural-language Hugging Face Hub navigator, raw structured-output variant. Read-only, multi-step agent that can chain lookups across users, organizations, and repositories (models, datasets, spaces), plus followers/following, likes/likers, recent activity, discussions, and collections. Good for search, filtering, counts, ranking, overlap/intersection, joins, and relationship questions. Returns structured result data with runtime metadata instead of a rewritten prose answer."
 shell: false
 skills: []
 function_tools:
@@ -15,7 +15,7 @@ request_params:
 
 reasoning: high
 
-You are a **tool-using, read-only** Hugging Face Hub search/navigation agent
+You are a **tool-using, read-only** Hugging Face Hub search/navigation agent.
 The user must never see your generated Python unless they explicitly ask for debugging.
 
 ## Mandatory first action
@@ -25,7 +25,7 @@ The user must never see your generated Python unless they explicitly ask for debugging.
 - Never paste `async def solve(...)` into normal assistant text.
 - Only skip the tool call if a brief clarification question is strictly required.
 
-##
+## Tool-call protocol
 1. Read the user request.
 2. Build an inner program in exactly this shape:
 ```py
monty_api_tool_v2.py
CHANGED
|
@@ -93,6 +93,25 @@ _SORT_KEY_ALIASES: dict[str, str] = {
|
|
| 93 |
"trending": "trending_score",
|
| 94 |
}
|
| 95 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 96 |
# Extra hf_repo_search kwargs intentionally supported as pass-through to
|
| 97 |
# huggingface_hub.HfApi.list_models/list_datasets/list_spaces.
|
| 98 |
# (Generic args like `query/search/sort/author/limit` are handled directly in
|
|
@@ -191,6 +210,7 @@ PAGINATION_POLICY: dict[str, dict[str, Any]] = {
|
|
| 191 |
"hf_user_followers": {"mode": "exhaustive", "scan_max": FOLLOWERS_SCAN_MAX, "default_return": 1_000},
|
| 192 |
"hf_user_following": {"mode": "exhaustive", "scan_max": FOLLOWERS_SCAN_MAX, "default_return": 1_000},
|
| 193 |
"hf_org_members": {"mode": "exhaustive", "scan_max": FOLLOWERS_SCAN_MAX, "default_return": 1_000},
|
|
|
|
| 194 |
"hf_user_likes": {
|
| 195 |
"mode": "exhaustive",
|
| 196 |
"scan_max": LIKES_SCAN_MAX,
|
|
@@ -217,6 +237,7 @@ HELPER_EXTERNALS = [
|
|
| 217 |
"hf_repo_search",
|
| 218 |
"hf_user_summary",
|
| 219 |
"hf_user_graph",
|
|
|
|
| 220 |
"hf_user_likes",
|
| 221 |
"hf_recent_activity",
|
| 222 |
"hf_repo_discussions",
|
|
@@ -243,6 +264,7 @@ ALLOWLIST_PATTERNS = [
|
|
| 243 |
r"^/api/users/[^/]+/followers$",
|
| 244 |
r"^/api/users/[^/]+/following$",
|
| 245 |
r"^/api/users/[^/]+/likes$",
|
|
|
|
| 246 |
r"^/api/organizations/[^/]+/overview$",
|
| 247 |
r"^/api/organizations/[^/]+/members$",
|
| 248 |
r"^/api/organizations/[^/]+/followers$",
|
|
@@ -260,6 +282,7 @@ STRICT_ALLOWLIST_PATTERNS = [
|
|
| 260 |
r"^/api/whoami-v2$",
|
| 261 |
r"^/api/trending$",
|
| 262 |
r"^/api/daily_papers$",
|
|
|
|
| 263 |
r"^/api/collections$",
|
| 264 |
r"^/api/collections/[^/]+$",
|
| 265 |
r"^/api/collections/[^/]+/[^/]+$",
|
|
@@ -956,40 +979,107 @@ async def _run_with_monty(
| 956 |     return "More results may exist; narrow filters or raise scan/page bounds for better coverage"
| 957 |     return "Ask for a larger limit to see more rows"
| 958 |
| 959 | - def
| 960 |     *,
| 961 | -
| 962 | -
| 963 | -
| 964 | -
| 965 | -
| 966 | ) -> dict[str, Any]:
| 967 | -
| 968 | -
| 969 | -
| 970 | -
| 971 | -     "
| 972 | -
| 973 | -
| 974 | -
| 975 | -
| 976 | }
| 977 |
| 978 | - def
| 979 |     *,
| 980 |     start_calls: int,
| 981 |     source: str,
| 982 |     items: list[dict[str, Any]],
| 983 | -     meta: dict[str, Any],
| 984 |     cursor: str | None = None,
| 985 | ) -> dict[str, Any]:
| 986 |     if cursor is not None:
| 987 | -
| 988 |     return {
| 989 |         "ok": True,
| 990 |         "item": items[0] if len(items) == 1 else None,
| 991 |         "items": items,
| 992 | -         "meta": _helper_meta(start_calls, source=source, **
| 993 |         "error": None,
| 994 |     }
| 995 |
@@ -1341,19 +1431,18 @@ async def _run_with_monty(
| 1341 |
| 1342 |     default_return = _policy_int("hf_org_members", "default_return", 100)
| 1343 |     scan_cap = _policy_int("hf_org_members", "scan_max", FOLLOWERS_SCAN_MAX)
| 1344 | -
| 1345 | -
| 1346 | -
| 1347 | -
| 1348 | -
| 1349 | -
| 1350 | -
| 1351 | -
| 1352 | -         maximum=MAX_EXHAUSTIVE_RETURN_ITEMS,
| 1353 |     )
| 1354 | -
| 1355 | -
| 1356 | -     hard_cap_applied =
| 1357 |     has_where = isinstance(where, dict) and bool(where)
| 1358 |
| 1359 |     overview_total: int | None = None

@@ -1369,37 +1458,25 @@ async def _run_with_monty(
| 1369 |     sample_complete = overview_total == 0
| 1370 |     more_available = False if sample_complete else True
| 1371 |     truncated_by = _derive_truncated_by(return_limit_hit=overview_total > 0)
| 1372 | -     meta =
| 1373 | -
| 1374 | -
| 1375 | -
| 1376 | -
| 1377 | -
| 1378 | -
| 1379 | -
| 1380 | -
| 1381 | -
| 1382 | -
| 1383 | -
| 1384 | -
| 1385 | -
| 1386 | -
| 1387 | -
| 1388 | -
| 1389 | -
| 1390 | -
| 1391 | -             applied_scan_limit=scan_lim,
| 1392 | -         ),
| 1393 | -         "organization": org,
| 1394 | -     }
| 1395 | -     meta.update(_derive_limit_metadata(
| 1396 | -         requested_return_limit=requested_return_limit,
| 1397 | -         applied_return_limit=ret_lim,
| 1398 | -         default_limit_used=default_limit_used,
| 1399 | -         requested_scan_limit=requested_scan_limit,
| 1400 | -         applied_scan_limit=scan_lim,
| 1401 | -     ))
| 1402 | -     return _helper_success_meta(start_calls=start_calls, source=overview_source, items=[], meta=meta)
| 1403 |
| 1404 |     endpoint = f"/api/organizations/{org}/members"
| 1405 |     try:
@@ -1457,50 +1534,28 @@ async def _run_with_monty(
| 1457 |     items = _project_items(
| 1458 |         items,
| 1459 |         fields,
| 1460 | -         aliases=
| 1461 | -
| 1462 | -
| 1463 | -
| 1464 | -             "
| 1465 | -             "
| 1466 | -             "
| 1467 | -             "
| 1468 | -             "
| 1469 | -             "
| 1470 |         },
| 1471 |     )
| 1472 | -
| 1473 | -         "scanned": observed_total,
| 1474 | -         "matched": len(normalized),
| 1475 | -         "returned": len(items),
| 1476 | -         "total": total,
| 1477 | -         "total_available": total_available,
| 1478 | -         "total_matched": total_matched,
| 1479 | -         "truncated": truncated,
| 1480 | -         "complete": sample_complete,
| 1481 | -         "exact_count": exact_count,
| 1482 | -         "count_source": count_source,
| 1483 | -         "sample_complete": sample_complete,
| 1484 | -         "lower_bound": bool(has_where and not exact_count),
| 1485 | -         "more_available": more_available,
| 1486 | -         "can_request_more": _derive_can_request_more(sample_complete=sample_complete, truncated_by=truncated_by),
| 1487 | -         "truncated_by": truncated_by,
| 1488 | -         "next_request_hint": _derive_next_request_hint(
| 1489 | -             truncated_by=truncated_by,
| 1490 | -             more_available=more_available,
| 1491 | -             applied_return_limit=ret_lim,
| 1492 | -             applied_scan_limit=scan_lim,
| 1493 | -         ),
| 1494 | -         "organization": org,
| 1495 | -     }
| 1496 | -     meta.update(_derive_limit_metadata(
| 1497 | -         requested_return_limit=requested_return_limit,
| 1498 | -         applied_return_limit=ret_lim,
| 1499 | -         default_limit_used=default_limit_used,
| 1500 | -         requested_scan_limit=requested_scan_limit,
| 1501 | -         applied_scan_limit=scan_lim,
| 1502 | -     ))
| 1503 | -     return _helper_success_meta(start_calls=start_calls, source=endpoint, items=items, meta=meta)
| 1504 |
| 1505 | async def hf_repo_search(
| 1506 |     query: str | None = None,
@@ -1692,19 +1747,18 @@ async def _run_with_monty(
| 1692 |     if not u:
| 1693 |         return _helper_error(start_calls=start_calls, source=f"/api/users/<u>/{kind}", error="username is required")
| 1694 |
| 1695 | -
| 1696 | -
| 1697 | -
| 1698 | -
| 1699 | -
| 1700 | -
| 1701 | -
| 1702 | -
| 1703 | -         maximum=MAX_EXHAUSTIVE_RETURN_ITEMS,
| 1704 |     )
| 1705 | -
| 1706 | -
| 1707 | -     hard_cap_applied =
| 1708 |     has_where = isinstance(where, dict) and bool(where)
| 1709 |     filtered = (pro_only is not None) or has_where
| 1710 |

@@ -1740,44 +1794,32 @@ async def _run_with_monty(
| 1740 |     sample_complete = overview_total == 0
| 1741 |     more_available = False if sample_complete else True
| 1742 |     truncated_by = _derive_truncated_by(return_limit_hit=overview_total > 0)
| 1743 | -     meta =
| 1744 | -
| 1745 | -
| 1746 | -
| 1747 | -
| 1748 | -
| 1749 | -
| 1750 | -
| 1751 | -
| 1752 | -
| 1753 | -
| 1754 | -
| 1755 | -
| 1756 | -
| 1757 | -
| 1758 | -
| 1759 | -
| 1760 | -
| 1761 | -
| 1762 | -
| 1763 | -
| 1764 | -
| 1765 | -
| 1766 | -         "where_applied": has_where,
| 1767 | -         "entity": u,
| 1768 | -         "entity_type": entity_type,
| 1769 | -         "username": u,
| 1770 | -     }
| 1771 |     if entity_type == "organization":
| 1772 |         meta["organization"] = u
| 1773 | -
| 1774 | -         requested_return_limit=requested_return_limit,
| 1775 | -         applied_return_limit=ret_lim,
| 1776 | -         default_limit_used=default_limit_used,
| 1777 | -         requested_scan_limit=requested_scan_limit,
| 1778 | -         applied_scan_limit=scan_lim,
| 1779 | -     ))
| 1780 | -     return _helper_success_meta(
| 1781 |         start_calls=start_calls,
| 1782 |         source=overview_source,
| 1783 |         items=[],
@@ -1858,57 +1900,35 @@ async def _run_with_monty(
| 1858 |     items = _project_items(
| 1859 |         items,
| 1860 |         fields,
| 1861 | -         aliases=
| 1862 | -
| 1863 | -
| 1864 | -
| 1865 | -             "
| 1866 | -             "
| 1867 | -             "
| 1868 | -             "
| 1869 | -             "
| 1870 | -             "
| 1871 |         },
| 1872 |     )
| 1873 | -     meta = {
| 1874 | -         "scanned": observed_total,
| 1875 | -         "matched": len(normalized),
| 1876 | -         "returned": len(items),
| 1877 | -         "total": total,
| 1878 | -         "total_available": total_available,
| 1879 | -         "total_matched": total_matched,
| 1880 | -         "truncated": truncated,
| 1881 | -         "complete": sample_complete,
| 1882 | -         "exact_count": exact_count,
| 1883 | -         "count_source": count_source,
| 1884 | -         "sample_complete": sample_complete,
| 1885 | -         "lower_bound": bool(filtered and not exact_count),
| 1886 | -         "more_available": more_available,
| 1887 | -         "can_request_more": _derive_can_request_more(sample_complete=sample_complete, truncated_by=truncated_by),
| 1888 | -         "truncated_by": truncated_by,
| 1889 | -         "next_request_hint": _derive_next_request_hint(
| 1890 | -             truncated_by=truncated_by,
| 1891 | -             more_available=more_available,
| 1892 | -             applied_return_limit=ret_lim,
| 1893 | -             applied_scan_limit=scan_lim,
| 1894 | -         ),
| 1895 | -         "relation": kind,
| 1896 | -         "pro_only": pro_only,
| 1897 | -         "where_applied": has_where,
| 1898 | -         "entity": u,
| 1899 | -         "entity_type": entity_type,
| 1900 | -         "username": u,
| 1901 | -     }
| 1902 |     if entity_type == "organization":
| 1903 |         meta["organization"] = u
| 1904 | -
| 1905 | -         requested_return_limit=requested_return_limit,
| 1906 | -         applied_return_limit=ret_lim,
| 1907 | -         default_limit_used=default_limit_used,
| 1908 | -         requested_scan_limit=requested_scan_limit,
| 1909 | -         applied_scan_limit=scan_lim,
| 1910 | -     ))
| 1911 | -     return _helper_success_meta(
| 1912 |         start_calls=start_calls,
| 1913 |         source=endpoint,
| 1914 |         items=items,
@@ -2164,19 +2184,18 @@ async def _run_with_monty(
| 2164 |         error="sort must be one of likedAt, repoLikes, repoDownloads",
| 2165 |     )
| 2166 |
| 2167 | -
| 2168 | -
| 2169 | -
| 2170 | -
| 2171 | -
| 2172 | -
| 2173 | -
| 2174 | -
| 2175 | -         maximum=MAX_EXHAUSTIVE_RETURN_ITEMS,
| 2176 |     )
| 2177 | -
| 2178 | -
| 2179 | -     hard_cap_applied =
| 2180 |
| 2181 |     allowed_repo_types: set[str] | None = None
| 2182 |     try:

@@ -2344,43 +2363,166 @@ async def _run_with_monty(
| 2344 |     if scan_limit_hit:
| 2345 |         more_available = "unknown" if (allowed_repo_types is not None or where) else True
| 2346 |
| 2347 | -     meta =
| 2348 | -
| 2349 | -
| 2350 | -
| 2351 | -
| 2352 | -
| 2353 | -
| 2354 | -
| 2355 | -
| 2356 | -
| 2357 | -
| 2358 | -
| 2359 | -
| 2360 | -
| 2361 | -
| 2362 | -
| 2363 | -
| 2364 | -
| 2365 | -
| 2366 | -
| 2367 | -
| 2368 | -
| 2369 | -
| 2370 | -
| 2371 | -
| 2372 | -
| 2373 | -
| 2374 | -
| 2375 |     }
| 2376 | -
| 2377 | -
| 2378 | -
| 2379 | -
| 2380 | -
| 2381 | -
| 2382 | -
| 2383 | -
| 2384 |         start_calls=start_calls,
| 2385 |         source=endpoint,
| 2386 |         items=items,
@@ -2421,9 +2563,7 @@ async def _run_with_monty(
| 2421 |     if start_cursor is None:
| 2422 |         start_cursor = startCursor or cursor
| 2423 |
| 2424 | -     requested_return_limit = _resolve_requested_limit(return_limit, limit)
| 2425 |     requested_max_pages = max_pages
| 2426 | -     effective_requested_return_limit = 0 if count_only else requested_return_limit
| 2427 |
| 2428 |     if isinstance(username, str) and username.strip():
| 2429 |         entity = username.strip()

@@ -2448,16 +2588,17 @@ async def _run_with_monty(
| 2448 |     if not ent:
| 2449 |         return _helper_error(start_calls=start_calls, source="/api/recent-activity", error="entity is required")
| 2450 |
| 2451 | -
| 2452 | -
| 2453 | -
| 2454 | -
| 2455 | -
| 2456 |     )
| 2457 |     page_lim = page_cap
| 2458 |     pages_lim = _clamp_int(requested_max_pages, default=pages_cap, minimum=1, maximum=pages_cap)
| 2459 | -
| 2460 | -     hard_cap_applied = requested_return_limit is not None and ret_lim < requested_return_limit
| 2461 |
| 2462 |     type_filter = {str(t).strip().lower() for t in (activity_types or []) if str(t).strip()}
| 2463 |     repo_filter = {_canonical_repo_type(t, default="") for t in (repo_types or []) if str(t).strip()}
@@ -2562,41 +2703,31 @@ async def _run_with_monty(
| 2562 |     elif stopped_for_budget and not exact_count:
| 2563 |         more_available = "unknown"
| 2564 |
| 2565 | -     meta =
| 2566 | -
| 2567 | -
| 2568 | -
| 2569 | -
| 2570 | -
| 2571 | -
| 2572 | -
| 2573 | -
| 2574 | -
| 2575 | -
| 2576 | -
| 2577 | -
| 2578 | -
| 2579 | -
| 2580 | -
| 2581 | -
| 2582 | -
| 2583 | -
| 2584 | -
| 2585 | -
| 2586 | -         ),
| 2587 | -         "page_limit": page_lim,
| 2588 | -         "stopped_for_budget": stopped_for_budget,
| 2589 | -         "feed_type": ft,
| 2590 | -         "entity": ent,
| 2591 | -     }
| 2592 | -     meta.update(_derive_limit_metadata(
| 2593 | -         requested_return_limit=requested_return_limit,
| 2594 | -         applied_return_limit=ret_lim,
| 2595 | -         default_limit_used=default_limit_used,
| 2596 |         requested_max_pages=requested_max_pages,
| 2597 |         applied_max_pages=pages_lim,
| 2598 | -     )
| 2599 | -     return
| 2600 |         start_calls=start_calls,
| 2601 |         source="/api/recent-activity",
| 2602 |         items=items,

@@ -3009,6 +3140,7 @@ async def _run_with_monty(
| 3009 |     "hf_repo_search": _collecting_wrapper("hf_repo_search", hf_repo_search),
| 3010 |     "hf_user_summary": _collecting_wrapper("hf_user_summary", hf_user_summary),
| 3011 |     "hf_user_graph": _collecting_wrapper("hf_user_graph", hf_user_graph),
| 3012 |     "hf_user_likes": _collecting_wrapper("hf_user_likes", hf_user_likes),
| 3013 |     "hf_recent_activity": _collecting_wrapper("hf_recent_activity", hf_recent_activity),
| 3014 |     "hf_repo_discussions": _collecting_wrapper("hf_repo_discussions", hf_repo_discussions),
|
|
| 93 |
"trending": "trending_score",
|
| 94 |
}
|
| 95 |
|
| 96 |
+
_USER_FIELD_ALIASES: dict[str, str] = {
|
| 97 |
+
"login": "username",
|
| 98 |
+
"user": "username",
|
| 99 |
+
"handle": "username",
|
| 100 |
+
"name": "fullname",
|
| 101 |
+
"full_name": "fullname",
|
| 102 |
+
"full-name": "fullname",
|
| 103 |
+
"is_pro": "isPro",
|
| 104 |
+
"ispro": "isPro",
|
| 105 |
+
"pro": "isPro",
|
| 106 |
+
}
|
| 107 |
+
|
| 108 |
+
_ACTOR_FIELD_ALIASES: dict[str, str] = {
|
| 109 |
+
**_USER_FIELD_ALIASES,
|
| 110 |
+
"entity_type": "type",
|
| 111 |
+
"user_type": "type",
|
| 112 |
+
"actor_type": "type",
|
| 113 |
+
}
|
| 114 |
+
|
| 115 |
# Extra hf_repo_search kwargs intentionally supported as pass-through to
|
| 116 |
# huggingface_hub.HfApi.list_models/list_datasets/list_spaces.
|
| 117 |
# (Generic args like `query/search/sort/author/limit` are handled directly in
|
|
|
|
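The alias maps above let callers spell fields loosely (`login`, `pro`) while rows keep canonical keys. Below is a minimal sketch of how such a map is typically applied during field projection; `_project_items` itself is not shown in this hunk, so the `project` function and its choice to keep the caller's spelling in the output are assumptions, not the tool's actual behavior.

```python
# Hypothetical sketch: user-facing field names are canonicalized via an alias
# map before being looked up in the row.
USER_FIELD_ALIASES = {
    "login": "username",
    "name": "fullname",
    "pro": "isPro",
}

def project(row: dict, fields: list[str], aliases: dict[str, str]) -> dict:
    out = {}
    for f in fields:
        key = aliases.get(f, f)  # canonicalize the requested field name
        if key in row:
            out[f] = row[key]  # keep the caller's spelling in the output
    return out

row = {"username": "alice", "fullname": "Alice A.", "isPro": True}
print(project(row, ["login", "pro"], USER_FIELD_ALIASES))
# → {'login': 'alice', 'pro': True}
```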
| 210 |     "hf_user_followers": {"mode": "exhaustive", "scan_max": FOLLOWERS_SCAN_MAX, "default_return": 1_000},
| 211 |     "hf_user_following": {"mode": "exhaustive", "scan_max": FOLLOWERS_SCAN_MAX, "default_return": 1_000},
| 212 |     "hf_org_members": {"mode": "exhaustive", "scan_max": FOLLOWERS_SCAN_MAX, "default_return": 1_000},
| 213 | +     "hf_repo_likers": {"mode": "exhaustive", "default_return": 1_000},
| 214 |     "hf_user_likes": {
| 215 |         "mode": "exhaustive",
| 216 |         "scan_max": LIKES_SCAN_MAX,

| 237 |     "hf_repo_search",
| 238 |     "hf_user_summary",
| 239 |     "hf_user_graph",
| 240 | +     "hf_repo_likers",
| 241 |     "hf_user_likes",
| 242 |     "hf_recent_activity",
| 243 |     "hf_repo_discussions",

| 264 |     r"^/api/users/[^/]+/followers$",
| 265 |     r"^/api/users/[^/]+/following$",
| 266 |     r"^/api/users/[^/]+/likes$",
| 267 | +     r"^/api/(models|datasets|spaces)/(?:[^/]+|[^/]+/[^/]+)/likers$",
| 268 |     r"^/api/organizations/[^/]+/overview$",
| 269 |     r"^/api/organizations/[^/]+/members$",
| 270 |     r"^/api/organizations/[^/]+/followers$",

| 282 |     r"^/api/whoami-v2$",
| 283 |     r"^/api/trending$",
| 284 |     r"^/api/daily_papers$",
| 285 | +     r"^/api/(models|datasets|spaces)/(?:[^/]+|[^/]+/[^/]+)/likers$",
| 286 |     r"^/api/collections$",
| 287 |     r"^/api/collections/[^/]+$",
| 288 |     r"^/api/collections/[^/]+/[^/]+$",
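The new `/likers` allowlist pattern accepts both unnamespaced ids and `owner/name` ids, for models, datasets, and spaces alike. A quick check of what the regex does and does not match (the example paths are illustrative):

```python
import re

# Same pattern as the new allowlist entry: the non-capturing group accepts
# either one path segment or an owner/name pair before /likers.
LIKERS_RE = re.compile(r"^/api/(models|datasets|spaces)/(?:[^/]+|[^/]+/[^/]+)/likers$")

print(bool(LIKERS_RE.match("/api/models/gpt2/likers")))                 # unnamespaced id → True
print(bool(LIKERS_RE.match("/api/datasets/user/some-dataset/likers")))  # owner/name id → True
print(bool(LIKERS_RE.match("/api/models/a/b/c/likers")))                # three segments → False
```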
| 979 |     return "More results may exist; narrow filters or raise scan/page bounds for better coverage"
| 980 |     return "Ask for a larger limit to see more rows"
| 981 |
| 982 | + def _resolve_exhaustive_limits(
| 983 |     *,
| 984 | +     return_limit: int | None,
| 985 | +     limit: int | None,
| 986 | +     count_only: bool,
| 987 | +     default_return: int,
| 988 | +     max_return: int,
| 989 | +     scan_limit: int | None = None,
| 990 | +     scan_cap: int | None = None,
| 991 | ) -> dict[str, Any]:
| 992 | +     requested_return_limit = _resolve_requested_limit(return_limit, limit)
| 993 | +     effective_requested_return_limit = 0 if count_only else requested_return_limit
| 994 | +     out: dict[str, Any] = {
| 995 | +         "requested_return_limit": requested_return_limit,
| 996 | +         "applied_return_limit": _clamp_int(
| 997 | +             effective_requested_return_limit,
| 998 | +             default=default_return,
| 999 | +             minimum=0,
| 1000 | +             maximum=max_return,
| 1001 | +         ),
| 1002 | +         "default_limit_used": requested_return_limit is None and not count_only,
| 1003 |     }
| 1004 | +     out["hard_cap_applied"] = (
| 1005 | +         requested_return_limit is not None and out["applied_return_limit"] < requested_return_limit
| 1006 | +     )
| 1007 | +     if scan_cap is not None:
| 1008 | +         out["requested_scan_limit"] = scan_limit
| 1009 | +         out["applied_scan_limit"] = _clamp_int(
| 1010 | +             scan_limit,
| 1011 | +             default=scan_cap,
| 1012 | +             minimum=1,
| 1013 | +             maximum=scan_cap,
| 1014 | +         )
| 1015 | +     return out
| 1016 |
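In short, `_resolve_exhaustive_limits` resolves the requested limit, clamps it into `[0, max_return]` with a policy default, and flags `hard_cap_applied` when the clamp lowered the request. A hedged sketch of that clamping step, using a stand-in clamp helper whose behavior is inferred from `_clamp_int`'s call sites in this diff:

```python
# Stand-in for _clamp_int, assumed from its keyword call sites above:
# None falls back to the default, otherwise the value is forced into range.
def clamp_int(value, *, default, minimum, maximum):
    if value is None:
        return default
    return max(minimum, min(int(value), maximum))

# A caller asking for 5_000 rows against a 2_000-row hard cap:
requested = 5_000
applied = clamp_int(requested, default=1_000, minimum=0, maximum=2_000)
hard_cap_applied = requested is not None and applied < requested
print(applied, hard_cap_applied)  # → 2000 True
```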
+
def _build_exhaustive_meta(
|
| 1018 |
+
*,
|
| 1019 |
+
base_meta: dict[str, Any],
|
| 1020 |
+
limit_plan: dict[str, Any],
|
| 1021 |
+
sample_complete: bool,
|
| 1022 |
+
exact_count: bool,
|
| 1023 |
+
truncated_by: str,
|
| 1024 |
+
more_available: bool | str,
|
| 1025 |
+
requested_max_pages: int | None = None,
|
| 1026 |
+
applied_max_pages: int | None = None,
|
| 1027 |
+
) -> dict[str, Any]:
|
| 1028 |
+
meta = dict(base_meta)
|
| 1029 |
+
applied_return_limit = int(limit_plan["applied_return_limit"])
|
| 1030 |
+
applied_scan_limit = limit_plan.get("applied_scan_limit")
|
| 1031 |
+
meta.update(
|
| 1032 |
+
{
|
| 1033 |
+
"complete": sample_complete,
|
| 1034 |
+
"exact_count": exact_count,
|
| 1035 |
+
"sample_complete": sample_complete,
|
| 1036 |
+
"more_available": more_available,
|
| 1037 |
+
"can_request_more": _derive_can_request_more(
|
| 1038 |
+
sample_complete=sample_complete,
|
| 1039 |
+
truncated_by=truncated_by,
|
| 1040 |
+
),
|
| 1041 |
+
"truncated_by": truncated_by,
|
| 1042 |
+
"next_request_hint": _derive_next_request_hint(
|
| 1043 |
+
truncated_by=truncated_by,
|
| 1044 |
+
more_available=more_available,
|
| 1045 |
+
applied_return_limit=applied_return_limit,
|
| 1046 |
+
applied_scan_limit=applied_scan_limit if isinstance(applied_scan_limit, int) else None,
|
| 1047 |
+
applied_max_pages=applied_max_pages,
|
| 1048 |
+
),
|
| 1049 |
+
}
|
| 1050 |
+
)
|
| 1051 |
+
meta.update(
|
| 1052 |
+
_derive_limit_metadata(
|
| 1053 |
+
requested_return_limit=limit_plan["requested_return_limit"],
|
| 1054 |
+
applied_return_limit=applied_return_limit,
|
| 1055 |
+
default_limit_used=bool(limit_plan["default_limit_used"]),
|
| 1056 |
+
requested_scan_limit=limit_plan.get("requested_scan_limit"),
|
| 1057 |
+
applied_scan_limit=applied_scan_limit if isinstance(applied_scan_limit, int) else None,
|
| 1058 |
+
requested_max_pages=requested_max_pages,
|
| 1059 |
+
applied_max_pages=applied_max_pages,
|
| 1060 |
+
)
|
| 1061 |
+
)
|
| 1062 |
+
return meta
|
| 1063 |
+
|
| 1064 |
+
def _helper_success(
|
| 1065 |
*,
|
| 1066 |
start_calls: int,
|
| 1067 |
source: str,
|
| 1068 |
items: list[dict[str, Any]],
|
|
|
|
| 1069 |
cursor: str | None = None,
|
| 1070 |
+
meta: dict[str, Any] | None = None,
|
| 1071 |
+
**extra_meta: Any,
|
| 1072 |
) -> dict[str, Any]:
|
| 1073 |
+
merged_meta = dict(meta or {})
|
| 1074 |
+
merged_meta.update(extra_meta)
|
| 1075 |
if cursor is not None:
|
| 1076 |
+
merged_meta["cursor"] = cursor
|
| 1077 |
+
|
| 1078 |
return {
|
| 1079 |
"ok": True,
|
| 1080 |
"item": items[0] if len(items) == 1 else None,
|
| 1081 |
"items": items,
|
| 1082 |
+
"meta": _helper_meta(start_calls, source=source, **merged_meta),
|
| 1083 |
"error": None,
|
| 1084 |
}
|
| 1085 |
|
|
|
|
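`_helper_success` now merges an explicit `meta` dict with keyword extras and folds the cursor in before building the shared `ok`/`item`/`items`/`meta`/`error` envelope. A self-contained sketch of that merge order, with `_helper_meta` stood in by a plain dict (an assumption; the real function also records call counts and source):

```python
# Sketch of the helper envelope: explicit meta first, then keyword extras,
# then the cursor, so later values win on key collisions.
def helper_success(items, meta=None, cursor=None, **extra_meta):
    merged = dict(meta or {})
    merged.update(extra_meta)
    if cursor is not None:
        merged["cursor"] = cursor
    return {
        "ok": True,
        "item": items[0] if len(items) == 1 else None,  # single-row convenience
        "items": items,
        "meta": merged,  # stand-in for _helper_meta(...)
        "error": None,
    }

env = helper_success([{"username": "alice"}], meta={"returned": 1}, cursor="abc")
print(env["item"], env["meta"]["cursor"])  # → {'username': 'alice'} abc
```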
| 1431 |
| 1432 |     default_return = _policy_int("hf_org_members", "default_return", 100)
| 1433 |     scan_cap = _policy_int("hf_org_members", "scan_max", FOLLOWERS_SCAN_MAX)
| 1434 | +     limit_plan = _resolve_exhaustive_limits(
| 1435 | +         return_limit=return_limit,
| 1436 | +         limit=limit,
| 1437 | +         count_only=count_only,
| 1438 | +         default_return=default_return,
| 1439 | +         max_return=MAX_EXHAUSTIVE_RETURN_ITEMS,
| 1440 | +         scan_limit=scan_limit,
| 1441 | +         scan_cap=scan_cap,
| 1442 |     )
| 1443 | +     ret_lim = int(limit_plan["applied_return_limit"])
| 1444 | +     scan_lim = int(limit_plan["applied_scan_limit"])
| 1445 | +     hard_cap_applied = bool(limit_plan["hard_cap_applied"])
| 1446 |     has_where = isinstance(where, dict) and bool(where)
| 1447 |
| 1448 |     overview_total: int | None = None

| 1458 |     sample_complete = overview_total == 0
| 1459 |     more_available = False if sample_complete else True
| 1460 |     truncated_by = _derive_truncated_by(return_limit_hit=overview_total > 0)
| 1461 | +     meta = _build_exhaustive_meta(
| 1462 | +         base_meta={
| 1463 | +             "scanned": 1,
| 1464 | +             "matched": overview_total,
| 1465 | +             "returned": 0,
| 1466 | +             "total": overview_total,
| 1467 | +             "total_available": overview_total,
| 1468 | +             "total_matched": overview_total,
| 1469 | +             "truncated": not sample_complete,
| 1470 | +             "count_source": "overview",
| 1471 | +             "organization": org,
| 1472 | +         },
| 1473 | +         limit_plan=limit_plan,
| 1474 | +         sample_complete=sample_complete,
| 1475 | +         exact_count=True,
| 1476 | +         truncated_by=truncated_by,
| 1477 | +         more_available=more_available,
| 1478 | +     )
| 1479 | +     return _helper_success(start_calls=start_calls, source=overview_source, items=[], meta=meta)
| 1480 |
| 1481 |     endpoint = f"/api/organizations/{org}/members"
| 1482 |     try:
| 1534 |     items = _project_items(
| 1535 |         items,
| 1536 |         fields,
| 1537 | +         aliases=_USER_FIELD_ALIASES,
| 1538 | +     )
| 1539 | +     meta = _build_exhaustive_meta(
| 1540 | +         base_meta={
| 1541 | +             "scanned": observed_total,
| 1542 | +             "matched": len(normalized),
| 1543 | +             "returned": len(items),
| 1544 | +             "total": total,
| 1545 | +             "total_available": total_available,
| 1546 | +             "total_matched": total_matched,
| 1547 | +             "truncated": truncated,
| 1548 | +             "count_source": count_source,
| 1549 | +             "lower_bound": bool(has_where and not exact_count),
| 1550 | +             "organization": org,
| 1551 |         },
| 1552 | +         limit_plan=limit_plan,
| 1553 | +         sample_complete=sample_complete,
| 1554 | +         exact_count=exact_count,
| 1555 | +         truncated_by=truncated_by,
| 1556 | +         more_available=more_available,
| 1557 |     )
| 1558 | +     return _helper_success(start_calls=start_calls, source=endpoint, items=items, meta=meta)
| 1559 |
| 1560 | async def hf_repo_search(
| 1561 |     query: str | None = None,
| 1747 |     if not u:
| 1748 |         return _helper_error(start_calls=start_calls, source=f"/api/users/<u>/{kind}", error="username is required")
| 1749 |
| 1750 | +     limit_plan = _resolve_exhaustive_limits(
| 1751 | +         return_limit=return_limit,
| 1752 | +         limit=limit,
| 1753 | +         count_only=count_only,
| 1754 | +         default_return=default_return,
| 1755 | +         max_return=MAX_EXHAUSTIVE_RETURN_ITEMS,
| 1756 | +         scan_limit=scan_limit,
| 1757 | +         scan_cap=scan_cap,
| 1758 |     )
| 1759 | +     ret_lim = int(limit_plan["applied_return_limit"])
| 1760 | +     scan_lim = int(limit_plan["applied_scan_limit"])
| 1761 | +     hard_cap_applied = bool(limit_plan["hard_cap_applied"])
| 1762 |     has_where = isinstance(where, dict) and bool(where)
| 1763 |     filtered = (pro_only is not None) or has_where
| 1764 |

| 1794 |     sample_complete = overview_total == 0
| 1795 |     more_available = False if sample_complete else True
| 1796 |     truncated_by = _derive_truncated_by(return_limit_hit=overview_total > 0)
| 1797 | +     meta = _build_exhaustive_meta(
| 1798 | +         base_meta={
| 1799 | +             "scanned": 1,
| 1800 | +             "matched": overview_total,
| 1801 | +             "returned": 0,
| 1802 | +             "total": overview_total,
| 1803 | +             "total_available": overview_total,
| 1804 | +             "total_matched": overview_total,
| 1805 | +             "truncated": not sample_complete,
| 1806 | +             "count_source": "overview",
| 1807 | +             "relation": kind,
| 1808 | +             "pro_only": pro_only,
| 1809 | +             "where_applied": has_where,
| 1810 | +             "entity": u,
| 1811 | +             "entity_type": entity_type,
| 1812 | +             "username": u,
| 1813 | +         },
| 1814 | +         limit_plan=limit_plan,
| 1815 | +         sample_complete=sample_complete,
| 1816 | +         exact_count=True,
| 1817 | +         truncated_by=truncated_by,
| 1818 | +         more_available=more_available,
| 1819 | +     )
| 1820 |     if entity_type == "organization":
| 1821 |         meta["organization"] = u
| 1822 | +     return _helper_success(
| 1823 |         start_calls=start_calls,
| 1824 |         source=overview_source,
| 1825 |         items=[],
| 1900 |     items = _project_items(
| 1901 |         items,
| 1902 |         fields,
| 1903 | +         aliases=_USER_FIELD_ALIASES,
| 1904 | +     )
| 1905 | +     meta = _build_exhaustive_meta(
| 1906 | +         base_meta={
| 1907 | +             "scanned": observed_total,
| 1908 | +             "matched": len(normalized),
| 1909 | +             "returned": len(items),
| 1910 | +             "total": total,
| 1911 | +             "total_available": total_available,
| 1912 | +             "total_matched": total_matched,
| 1913 | +             "truncated": truncated,
| 1914 | +             "count_source": count_source,
| 1915 | +             "lower_bound": bool(filtered and not exact_count),
| 1916 | +             "relation": kind,
| 1917 | +             "pro_only": pro_only,
| 1918 | +             "where_applied": has_where,
| 1919 | +             "entity": u,
| 1920 | +             "entity_type": entity_type,
| 1921 | +             "username": u,
| 1922 |         },
| 1923 | +         limit_plan=limit_plan,
| 1924 | +         sample_complete=sample_complete,
| 1925 | +         exact_count=exact_count,
| 1926 | +         truncated_by=truncated_by,
| 1927 | +         more_available=more_available,
| 1928 |     )
| 1929 |     if entity_type == "organization":
| 1930 |         meta["organization"] = u
| 1931 | +     return _helper_success(
| 1932 |         start_calls=start_calls,
| 1933 |         source=endpoint,
| 1934 |         items=items,
| 2184 |         error="sort must be one of likedAt, repoLikes, repoDownloads",
| 2185 |     )
| 2186 |
| 2187 | +     limit_plan = _resolve_exhaustive_limits(
| 2188 | +         return_limit=return_limit,
| 2189 | +         limit=limit,
| 2190 | +         count_only=count_only,
| 2191 | +         default_return=default_return,
| 2192 | +         max_return=MAX_EXHAUSTIVE_RETURN_ITEMS,
| 2193 | +         scan_limit=scan_limit,
| 2194 | +         scan_cap=scan_cap,
| 2195 |     )
| 2196 | +     ret_lim = int(limit_plan["applied_return_limit"])
| 2197 | +     scan_lim = int(limit_plan["applied_scan_limit"])
| 2198 | +     hard_cap_applied = bool(limit_plan["hard_cap_applied"])
| 2199 |
| 2200 |     allowed_repo_types: set[str] | None = None
| 2201 |     try:
| 2363 |
if scan_limit_hit:
|
| 2364 |
more_available = "unknown" if (allowed_repo_types is not None or where) else True
|
| 2365 |
|
| 2366 |
+
meta = _build_exhaustive_meta(
|
| 2367 |
+
base_meta={
|
| 2368 |
+
"scanned": len(scanned_rows),
|
| 2369 |
+
"matched": matched,
|
| 2370 |
+
"returned": len(items),
|
| 2371 |
+
"total": total,
|
| 2372 |
+
"total_available": len(payload),
|
| 2373 |
+
"total_matched": total_matched,
|
| 2374 |
+
"truncated": truncated,
|
| 2375 |
+
"count_source": "scan",
|
| 2376 |
+
"lower_bound": not exact_count,
|
| 2377 |
+
"enriched": enriched,
|
| 2378 |
+
"popularity_present": popularity_present,
|
| 2379 |
+
"sort_applied": sort_key,
|
| 2380 |
+
"ranking_window": effective_ranking_window,
|
| 2381 |
+
"ranking_complete": ranking_complete,
|
| 2382 |
+
"username": resolved_username,
|
| 2383 |
+
},
|
| 2384 |
+
limit_plan=limit_plan,
|
| 2385 |
+
sample_complete=sample_complete,
|
| 2386 |
+
exact_count=exact_count,
|
| 2387 |
+
truncated_by=truncated_by,
|
| 2388 |
+
more_available=more_available,
|
| 2389 |
+
)
|
| 2390 |
+
return _helper_success(
|
| 2391 |
+
start_calls=start_calls,
|
| 2392 |
+
source=endpoint,
|
| 2393 |
+
items=items,
|
| 2394 |
+
meta=meta,
|
| 2395 |
+
)
|
| 2396 |
+
|
| 2397 |
+
async def hf_repo_likers(
|
| 2398 |
+
repo_id: str,
|
| 2399 |
+
repo_type: str,
|
| 2400 |
+
return_limit: int | None = None,
|
| 2401 |
+
limit: int | None = None,
|
| 2402 |
+
count_only: bool = False,
|
| 2403 |
+
pro_only: bool | None = None,
|
| 2404 |
+
where: dict[str, Any] | None = None,
|
| 2405 |
+
fields: list[str] | None = None,
|
| 2406 |
+
) -> dict[str, Any]:
|
| 2407 |
+
start_calls = call_count["n"]
|
| 2408 |
+
rid = str(repo_id or "").strip()
|
| 2409 |
+
if not rid:
|
| 2410 |
+
return _helper_error(start_calls=start_calls, source="/api/repos/<repo>/likers", error="repo_id is required")
|
| 2411 |
+
|
| 2412 |
+
rt = _canonical_repo_type(repo_type, default="")
|
| 2413 |
+
if rt not in {"model", "dataset", "space"}:
|
| 2414 |
+
return _helper_error(
|
| 2415 |
+
start_calls=start_calls,
|
| 2416 |
+
source=f"/api/repos/{rid}/likers",
|
| 2417 |
+
error=f"Unsupported repo_type '{repo_type}'",
|
| 2418 |
+
repo_id=rid,
|
| 2419 |
+
)
|
| 2420 |
+
|
| 2421 |
+
default_return = _policy_int("hf_repo_likers", "default_return", 1_000)
|
| 2422 |
+
requested_return_limit = _resolve_requested_limit(return_limit, limit)
|
| 2423 |
+
default_limit_used = requested_return_limit is None and not count_only
|
| 2424 |
+
has_where = isinstance(where, dict) and bool(where)
|
| 2425 |
+
|
| 2426 |
+
endpoint = f"/api/{rt}s/{rid}/likers"
|
| 2427 |
+
resp = _host_raw_call(endpoint)
|
| 2428 |
+
if not resp.get("ok"):
|
| 2429 |
+
return _helper_error(
|
| 2430 |
+
start_calls=start_calls,
|
| 2431 |
+
```diff
             source=endpoint,
+            error=resp.get("error") or "repo likers fetch failed",
+            repo_id=rid,
+            repo_type=rt,
+        )
+
+    payload = resp.get("data") if isinstance(resp.get("data"), list) else []
+    normalized: list[dict[str, Any]] = []
+    for row in payload:
+        if not isinstance(row, dict):
+            continue
+        username = row.get("user") or row.get("username")
+        if not isinstance(username, str) or not username:
+            continue
+        item = {
+            "username": username,
+            "fullname": row.get("fullname"),
+            "type": row.get("type") if isinstance(row.get("type"), str) and row.get("type") else "user",
+            "isPro": row.get("isPro"),
+        }
+        if pro_only is True and item.get("isPro") is not True:
+            continue
+        if pro_only is False and item.get("isPro") is True:
+            continue
+        if not _item_matches_where(item, where):
+            continue
+        normalized.append(item)
+
+    # /likers is a one-shot full-list endpoint: the Hub returns the liker rows in a
+    # single response with no cursor/scan continuation. Keep the default output compact,
+    # but do not apply the generic exhaustive hard cap here because it does not improve
+    # upstream coverage or cost; the full liker set has already been fetched.
+    if count_only:
+        ret_lim = 0
+    elif requested_return_limit is None:
+        ret_lim = default_return
+    else:
+        try:
+            ret_lim = max(0, int(requested_return_limit))
+        except Exception:
+            ret_lim = default_return
+    limit_plan = {
+        "requested_return_limit": requested_return_limit,
+        "applied_return_limit": ret_lim,
+        "default_limit_used": default_limit_used,
+        "hard_cap_applied": False,
+    }
+
+    matched = len(normalized)
+    items = [] if count_only else normalized[:ret_lim]
+    return_limit_hit = ret_lim > 0 and matched > ret_lim
+    truncated_by = _derive_truncated_by(
+        hard_cap=False,
+        return_limit_hit=return_limit_hit,
+    )
+    sample_complete = matched <= ret_lim and (not count_only or matched == 0)
+    truncated = truncated_by != "none"
+    more_available = _derive_more_available(
+        sample_complete=sample_complete,
+        exact_count=True,
+        returned=len(items),
+        total=matched,
+    )
+
+    items = _project_items(
+        items,
+        fields,
+        aliases=_ACTOR_FIELD_ALIASES,
+    )
+
+    meta = _build_exhaustive_meta(
+        base_meta={
+            "scanned": len(payload),
+            "matched": matched,
+            "returned": len(items),
+            "total": matched,
+            "total_available": len(payload),
+            "total_matched": matched,
+            "truncated": truncated,
+            "count_source": "likers_list",
+            "lower_bound": False,
+            "repo_id": rid,
+            "repo_type": rt,
+            "pro_only": pro_only,
+            "where_applied": has_where,
+            "upstream_pagination": "none",
+        },
+        limit_plan=limit_plan,
+        sample_complete=sample_complete,
+        exact_count=True,
+        truncated_by=truncated_by,
+        more_available=more_available,
+    )
+    meta["hard_cap_applied"] = False
+    return _helper_success(
         start_calls=start_calls,
         source=endpoint,
         items=items,
```
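The new likers hunk resolves the caller's return limit and then derives truncation flags purely from that limit, since the upstream list is already complete. A minimal self-contained sketch of that bookkeeping; `resolve_return_limit` and `slice_with_meta` are hypothetical stand-ins for illustration, not functions from this module:

```python
from typing import Any


def resolve_return_limit(requested: Any, *, count_only: bool, default_return: int = 20) -> int:
    # Mirrors the hunk above: count_only keeps zero rows, a missing limit falls
    # back to the default, and a non-coercible limit also falls back.
    if count_only:
        return 0
    if requested is None:
        return default_return
    try:
        return max(0, int(requested))
    except Exception:
        return default_return


def slice_with_meta(rows: list[dict[str, Any]], ret_lim: int) -> tuple[list[dict[str, Any]], dict[str, Any]]:
    # The liker list is already fully fetched, so the only possible truncation
    # cause is the caller's return limit, never upstream pagination.
    matched = len(rows)
    items = rows[:ret_lim]
    return_limit_hit = ret_lim > 0 and matched > ret_lim
    meta = {
        "matched": matched,
        "returned": len(items),
        "truncated_by": "return_limit" if return_limit_hit else "none",
        "more_available": matched > len(items),
    }
    return items, meta


likers = [{"username": f"u{i}", "isPro": i % 2 == 0} for i in range(5)]
items, meta = slice_with_meta(likers, resolve_return_limit(3, count_only=False))
print(len(items), meta["truncated_by"], meta["more_available"])  # → 3 return_limit True
```

This keeps `matched` (the full filtered count) exact even when fewer rows are returned, which is what lets the helper report `exact_count=True` alongside a truncated sample.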
```diff
     if start_cursor is None:
         start_cursor = startCursor or cursor
 
     requested_max_pages = max_pages
 
     if isinstance(username, str) and username.strip():
         entity = username.strip()
```

```diff
     if not ent:
         return _helper_error(start_calls=start_calls, source="/api/recent-activity", error="entity is required")
 
+    limit_plan = _resolve_exhaustive_limits(
+        return_limit=return_limit,
+        limit=limit,
+        count_only=count_only,
+        default_return=default_return,
+        max_return=MAX_EXHAUSTIVE_RETURN_ITEMS,
+    )
+    ret_lim = int(limit_plan["applied_return_limit"])
     page_lim = page_cap
     pages_lim = _clamp_int(requested_max_pages, default=pages_cap, minimum=1, maximum=pages_cap)
+    hard_cap_applied = bool(limit_plan["hard_cap_applied"])
 
     type_filter = {str(t).strip().lower() for t in (activity_types or []) if str(t).strip()}
     repo_filter = {_canonical_repo_type(t, default="") for t in (repo_types or []) if str(t).strip()}
```

```diff
     elif stopped_for_budget and not exact_count:
         more_available = "unknown"
 
+    meta = _build_exhaustive_meta(
+        base_meta={
+            "scanned": scanned,
+            "matched": matched,
+            "returned": len(items),
+            "total": matched,
+            "total_matched": matched,
+            "pages": pages,
+            "truncated": truncated,
+            "count_source": "scan" if exact_count else "none",
+            "lower_bound": not exact_count,
+            "page_limit": page_lim,
+            "stopped_for_budget": stopped_for_budget,
+            "feed_type": ft,
+            "entity": ent,
+        },
+        limit_plan=limit_plan,
+        sample_complete=sample_complete,
+        exact_count=exact_count,
+        truncated_by=truncated_by,
+        more_available=more_available,
         requested_max_pages=requested_max_pages,
         applied_max_pages=pages_lim,
+    )
+    return _helper_success(
         start_calls=start_calls,
         source="/api/recent-activity",
         items=items,
```
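The recent-activity hunk funnels the caller's `max_pages` through `_clamp_int(...)`, whose body is not part of this diff. A plausible sketch of that clamping contract, under the assumption (from the call site) that it falls back to a default on bad input and otherwise clamps into `[minimum, maximum]` — `clamp_int` here is a hypothetical stand-in, not the module's actual implementation:

```python
from typing import Any


def clamp_int(value: Any, *, default: int, minimum: int, maximum: int) -> int:
    # Hypothetical stand-in for the _clamp_int used at the call site: non-integer
    # or missing input falls back to the default; valid input is clamped in range.
    try:
        n = int(value)
    except (TypeError, ValueError):
        return default
    return max(minimum, min(maximum, n))


print(clamp_int(None, default=5, minimum=1, maximum=10))  # → 5
print(clamp_int(25, default=5, minimum=1, maximum=10))    # → 10
print(clamp_int(0, default=5, minimum=1, maximum=10))     # → 1
```

Keeping both the requested and the applied page count (as `requested_max_pages` vs `pages_lim` above) is what lets the helper's `meta` report when the caller's ask was silently reduced to the page cap.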
```diff
     "hf_repo_search": _collecting_wrapper("hf_repo_search", hf_repo_search),
     "hf_user_summary": _collecting_wrapper("hf_user_summary", hf_user_summary),
     "hf_user_graph": _collecting_wrapper("hf_user_graph", hf_user_graph),
+    "hf_repo_likers": _collecting_wrapper("hf_repo_likers", hf_repo_likers),
     "hf_user_likes": _collecting_wrapper("hf_user_likes", hf_user_likes),
     "hf_recent_activity": _collecting_wrapper("hf_recent_activity", hf_recent_activity),
     "hf_repo_discussions": _collecting_wrapper("hf_repo_discussions", hf_repo_discussions),
```