razvan committed on
Commit c0744db · verified · 1 Parent(s): c5bfec7

Upload plugins/mlintern/skills/hf-model-search/SKILL.md with huggingface_hub

plugins/mlintern/skills/hf-model-search/SKILL.md ADDED
@@ -0,0 +1,48 @@
+ ---
+ name: hf-model-search
+ description: "Search the Hugging Face Hub for models and verify metadata, architecture, tokenizer, and task fit before using a model."
+ disable-model-invocation: false
+ ---
+
+ # hf-model-search: Hugging Face Model Discovery
+
+ ## Purpose
+
+ Find and validate Hugging Face models before using them in training, inference, or evaluation.
+
+ ## Tools
+
+ Use the following tools to discover and inspect models:
+
+ - `model_search`: Search HF Hub models by task, author, tags, or query string.
+ - `hub_repo_details`: Get detailed metadata for a model repo: README, tags, downloads, likes, library, task, and config files.
+
+ ## Workflow
+
+ 1. Search for candidate models with `model_search`.
+ 2. Inspect promising candidates with `hub_repo_details` (set `repo_type="model"`, `include_readme=true`).
+ 3. Verify:
+    - Model architecture matches the intended task.
+    - Tokenizer and config files are present.
+    - License is compatible with the intended use.
+    - Gating and access-token requirements are understood.
+    - Download and like counts suggest the model is in real use (a weak signal, not proof of quality).
+ 4. If the task requires a specific model variant (quantized, fine-tuned, GGUF), note the specific branch or commit.
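
Step 4 can be made concrete by recording a resolved commit hash instead of a branch name, since a branch can move after it is noted. The following is a minimal sketch, not part of the skill itself. It assumes the `huggingface_hub` Python client is used outside the agent tool environment; `PinnedModel` and `pin_revision` are hypothetical names, and `api` only needs to behave like `huggingface_hub.HfApi` in that `model_info(repo_id, revision=...)` returns an object with a `sha` attribute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedModel:
    """A model reference pinned to one commit (hypothetical helper type)."""

    repo_id: str
    revision: str  # commit hash reported by the Hub


def pin_revision(api, repo_id, ref="main"):
    """Resolve a branch or tag name to the commit hash it currently points at.

    `api` is assumed to be a `huggingface_hub.HfApi` instance (or a stand-in
    with the same `model_info` method).
    """
    info = api.model_info(repo_id, revision=ref)
    if not info.sha:
        raise ValueError(f"no commit hash returned for {repo_id}@{ref}")
    return PinnedModel(repo_id=repo_id, revision=info.sha)
```

With a real client this would be called as `pin_revision(HfApi(), "sentence-transformers/all-MiniLM-L6-v2")`, and the resulting `revision` can be passed as the `revision=` argument of loaders such as `snapshot_download` or `from_pretrained`.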
+
+ ## Example
+
+ ```
+ model_search(query="sentence embedding", task="sentence-similarity", sort="downloads")
+ hub_repo_details(repo_ids=["sentence-transformers/all-MiniLM-L6-v2"], repo_type="model", include_readme=true)
+ ```
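
For scripts that run without the two tools, roughly the same search-then-inspect flow can be sketched with the `huggingface_hub` Python client. This is an assumed equivalent, not the tools' implementation: `search_candidates` and `summarize_repo` are hypothetical helper names, `api` is expected to be a `huggingface_hub.HfApi` instance, and the `pipeline_tag` argument of `list_models` is assumed to be available in the installed client version.

```python
def search_candidates(api, query, task, limit=5):
    """Return repo ids of models matching a query and task.

    Assumes `api.list_models` accepts `search`, `pipeline_tag`, `sort`, and
    `limit`, as `huggingface_hub.HfApi.list_models` does in recent versions.
    The Hub is assumed to return the most-downloaded models first; older
    client versions may also need `direction=-1`.
    """
    models = api.list_models(
        search=query, pipeline_tag=task, sort="downloads", limit=limit
    )
    return [m.id for m in models]


def summarize_repo(api, repo_id):
    """Collect the metadata fields this skill asks the reader to inspect."""
    info = api.model_info(repo_id)
    return {
        "repo_id": info.id,
        "task": info.pipeline_tag,
        "library": info.library_name,
        "tags": list(info.tags or []),
        "downloads": info.downloads,
        "likes": info.likes,
        "gated": info.gated,
        # `siblings` lists the files in the repo; it may be None.
        "files": [s.rfilename for s in (info.siblings or [])],
    }
```

Passing the client in as `api` keeps the helpers easy to exercise with a stub object, which is useful when network access to the Hub is not available.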
+
+ ## Validation Checklist
+
+ Before depending on a model:
+ - [ ] Repo exists and is not a redirect.
+ - [ ] `config.json` or equivalent config is present.
+ - [ ] Tokenizer files (`tokenizer_config.json` and a vocabulary file) are present.
+ - [ ] Task tags match the intended use.
+ - [ ] Library tag (transformers, sentence-transformers, etc.) is compatible.
+ - [ ] License is acceptable.
+ - [ ] Gating is handled (access request submitted, access token set).
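
The checklist can also be expressed as a small function over already-fetched metadata, which makes the checks repeatable. The sketch below is illustrative only: `validate_model`, the shape of the `meta` dictionary, and the `VOCAB_FILES` set are assumptions made for this example, the list of vocabulary filenames is not exhaustive, and the license check relies on the Hub convention of exposing licenses as `license:<name>` tags.

```python
# Common tokenizer vocabulary filenames (assumed, not exhaustive).
VOCAB_FILES = {"tokenizer.json", "vocab.txt", "vocab.json", "tokenizer.model"}


def validate_model(requested_id, meta, *, task, libraries, licenses, has_token=False):
    """Return a list of checklist failures; an empty list means all checks passed.

    `meta` is a hypothetical dict with keys: repo_id, task, library, tags,
    gated, files.
    """
    problems = []
    files = set(meta.get("files", []))
    tags = set(meta.get("tags", []))

    # A differing id is treated as a sign of a rename or redirect (assumption).
    if meta.get("repo_id") != requested_id:
        problems.append("repo id differs from the requested id (possible redirect)")
    if "config.json" not in files:
        problems.append("config.json is missing")
    if "tokenizer_config.json" not in files or not (files & VOCAB_FILES):
        problems.append("tokenizer files are missing")
    if meta.get("task") != task:
        problems.append(f"task is {meta.get('task')!r}, expected {task!r}")
    if meta.get("library") not in libraries:
        problems.append(f"library {meta.get('library')!r} is not supported")
    if not any(f"license:{name}" in tags for name in licenses):
        problems.append("no acceptable license tag found")
    # `gated` is falsy for open repos and truthy for gated ones.
    if meta.get("gated") and not has_token:
        problems.append("model is gated and no access token is configured")
    return problems
```

Returning the failures as a list, rather than raising on the first one, lets a caller report every unmet checklist item for a candidate in one pass.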