razvan committed · Commit 5b68ff9 · verified · 1 Parent(s): 822b4fb

Upload plugins/mlintern/commands/run.md with huggingface_hub

Files changed (1)
  1. plugins/mlintern/commands/run.md +46 -0
plugins/mlintern/commands/run.md ADDED
@@ -0,0 +1,46 @@
+ # /mlintern:run
+
+ Run an ML Intern task end-to-end.
+
+ ## Arguments
+
+ - `prompt` (required): A one-sentence description of the ML deliverable. Examples: "fine-tune Qwen3-4B for code completion on python-code-dataset", "benchmark sentence-transformers/all-MiniLM-L6-v2 on STS-B", "train a diffusion LoRA on my art dataset".
+ - `--model` (optional): LiteLLM model ID to use (e.g., `huggingface/openai/gpt-oss-120b`). Defaults to the model configured in the environment.
+ - `--background` (optional): Queue the task and return immediately; check on it later with `--status`.
+ - `--status <job-id>` (optional): Check the status of a background job.
+ - `--result <job-id>` (optional): Fetch the final report of a completed background job.
+ - `--cancel <job-id>` (optional): Cancel a running background job.
+
+ ## Workflow
+
+ 1. Clarify the deliverable from the prompt.
+ 2. Research the task before writing code:
+    - Search for landmark and recent papers if the task is novel.
+    - Read the Hugging Face docs for current API patterns.
+    - Find a working implementation example.
+ 3. Validate inputs:
+    - Inspect the dataset schema, splits, and sample rows.
+    - Verify that the model repo exists, the architecture matches the task, and a tokenizer is available.
+ 4. Implement the smallest working version.
+ 5. Smoke-test locally or in a small HF Job.
+ 6. Run the full training/evaluation job with HF Jobs.
+ 7. Evaluate results against the target.
+ 8. Save code, configs, and reports; publish ML artifacts to Hugging Face.
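
As one way to picture the dataset half of step 3, the sketch below checks a handful of sample rows for the splits and columns a task expects. It is a minimal, library-free illustration: the `validate_dataset_sample` helper, its parameters, and the example column names are all hypothetical, and it assumes the sample rows have already been fetched as plain dictionaries keyed by split name.

```python
def validate_dataset_sample(splits, required_columns, required_splits=("train",)):
    """Return a list of problems found; an empty list means the sample looks usable.

    splits: mapping of split name -> list of sample rows (each row is a dict).
    """
    problems = []
    # Every split the task depends on must be present.
    for split_name in required_splits:
        if split_name not in splits:
            problems.append(f"missing split: {split_name}")
    # Every sampled row must carry each required column with a non-empty value.
    for split_name, rows in splits.items():
        if not rows:
            problems.append(f"split '{split_name}' has no sample rows")
            continue
        for index, row in enumerate(rows):
            for column in required_columns:
                if column not in row:
                    problems.append(
                        f"split '{split_name}' row {index}: missing column '{column}'"
                    )
                elif row[column] in (None, ""):
                    problems.append(
                        f"split '{split_name}' row {index}: empty column '{column}'"
                    )
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report every incompatibility at once when asking for approval to proceed.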
+
+ ## Output
+
+ Return:
+ - Deliverable status (complete / partial / failed).
+ - GitHub branch, commit, PR, or report path for code.
+ - Hugging Face model/dataset/Space URLs for published artifacts.
+ - Job ID and log URL for HF Jobs runs.
+ - Metrics and evaluation results when available.
+ - Known failures, compromises, and next recommended steps.
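
The return payload listed above could be modeled as a small record type. The sketch below is an assumption about shape only: the `RunReport` class and all of its field names are hypothetical, chosen to mirror the bullets one-to-one, and the source does not prescribe any serialization format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

ALLOWED_STATUSES = ("complete", "partial", "failed")


@dataclass
class RunReport:
    """Hypothetical container for what /mlintern:run returns."""
    status: str                                             # complete / partial / failed
    code_ref: Optional[str] = None                          # branch, commit, PR, or report path
    artifact_urls: list[str] = field(default_factory=list)  # model/dataset/Space URLs
    job_id: Optional[str] = None                            # HF Jobs run identifier
    job_log_url: Optional[str] = None
    metrics: dict[str, float] = field(default_factory=dict)
    known_issues: list[str] = field(default_factory=list)   # failures and compromises
    next_steps: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject any status outside the three documented values.
        if self.status not in ALLOWED_STATUSES:
            raise ValueError(f"status must be one of {ALLOWED_STATUSES}")

    def to_json(self) -> str:
        """Serialize the report so it can be saved alongside code and configs."""
        return json.dumps(asdict(self), indent=2)
```

Optional fields default to empty values so that a failed run can still produce a well-formed report containing only a status and its known issues.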
+
+ ## Guardrails
+
+ - Never silently substitute a dataset, model, or training method. Ask for approval if the original request is incompatible.
+ - Always set realistic timeouts for HF Jobs (at least 2 hours for real training).
+ - Always include `push_to_hub=True` and `hub_model_id` in training configs.
+ - Run a single job before launching sweeps or ablations.
+ - On out-of-memory (OOM) errors: reduce the batch size and raise gradient accumulation to compensate, enable gradient checkpointing, or upgrade hardware. Do not change the requested method.
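
Two of these guardrails lend themselves to a mechanical pre-flight check, sketched below under stated assumptions. The helper names (`check_guardrails`, `shrink_batch_for_oom`), the plain-dict config, and the batch-size parameter names are hypothetical. Only `push_to_hub`, `hub_model_id`, and the 2-hour floor come from the text above. Halving the batch while doubling accumulation is one possible reading of the OOM guidance, picked because it keeps the effective batch size unchanged.

```python
MIN_TRAINING_TIMEOUT_SECONDS = 2 * 60 * 60  # "at least 2 hours for real training"


def check_guardrails(training_config: dict, timeout_seconds: int) -> list[str]:
    """Return guardrail violations for a training config; empty means OK."""
    violations = []
    if training_config.get("push_to_hub") is not True:
        violations.append("push_to_hub must be True")
    if not training_config.get("hub_model_id"):
        violations.append("hub_model_id must be set")
    if timeout_seconds < MIN_TRAINING_TIMEOUT_SECONDS:
        violations.append("timeout is below 2 hours")
    return violations


def shrink_batch_for_oom(per_device_batch_size: int,
                         gradient_accumulation_steps: int) -> tuple[int, int]:
    """Halve the per-device batch and double accumulation.

    The product of the two values (the effective batch size per device)
    stays the same, so the requested training method is not changed.
    """
    if per_device_batch_size < 2 or per_device_batch_size % 2 != 0:
        raise ValueError(
            "batch size cannot be halved evenly; consider gradient "
            "checkpointing or larger hardware instead"
        )
    return per_device_batch_size // 2, gradient_accumulation_steps * 2
```

For example, a batch size of 16 with 2 accumulation steps becomes 8 with 4, leaving the effective batch size at 32.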