Upload plugins/mlintern/commands/run.md with huggingface_hub
plugins/mlintern/commands/run.md
ADDED
@@ -0,0 +1,46 @@
# /mlintern:run

Run an ML Intern task end-to-end.

## Arguments

- `prompt` (required): A one-sentence description of the ML deliverable. Examples: "fine-tune Qwen3-4B for code completion on python-code-dataset", "benchmark sentence-transformers/all-MiniLM-L6-v2 on STS-B", "train a diffusion LoRA on my art dataset".
- `--model` (optional): LiteLLM model ID to use (e.g., `huggingface/openai/gpt-oss-120b`). Defaults to the environment default.
- `--background` (optional): Queue the task and return immediately. Check status later.
- `--status <job-id>` (optional): Check status of a background job.
- `--result <job-id>` (optional): Fetch the final report of a completed background job.
- `--cancel <job-id>` (optional): Cancel a running background job.
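The argument list above could be modeled roughly as follows. This is a minimal sketch using Python's `argparse`, not the plugin's actual parser (which the source does not show). The functions `build_parser` and `parse` are hypothetical. It also assumes that the job-management flags are mutually exclusive and that `prompt` is only needed when starting a new task.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(prog="/mlintern:run")
    # The prompt is positional. It is optional at the parser level because
    # --status/--result/--cancel act on an existing job (an assumption).
    parser.add_argument("prompt", nargs="?", help="One-sentence ML deliverable")
    parser.add_argument("--model", default=None, help="LiteLLM model ID")
    parser.add_argument("--background", action="store_true")
    # Assumption: only one job-management flag may be given per call.
    job = parser.add_mutually_exclusive_group()
    job.add_argument("--status", metavar="JOB_ID")
    job.add_argument("--result", metavar="JOB_ID")
    job.add_argument("--cancel", metavar="JOB_ID")
    return parser


def parse(argv):
    """Parse argv and enforce that new tasks come with a prompt."""
    parser = build_parser()
    args = parser.parse_args(argv)
    managing_job = args.status or args.result or args.cancel
    if not managing_job and not args.prompt:
        parser.error("prompt is required when starting a new task")
    return args
```

For example, `parse(["benchmark my model", "--background"])` yields a namespace with `background` set to `True` and `model` left as `None`, so the environment default would apply.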

## Workflow

1. Clarify the deliverable from the prompt.
2. Research the task before writing code:
   - Search for landmark and recent papers if the task is novel.
   - Read HF docs for current API patterns.
   - Find a working implementation example.
3. Validate inputs:
   - Inspect dataset schema, splits, sample rows.
   - Verify model repo exists, architecture matches, tokenizer available.
4. Implement the smallest working version.
5. Smoke test locally or in a small HF Job.
6. Run the full training/evaluation job with HF Jobs.
7. Evaluate results against the target.
8. Save code, configs, and reports; publish ML artifacts to Hugging Face.
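The dataset check in step 3 might look something like the sketch below. It is illustrative only: `check_schema` is a hypothetical helper, and it works on plain in-memory rows (a dict mapping split names to lists of row dicts) rather than on a real dataset loader. In practice the sample rows would come from the dataset named in the prompt.

```python
def check_schema(splits, required_columns):
    """Return a list of problems found in sample rows.

    `splits` maps a split name to a list of row dicts (hypothetical shape).
    An empty list means every required column is present in every sample row.
    """
    problems = []
    for name, rows in splits.items():
        if not rows:
            problems.append(f"split '{name}' has no sample rows")
            continue
        for column in required_columns:
            # Count rows where the column is absent or explicitly None.
            missing = sum(1 for row in rows if row.get(column) is None)
            if missing:
                problems.append(
                    f"split '{name}': column '{column}' missing in "
                    f"{missing} of {len(rows)} rows"
                )
    return problems
```

Returning a list of problems, rather than raising on the first one, lets the whole set be reported at once before any job is launched.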

## Output

Return:
- Deliverable status (complete / partial / failed).
- GitHub branch, commit, PR, or report path for code.
- Hugging Face model/dataset/Space URLs for published artifacts.
- Job ID and log URL for HF Jobs runs.
- Metrics and evaluation results when available.
- Known failures, compromises, and next recommended steps.
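One possible way to hold the fields listed above is a small dataclass. This is a sketch under assumptions: the class name `RunReport` and all field names are hypothetical, and the source does not specify a serialization format (JSON is used here only for illustration).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional

VALID_STATUSES = ("complete", "partial", "failed")


@dataclass
class RunReport:
    """Hypothetical container for the items returned by a run."""

    status: str                          # complete / partial / failed
    code_location: Optional[str] = None  # branch, commit, PR, or report path
    artifact_urls: List[str] = field(default_factory=list)
    job_id: Optional[str] = None
    log_url: Optional[str] = None
    metrics: Dict[str, float] = field(default_factory=dict)
    known_issues: List[str] = field(default_factory=list)
    next_steps: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject any status outside the three documented values.
        if self.status not in VALID_STATUSES:
            raise ValueError(f"status must be one of {VALID_STATUSES}")

    def to_json(self) -> str:
        """Serialize the report so it can be saved alongside code and configs."""
        return json.dumps(asdict(self), indent=2)
```

Optional fields default to empty values so that a failed run can still produce a report with only its status and known issues.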

## Guardrails

- Never silently substitute a dataset, model, or training method. Ask for approval if the original request is incompatible.
- Always set realistic timeouts for HF Jobs (at least 2 hours for real training).
- Always include `push_to_hub=True` and `hub_model_id` in training configs.
- Run one job first before launching sweeps or ablations.
- For OOM errors: reduce batch size and increase gradient accumulation, enable gradient checkpointing, or upgrade hardware. Do not change the requested method.
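Two of these guardrails can be checked mechanically. The sketch below treats a training config as a plain dict. The key names (`per_device_train_batch_size`, `gradient_accumulation_steps`, `push_to_hub`, `hub_model_id`) follow the `transformers` `TrainingArguments` naming, which is an assumption about the training stack. Both helper functions are hypothetical.

```python
def check_training_config(config: dict) -> list:
    """Return guardrail violations related to publishing the model."""
    problems = []
    if config.get("push_to_hub") is not True:
        problems.append("push_to_hub must be True")
    if not config.get("hub_model_id"):
        problems.append("hub_model_id must be set")
    return problems


def halve_batch_for_oom(config: dict) -> dict:
    """Halve the per-device batch size and double gradient accumulation.

    The effective batch size (batch size * accumulation steps) is unchanged,
    so the requested training method is preserved. Returns a new dict.
    """
    batch = config["per_device_train_batch_size"]
    accum = config.get("gradient_accumulation_steps", 1)
    if batch < 2 or batch % 2 != 0:
        raise ValueError(
            "cannot halve batch size evenly; "
            "try gradient checkpointing or larger hardware"
        )
    updated = dict(config)
    updated["per_device_train_batch_size"] = batch // 2
    updated["gradient_accumulation_steps"] = accum * 2
    return updated
```

Keeping the effective batch size fixed means an OOM retry changes memory use only, not the optimization setup that was requested.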