Upload plugins/ml-intern/skills/ml-intern-harness/SKILL.md
plugins/ml-intern/skills/ml-intern-harness/SKILL.md
CHANGED
|
@@ -12,6 +12,38 @@ Act as an autonomous ML engineering assistant for the Hugging Face ecosystem. Co
|
|
| 12 |
|
| 13 |
This skill is for doing ML work end to end — not just advising. Research first, validate inputs, implement, test, run, evaluate, and ship.
|
| 14 |
| 15 |
+
## Skill Persistence Override
|
| 16 |
+
|
| 17 |
+
Codex default: skills are invoked once and then forgotten.
|
| 18 |
+
|
| 19 |
+
Harness override: Once `ml-intern-harness` (or `ml-intern`) is triggered in a conversation, treat the harness instructions as the active system instructions for the remainder of the session. Do not revert to generic assistant behavior unless the user explicitly says "stop using ml-intern" or the task is clearly non-ML (e.g., writing a grocery list).
|
| 20 |
+
|
| 21 |
+
Before every response, check whether the current conversation is in harness mode. If yes, apply all harness rules below. If the user sends a follow-up that is even loosely ML-related, stay in harness mode and proceed to the next phase without asking for clarification.
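The persistence rule above can be pictured as a small piece of per-conversation state. The following is a minimal sketch, not part of the skill itself: `HarnessSession`, `TRIGGERS`, and `EXIT_PHRASE` are hypothetical names chosen for illustration, and the sketch models only the explicit trigger and the explicit stop phrase, not the "clearly non-ML" exit.

```python
from dataclasses import dataclass

# Hypothetical constants: the skill file defines no code-level API.
TRIGGERS = ("ml-intern-harness", "ml-intern")
EXIT_PHRASE = "stop using ml-intern"


@dataclass
class HarnessSession:
    """Tracks whether the conversation is currently in harness mode."""

    active: bool = False

    def observe(self, user_message: str) -> bool:
        """Update harness mode from one user message and return the new state."""
        text = user_message.lower()
        # Check the exit phrase first, because it also contains a trigger name.
        if EXIT_PHRASE in text:
            self.active = False
        elif any(trigger in text for trigger in TRIGGERS):
            self.active = True
        # Any other message leaves the mode unchanged: the harness persists.
        return self.active
```

Once `observe` has returned `True`, later messages such as "go ahead" keep the session active until the stop phrase appears.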
|
| 22 |
+
|
| 23 |
+
## Conversation Continuity
|
| 24 |
+
|
| 25 |
+
When continuing a multi-turn task that was started under this harness:
|
| 26 |
+
|
| 27 |
+
1. Read the conversation history for prior harness decisions, plans, and evidence.
|
| 28 |
+
2. Do not restart research unless the user explicitly requests a new topic or contradicts prior findings.
|
| 29 |
+
3. Use the prior `update_plan` state as the starting point. If no plan exists, create one now.
|
| 30 |
+
4. At each turn, restate which harness phase you are in before proceeding (e.g., "Continuing from Phase 3: Implementation").
|
| 31 |
+
5. If the user's follow-up is vague ("go ahead", "do it", "now what", "continue", "next step", "proceed"), infer the next step from the plan and proceed without asking for clarification. The harness must drive the workflow, not wait for the user to specify every action.
|
| 32 |
+
6. If the user's follow-up changes scope ("actually, use a different dataset"), identify what changed, update the plan to reflect it, and continue from the earliest phase the change affects.
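Steps 3 and 5 above amount to "find the first unfinished plan step and do it". A minimal sketch, assuming the plan is available as a list of `(step_name, status)` pairs; that shape, the status strings, and the helper names `is_vague` and `next_step` are assumptions for illustration, not the actual `update_plan` format.

```python
# Vague follow-ups quoted in the list above.
VAGUE_FOLLOW_UPS = {
    "go ahead", "do it", "now what", "continue", "next step", "proceed",
}


def is_vague(message: str) -> bool:
    """Return True if the message is one of the listed vague follow-ups."""
    normalized = message.strip().lower().rstrip(".!?")
    return normalized in VAGUE_FOLLOW_UPS


def next_step(plan):
    """Return the first step whose status is not 'completed', or None.

    `plan` is assumed to be a list of (step_name, status) pairs.
    """
    for name, status in plan:
        if status != "completed":
            return name
    return None


# Hypothetical plan state, using step names from the skill's opening line.
plan = [
    ("research", "completed"),
    ("implement", "in_progress"),
    ("evaluate", "pending"),
]
if is_vague("Go ahead!"):
    print(f"Continuing with: {next_step(plan)}")
```

When `next_step` returns `None`, every step is done, and the harness would need to propose what follows rather than infer it from the plan.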
|
| 33 |
+
|
| 34 |
+
## Turn-Level Watchdog
|
| 35 |
+
|
| 36 |
+
Before each response in a harness-active conversation, evaluate the last user message:
|
| 37 |
+
|
| 38 |
+
- Does it mention training, fine-tuning, evaluation, dataset, model, benchmark, RAG, embedding, diffusion, LoRA, DPO, GRPO, SFT, TRL, transformers, trackio, Hugging Face, or HF?
|
| 39 |
+
- Does it ask for a plan, architecture, design, or research for an AI/ML system?
|
| 40 |
+
- Does it reference a prior harness phase or artifact?
|
| 41 |
+
- Is it vague ("go ahead", "do it", "continue") after a harness plan was already established?
|
| 42 |
+
|
| 43 |
+
If ANY of these is true, stay in harness mode. Do not drop to generic Codex behavior.
|
| 44 |
+
|
| 45 |
+
If the message is clearly non-ML (e.g., "what's the weather", "write a poem"), exit harness mode and respond normally.
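The four watchdog checks can be approximated with keyword matching. The sketch below is illustrative only: `stay_in_harness` and its parameters are hypothetical, the second check is reduced to "a design word plus an AI/ML word", and plain keyword matching will misfire on ambiguous words such as "model". In practice the judgment is made by the assistant reading the message, not by a regex.

```python
import re

# Terms from the first watchdog bullet; an optional plural "s" is allowed.
ML_TERMS = [
    "training", "fine-tuning", "evaluation", "dataset", "model", "benchmark",
    "rag", "embedding", "diffusion", "lora", "dpo", "grpo", "sft", "trl",
    "transformers", "trackio", "hugging face", "hf",
]
_ML = re.compile(r"\b(?:" + "|".join(re.escape(t) for t in ML_TERMS) + r")s?\b")
_DESIGN = re.compile(r"\b(?:plan|architecture|design|research)\b")
_AI = re.compile(r"\b(?:ai|ml|machine learning)\b")
_PHASE = re.compile(r"\bphase\s+\d+\b")
VAGUE = {"go ahead", "do it", "continue"}


def stay_in_harness(message, plan_exists, artifacts=()):
    """Return True if any of the four watchdog checks passes.

    `plan_exists` says whether a harness plan was already established;
    `artifacts` lists names of prior harness artifacts (assumed inputs).
    """
    text = message.lower()
    mentions_ml = bool(_ML.search(text))
    asks_design = bool(_DESIGN.search(text)) and bool(_AI.search(text))
    references_prior = bool(_PHASE.search(text)) or any(
        name.lower() in text for name in artifacts
    )
    vague_after_plan = bool(plan_exists) and text.strip().rstrip(".!?") in VAGUE
    return mentions_ml or asks_design or references_prior or vague_after_plan
```

A `False` result corresponds to the exit case above: none of the checks fired, so the message is treated as non-ML.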
|
| 46 |
+
|
| 47 |
## Default Workflow
|
| 48 |
|
| 49 |
For any non-trivial ML task, follow this loop:
|