Upload README.md with huggingface_hub
README.md CHANGED
@@ -1,26 +1,107 @@
-
-tags:
-- ml-intern
----

-

-
-## Generated by ML Intern

-This

-
--

 ## Usage

-
-from transformers import AutoModelForCausalLM, AutoTokenizer

-model_id = 'razvan/ml-intern-codex-plugin'
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id)
 ```

-
+# ML Intern Plugin for OpenAI Codex

+Hugging Face ML Intern reimagined as an OpenAI Codex plugin. Research papers, inspect datasets and models, run training and evaluation on Hugging Face Jobs, and ship ML artifacts, all inside Codex.

+## What This Is

+This repository replicates the core functionality of [`huggingface/mlintern-plugin`](https://github.com/huggingface/mlintern-plugin) (a Claude Code plugin) for the **OpenAI Codex ecosystem**. It is a Codex plugin, not a Claude Code plugin, using the Codex Skills and Plugin format.

+The original `mlintern-plugin` wraps the `ml-intern` CLI binary inside Claude Code with slash commands. This Codex plugin instead uses:
+- **Skills** (`SKILL.md` files) that teach Codex how to use Hugging Face tools
+- **Commands** (`commands/run.md`) for `/mlintern:run` task execution
+- **Scripts** for operations the HF plugin doesn't directly expose (dataset inspection, paper research)
+- The **Hugging Face Codex plugin** (`hugging-face` MCP) for direct API tools (`model_search`, `dataset_search`, `paper_search`, `hub_repo_details`, `hf_doc_search`, `hf_doc_fetch`, `hf_jobs`)
+
+## Plugin Structure
+
+```
+plugins/mlintern/
+├── .codex-plugin/
+│   └── plugin.json          ← Plugin manifest
+├── agents/
+│   └── openai.yaml          ← UI metadata for Codex
+├── commands/
+│   └── run.md               ← /mlintern:run command definition
+├── skills/
+│   ├── ml-intern-harness/   ← Core ML Intern behavior (research, validate, implement, ship)
+│   ├── hf-model-search/     ← Model discovery and validation
+│   ├── hf-dataset-search/   ← Dataset discovery and schema inspection
+│   ├── hf-paper-search/     ← Paper research (search, read, citations)
+│   ├── hf-docs/             ← HF library documentation lookup
+│   └── hf-jobs/             ← Hugging Face cloud job submission and monitoring
+└── assets/                  ← Icon, logo, screenshots
+```
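
As a quick sanity check on a checkout, the layout in the tree above could be verified with a few lines of Python. This is only a sketch: `EXPECTED_PATHS` and `missing_paths` are hypothetical names introduced here, and the path list is copied from the tree, not read from `plugin.json`.

```python
from pathlib import Path

# Entries copied from the README tree, relative to plugins/mlintern/.
# Assumption: the tree is complete; nothing here is read from plugin.json.
EXPECTED_PATHS = [
    ".codex-plugin/plugin.json",
    "agents/openai.yaml",
    "commands/run.md",
    "skills/ml-intern-harness",
    "skills/hf-model-search",
    "skills/hf-dataset-search",
    "skills/hf-paper-search",
    "skills/hf-docs",
    "skills/hf-jobs",
    "assets",
]


def missing_paths(plugin_root: Path) -> list[str]:
    """Return the expected entries that do not exist under plugin_root."""
    return [rel for rel in EXPECTED_PATHS if not (plugin_root / rel).exists()]
```

Calling `missing_paths(Path("plugins/mlintern"))` from the repository root returns an empty list when every listed file and directory is present.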
+
+## Installation
+
+### Method 1: Local Install (Recommended for Development)
+
+Clone this repo and link it into your Codex plugins directory:
+
+```bash
+git clone https://github.com/razvan/ml-intern-codex-plugin.git
+cd ml-intern-codex-plugin
+# Link or copy to Codex plugins directory
+mkdir -p ~/.codex/plugins/
+ln -s "$(pwd)/plugins/mlintern" ~/.codex/plugins/mlintern
+```
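
The link step above can also be sketched in Python, which may be handy where a shell is not available. The function name `link_plugin` is hypothetical, and the `~/.codex/plugins` location is taken from the README as-is, not verified against Codex documentation.

```python
from pathlib import Path


def link_plugin(repo_root: Path, plugins_dir: Path) -> Path:
    """Symlink <repo_root>/plugins/mlintern into plugins_dir, like `ln -s`."""
    source = (repo_root / "plugins" / "mlintern").resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"plugin directory not found: {source}")
    plugins_dir.mkdir(parents=True, exist_ok=True)  # equivalent of `mkdir -p`
    target = plugins_dir / "mlintern"
    # Unlike a bare `ln -s`, refuse to touch anything already at the target.
    if target.is_symlink() or target.exists():
        raise FileExistsError(f"refusing to overwrite existing {target}")
    target.symlink_to(source, target_is_directory=True)
    return target
```

From the cloned repository this would be called as `link_plugin(Path.cwd(), Path.home() / ".codex" / "plugins")`. The source path is resolved to an absolute path first, so the link keeps working regardless of the directory it is later read from.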
+
+### Method 2: Marketplace (Future)
+
+Once OpenAI launches a public plugin marketplace, this plugin can be registered via `marketplace.json`.
+
+## Dependencies
+
+- OpenAI Codex (with the `hugging-face` MCP plugin enabled)
+- `HF_TOKEN` environment variable for private/gated resources and Jobs
+- Python 3.10+ for the helper scripts (dataset inspection, paper research)
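
Two of the dependencies listed above can be checked mechanically before running the helper scripts. The sketch below is an illustration only: `preflight` is a hypothetical helper that is not part of the plugin, and it cannot tell whether Codex or the `hugging-face` MCP plugin is installed or enabled.

```python
import os
import sys


def preflight(env=None, version=None) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    if version < (3, 10):
        problems.append(
            f"Python 3.10+ required for helper scripts, found {version[0]}.{version[1]}"
        )
    if not env.get("HF_TOKEN"):
        problems.append(
            "HF_TOKEN is not set (needed for private/gated resources and Jobs)"
        )
    return problems
```

Calling `preflight()` with no arguments inspects the current interpreter and environment. The parameters exist so the check can be exercised without touching the real environment.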

 ## Usage

+Once installed, invoke the plugin in Codex by typing:

 ```
+/mlintern:run "fine-tune Qwen3-4B for code completion on my dataset"
+```
+
+Or use the skill name in your prompt:
+
+```
+Use ml-intern-harness to research DPO training recipes, find a suitable dataset, and implement a training script.
+```
+
+## Skills Reference
+
+| Skill | Purpose | Key Tools |
+|---|---|---|
+| `ml-intern-harness` | Core autonomous ML workflow | Research, validate, implement, test, run, evaluate, ship |
+| `hf-model-search` | Find and validate models | `_model_search`, `_hub_repo_details` |
+| `hf-dataset-search` | Find and validate datasets | `_dataset_search`, `_hub_repo_details`, `inspect_dataset.py` |
+| `hf-paper-search` | Research papers and extract recipes | `_paper_search`, `papers.py` (details, citations, resources) |
+| `hf-docs` | Look up current HF library APIs | `_hf_doc_search`, `_hf_doc_fetch` |
+| `hf-jobs` | Submit and monitor cloud jobs | `_hf_jobs` (run, uv, ps, logs, inspect, cancel) |
+
+## Comparison to Original
+
+| Feature | `huggingface/mlintern-plugin` (Claude Code) | This Codex Plugin |
+|---|---|---|
+| Platform | Claude Code | OpenAI Codex |
+| Format | `.claude-plugin` manifest + companion script | `.codex-plugin` manifest + Skills |
+| Interaction | Slash commands (`/mlintern:run`) | Slash commands + skill invocation |
+| Agent Runtime | Spawns `ml-intern` CLI binary | Uses Codex's native agent loop + Skills |
+| Paper Research | Built into `ml-intern` binary | `papers.py` script shim |
+| Dataset Inspection | Built into `ml-intern` binary | `inspect_dataset.py` script shim |
+| Job Submission | Built into `ml-intern` binary | `_hf_jobs` via Hugging Face Codex plugin |
+| Sandbox | HF Space sandboxes | Codex local shell + `_hf_jobs` |
+
+## Why This Exists
+
+The original `huggingface/mlintern-plugin` is **Claude Code only**: it's a companion script that spawns the `ml-intern` CLI inside Claude Code sessions. There is no equivalent for Codex. The `huggingface/skills` repo provides general HF skills but not the full ML Intern harness. This plugin bridges the gap.
+
+## Contributing
+
+This is an early version. The biggest improvement would be testing the skill instructions in real Codex sessions and tightening the guardrails where the LLM deviates. PRs welcome.
+
+## License

+Apache-2.0