---
tags:
- ml-intern
---
# ML Intern Plugin for OpenAI Codex

Hugging Face ML Intern reimagined as an OpenAI Codex plugin. Research papers, inspect datasets and models, plan and evaluate AI/RAG systems, run training and evaluation on Hugging Face Jobs, and ship ML artifacts, all inside Codex.
|
|
| ## What This Is |
|
|
| This repository replicates the core functionality of [`huggingface/mlintern-plugin`](https://github.com/huggingface/mlintern-plugin) (a Claude Code plugin) for the **OpenAI Codex ecosystem**. It is a Codex plugin, not a Claude Code plugin, using the Codex Skills and Plugin format. |
|
|
| The original `mlintern-plugin` wraps the `ml-intern` CLI binary inside Claude Code with slash commands. This Codex plugin instead uses: |
| - **Skills** (`SKILL.md` files) that teach Codex how to use Hugging Face tools |
| - **Commands** (`commands/run.md`) for `/mlintern:run` task execution |
| - **Scripts** for operations the HF plugin doesn't directly expose (dataset inspection, paper research) |
| - The **Hugging Face Codex plugin** (`hugging-face` MCP) for direct API tools (`model_search`, `dataset_search`, `paper_search`, `hub_repo_details`, `hf_doc_search`, `hf_doc_fetch`, `hf_jobs`) |
|
|
| ## Plugin Structure |
|
|
| - `./.agents/plugins/marketplace.json` - Repo marketplace for Codex |
| - `./plugins/ml-intern/.codex-plugin/plugin.json` - Plugin manifest |
| - `./plugins/ml-intern/agents/openai.yaml` - UI metadata for Codex |
| - `./plugins/ml-intern/commands/run.md` - `/mlintern:run` command definition |
| - `./plugins/ml-intern/skills/ml-intern-harness/` - Core ML Intern behavior |
| - `./plugins/ml-intern/skills/hf-model-search/` - Model discovery and validation |
| - `./plugins/ml-intern/skills/hf-dataset-search/` - Dataset discovery and schema inspection |
| - `./plugins/ml-intern/skills/hf-paper-search/` - Paper research, reading, citations |
| - `./plugins/ml-intern/skills/hf-docs/` - Hugging Face library documentation lookup |
| - `./plugins/ml-intern/skills/github-example-search/` - GitHub example-file discovery |
| - `./plugins/ml-intern/skills/web-search/` - Current web search with source filtering |
| - `./plugins/ml-intern/skills/hf-jobs/` - Hugging Face cloud job submission and monitoring |
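
After cloning, the layout listed above can be sanity-checked with a short script. This is a minimal sketch and not part of the plugin: `EXPECTED_PATHS` and `missing_paths` are hypothetical names, and the paths are copied from the list above.

```python
from pathlib import Path

# Paths copied from the Plugin Structure list, relative to the repository root.
EXPECTED_PATHS = [
    ".agents/plugins/marketplace.json",
    "plugins/ml-intern/.codex-plugin/plugin.json",
    "plugins/ml-intern/agents/openai.yaml",
    "plugins/ml-intern/commands/run.md",
    "plugins/ml-intern/skills/ml-intern-harness",
    "plugins/ml-intern/skills/hf-model-search",
    "plugins/ml-intern/skills/hf-dataset-search",
    "plugins/ml-intern/skills/hf-paper-search",
    "plugins/ml-intern/skills/hf-docs",
    "plugins/ml-intern/skills/github-example-search",
    "plugins/ml-intern/skills/web-search",
    "plugins/ml-intern/skills/hf-jobs",
]


def missing_paths(repo_root="."):
    """Return the expected files or directories that are absent under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing from this checkout:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All expected plugin paths are present.")
```

Run it from the repository root; an empty result means the checkout matches the structure described here.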
|
|
| ## Installation |
|
|
| This plugin is hosted on GitHub, and the easiest local install is to clone the repo and let Codex load the plugin from the local checkout. |
|
|
| ### Method 1: Clean Codex UI Install |
|
|
| This is the recommended path if you want a clean install inside the Codex UI without copying files into a global plugin directory. |
|
|
| 1. Clone this repo somewhere local that Codex can read: |
|
|
| ```bash |
| git clone https://github.com/razvan/ml-intern-codex-plugin.git |
| cd ml-intern-codex-plugin |
| ``` |
|
|
| 2. Restart Codex so it reloads the repo marketplace. |
|
|
| 3. Use the repository marketplace entry in this repo. The marketplace file is: |
|
|
| `./.agents/plugins/marketplace.json` |
|
|
| 4. The marketplace points Codex at the plugin bundle here: |
|
|
| `./plugins/ml-intern` |
|
|
| After Codex reloads, the plugin should appear in the Codex UI as **ML Intern for Codex**. |
|
|
| ### Method 2: Manual Local Install |
|
|
| If you want to install it into your local Codex plugin directory manually: |
|
|
| 1. Clone the repository locally: |
|
|
| ```bash |
| git clone https://github.com/razvan/ml-intern-codex-plugin.git |
| cd ml-intern-codex-plugin |
| ``` |
|
|
| 2. Copy the plugin bundle into your Codex plugins directory: |
|
|
| ```bash |
| cp -R plugins/ml-intern ~/.codex/plugins/ml-intern |
| ``` |
|
|
| 3. Make sure Codex can see the plugin via a local marketplace entry or by reloading the Codex UI, depending on how your Codex setup is configured. |
|
|
| 4. Restart Codex. |
|
|
| 5. Look for **ML Intern for Codex** in the plugin list. |
|
|
| ## Dependencies |
|
|
| - OpenAI Codex (with the `hugging-face` MCP plugin enabled) |
| - `HF_TOKEN` environment variable for private/gated resources and Jobs |
| - Python 3.10+ for the helper scripts (dataset inspection, paper research) |
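
The two locally checkable dependencies above (Python version and `HF_TOKEN`) can be verified before running the helper scripts. The following is a minimal sketch under stated assumptions: `preflight` and `MIN_PYTHON` are hypothetical names, and the check only tests that `HF_TOKEN` is set, not that it is valid.

```python
import os
import sys

# Minimum interpreter version stated in the Dependencies list.
MIN_PYTHON = (3, 10)


def preflight(env=None, version=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found; the helper scripts need 3.10+"
        )
    # HF_TOKEN is only required for private/gated resources and Jobs,
    # so callers may choose to treat this entry as a warning.
    if not env.get("HF_TOKEN"):
        problems.append("HF_TOKEN is not set (needed for private/gated resources and Jobs)")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(f"warning: {problem}")
```

The `env` and `version` parameters exist only so the check can be exercised without touching the real environment.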
|
|
| ## Usage |
|
|
| Once installed, invoke the plugin in Codex by typing: |
|
|
| ``` |
| /mlintern:run "fine-tune Qwen3-4B for code completion on my dataset" |
| ``` |
|
|
| Or use the skill name in your prompt: |
|
|
| ``` |
| Use ml-intern-harness to research DPO training recipes, find a suitable dataset, and implement a training script. |
| ``` |
|
|
| The bundled skills are the main entry points: |
|
|
| - `ml-intern-harness` for the end-to-end research, validation, implementation, and job loop, including plan-only AI/RAG/search/QA system design |
| - `ml-intern` as the short alias for the main plugin workflow |
| - `hf-model-search` for model discovery and validation |
| - `hf-dataset-search` for dataset discovery and schema inspection |
| - `hf-paper-search` for paper research and recipe extraction |
| - `hf-docs` for current Hugging Face library docs |
| - `github-example-search` for finding working example files in GitHub repos |
| - `web-search` for current web sources and general research outside the Hugging Face Hub |
| - `hf-jobs` for Hugging Face job submission and monitoring |
|
|
| ## Skills Reference |
|
|
| | Skill | Purpose | Key Tools | |
| |---|---|---| |
| | `ml-intern-harness` | Core autonomous ML workflow | Research, validate, implement, test, run, evaluate, ship | |
| | `hf-model-search` | Find and validate models | `_model_search`, `_hub_repo_details` | |
| | `hf-dataset-search` | Find and validate datasets | `_dataset_search`, `_hub_repo_details`, `inspect_dataset.py` | |
| | `hf-paper-search` | Research papers and extract recipes | `_paper_search`, `papers.py` (details, citations, resources) | |
| | `hf-docs` | Look up current HF library APIs | `_hf_doc_search`, `_hf_doc_fetch` | |
| | `github-example-search` | Find working example files in GitHub repos | GitHub repo/file search plus `fetch_file` | |
| | `web-search` | Find current web sources with filters | Codex web browsing/search tools and citation links | |
| | `hf-jobs` | Submit and monitor cloud jobs | `_hf_jobs` (run, uv, ps, logs, inspect, cancel) | |
|
|
| ## Comparison to Original |
|
|
| | Feature | `huggingface/mlintern-plugin` (Claude Code) | This Codex Plugin | |
| |---|---|---| |
| | Platform | Claude Code | OpenAI Codex | |
| | Format | `.claude-plugin` manifest + companion script | `.codex-plugin` manifest + Skills | |
| | Interaction | Slash commands (`/mlintern:run`) | Slash commands + skill invocation | |
| | Agent Runtime | Spawns `ml-intern` CLI binary | Uses Codex's native agent loop + Skills | |
| | Paper Research | Built into `ml-intern` binary | `papers.py` script shim | |
| | Dataset Inspection | Built into `ml-intern` binary | `inspect_dataset.py` script shim | |
| | Job Submission | Built into `ml-intern` binary | `_hf_jobs` via Hugging Face Codex plugin | |
| | Sandbox | HF Space sandboxes | Codex local shell + `_hf_jobs` | |
|
|
| ## Parity Status |
|
|
| This plugin intentionally mirrors the parts of ML Intern that matter most for HF ML work: |
|
|
| | Area | Status | Notes | |
| |---|---|---| |
| | Paper discovery | Done | HF paper search plus the deeper `papers.py` research flow is available. | |
| | Paper reading | Done | Section reading, citations, recommendations, and linked resources are implemented. | |
| | Dataset validation | Done | Schema, splits, sample rows, parquet availability, and compatibility notes are covered. | |
| | HF docs lookup | Done | Search plus fetch are available for current HF library guidance. | |
| | End-to-end ML workflow | Done | The harness pushes research-first, validate-first behavior. | |
| | Generic web search | Partial | Best-effort Codex `web-search` guidance exists, but it is not the exact ML Intern DuckDuckGo wrapper. | |
|
|
| For the closest ML Intern feel, use the paper and dataset skills first, then docs, then the harness workflow. |
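
To make the dataset validation row above concrete, the splits, sample rows, and parquet checks can be expressed as read-only queries. The sketch below only builds URLs and assumes the public Hugging Face Dataset Viewer API (`/splits`, `/first-rows`, `/parquet`). It is not necessarily how `inspect_dataset.py` works; `viewer_urls` and the `user/dataset` name are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL of the public Dataset Viewer API.
VIEWER_API = "https://datasets-server.huggingface.co"


def viewer_urls(dataset, config=None, split=None):
    """Build read-only Dataset Viewer URLs for one dataset (no network calls)."""
    base_query = urlencode({"dataset": dataset})
    urls = {
        "splits": f"{VIEWER_API}/splits?{base_query}",    # configs and splits
        "parquet": f"{VIEWER_API}/parquet?{base_query}",  # parquet availability
    }
    # Sample rows need a concrete config and split, so add that URL only when both are given.
    if config and split:
        row_query = urlencode({"dataset": dataset, "config": config, "split": split})
        urls["first_rows"] = f"{VIEWER_API}/first-rows?{row_query}"
    return urls


if __name__ == "__main__":
    for name, url in viewer_urls("user/dataset", "default", "train").items():
        print(f"{name}: {url}")
```

Fetching these URLs is left to the caller; gated or private datasets would additionally need an `Authorization` header carrying `HF_TOKEN`.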
|
|
| ## Behavioral Contract |
|
|
| When ML Intern is invoked directly, the plugin should behave like a research harness rather than a generic assistant: |
|
|
| - Route through `ml-intern` -> `ml-intern-harness` for non-trivial AI/ML/RAG/search/evaluation tasks, even when they are not purely Hugging Face tasks. |
| - Use plan tracking at the beginning, each phase transition, and completion when Codex exposes a planning tool. |
| - Split research into explicit tracks before synthesis, such as platform constraints, technical approaches, and evaluation methodology. |
| - Use `hf-paper-search` for literature, benchmarks, and evaluation methods. |
| - Use `web-search` for current platform/API constraints, official docs, policies, pricing, rate limits, SDK behavior, and other non-HF facts. |
| - Cite important architecture and evaluation decisions with papers, official docs, or primary sources. |
| - If the user asks for "plan only", stop after research and do not write implementation code or scaffold files. |
| - If a write, shell, network, or sandbox step fails, fail forward with an inline deliverable when possible. |
|
|
| ### Codex Compatibility Layer |
|
|
| This plugin cannot inject upstream Python tools into Codex directly, so it should reproduce the behavior of the upstream tools using Codex-native primitives: |
|
|
| | Upstream tool | Codex compatibility layer | Required behavior to preserve | |
| |---|---|---| |
| | `plan_tool` | `update_plan` | Full-plan replacement, one `in_progress` item, updates at start/phase transition/completion | |
| | `research` | delegated sub-agent research when explicitly allowed, otherwise focused sequential research | Separate research context when possible, read-only scope, papers-first workflow, compact evidence-backed summary | |
|
|
| Faithful `plan_tool` semantics to preserve: |
| - Use it for tasks with 3 or more meaningful steps. |
| - Replace the whole visible plan each update. |
| - Keep exactly one item in progress. |
| - Mark completed only after full success. |
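
The `plan_tool` rules above can be written down as a small validator. This is an illustrative sketch only, not the upstream `plan_tool` or the Codex `update_plan` implementation. `PlanItem`, `needs_plan`, `validate_plan`, and `replace_plan` are hypothetical names; only the `in_progress` status appears in this README, so `pending` and `completed` are assumed, as is the rule that a fully completed plan has no active item.

```python
from dataclasses import dataclass

# Only `in_progress` is named in this README; the other two names are assumptions.
STATUSES = ("pending", "in_progress", "completed")
MIN_STEPS_FOR_PLAN = 3  # "tasks with 3 or more meaningful steps"


@dataclass(frozen=True)
class PlanItem:
    step: str
    status: str = "pending"


def needs_plan(steps):
    """A plan is warranted once a task has at least three meaningful steps."""
    return len(steps) >= MIN_STEPS_FOR_PLAN


def validate_plan(items):
    """Return a list of rule violations; an empty list means the plan is valid."""
    problems = [
        f"unknown status {item.status!r} for step {item.step!r}"
        for item in items
        if item.status not in STATUSES
    ]
    active = sum(1 for item in items if item.status == "in_progress")
    finished = bool(items) and all(item.status == "completed" for item in items)
    # Assumption: a fully completed plan has no active item; otherwise exactly one.
    if not finished and active != 1:
        problems.append(f"expected exactly one in_progress item, found {active}")
    return problems


def replace_plan(new_items):
    """Full-plan replacement: return a fresh list that supersedes any earlier plan."""
    problems = validate_plan(new_items)
    if problems:
        raise ValueError("; ".join(problems))
    return list(new_items)
```

For example, `replace_plan([PlanItem("research", "completed"), PlanItem("validate", "in_progress"), PlanItem("implement")])` is accepted, while a plan with two `in_progress` items raises `ValueError`.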
|
|
| Faithful `research` semantics to preserve: |
| - Main context stays focused on synthesis and decisions. |
| - Research uses papers, citation graphs, dataset inspection, docs, GitHub examples, and web search. |
| - Research returns compact findings with concrete references and recipe-level claims. |
| - If separate delegation is unavailable, preserve the same research floor directly rather than skipping it. |
|
|
| Example plan-only trigger: |
|
|
| ```text |
| [@ml-intern](plugin://ml-intern@ml-intern-codex) |
| i want to query generic Discord servers in natural language. |
| first figure out constraints and challenges, then research how to build and test quality. |
| i'm only interested in the plan for now. |
| ``` |
|
|
| Expected behavior: track a plan, use `web-search` for Discord API constraints, use `hf-paper-search` for RAG/forum/social QA and evaluation research, synthesize a cited build-and-test plan, and avoid implementation. |
|
|
| ## Why This Exists |
|
|
| The original `huggingface/mlintern-plugin` is **Claude Code only** — it's a companion script that spawns the `ml-intern` CLI inside Claude Code sessions. There is no equivalent for Codex. The `huggingface/skills` repo provides general HF skills but not the full ML Intern harness. This plugin bridges the gap. |
|
|
| ## Contributing |
|
|
| This is an early version. The biggest improvement would be testing the skill instructions in real Codex sessions and tightening the guardrails where the LLM deviates. PRs welcome. |
|
|
| ## License |
|
|
| Apache-2.0 |
|
|
| <!-- ml-intern-provenance --> |
| ## Generated by ML Intern |
|
|
| This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub. |
|
|
| - Try ML Intern: https://smolagents-ml-intern.hf.space |
| - Source code: https://github.com/huggingface/ml-intern |
|
|