ML Intern Plugin for OpenAI Codex

Hugging Face ML Intern reimagined as an OpenAI Codex plugin. Research papers, inspect datasets and models, plan and evaluate AI/RAG systems, run training and evaluation on Hugging Face Jobs, and ship ML artifacts — all inside Codex.

What This Is

This repository reimplements the core functionality of huggingface/mlintern-plugin, a Claude Code plugin, for the OpenAI Codex ecosystem. It is a Codex plugin built on the Codex Skills and Plugin format; it does not run in Claude Code.

The original mlintern-plugin wraps the ml-intern CLI binary inside Claude Code with slash commands. This Codex plugin instead uses:

Skills (SKILL.md files) that teach Codex how to use Hugging Face tools
Commands (commands/run.md) for /mlintern:run task execution
Scripts for operations the HF plugin doesn't directly expose (dataset inspection, paper research)
The Hugging Face Codex plugin (hugging-face MCP) for direct API tools (model_search, dataset_search, paper_search, hub_repo_details, hf_doc_search, hf_doc_fetch, hf_jobs)

Plugin Structure

./.agents/plugins/marketplace.json - Repo marketplace for Codex
./plugins/ml-intern/.codex-plugin/plugin.json - Plugin manifest
./plugins/ml-intern/agents/openai.yaml - UI metadata for Codex
./plugins/ml-intern/commands/run.md - /mlintern:run command definition
./plugins/ml-intern/skills/ml-intern-harness/ - Core ML Intern behavior
./plugins/ml-intern/skills/hf-model-search/ - Model discovery and validation
./plugins/ml-intern/skills/hf-dataset-search/ - Dataset discovery and schema inspection
./plugins/ml-intern/skills/hf-paper-search/ - Paper research, reading, citations
./plugins/ml-intern/skills/hf-docs/ - Hugging Face library documentation lookup
./plugins/ml-intern/skills/github-example-search/ - GitHub example-file discovery
./plugins/ml-intern/skills/web-search/ - Current web search with source filtering
./plugins/ml-intern/skills/hf-jobs/ - Hugging Face cloud job submission and monitoring

Installation

This plugin is hosted on GitHub, and the easiest local install is to clone the repo and let Codex load the plugin from the local checkout.

Method 1: Clean Codex UI Install

This is the recommended path if you want a clean install inside the Codex UI without copying files into a global plugin directory.

Clone this repo somewhere local that Codex can read:

git clone https://github.com/razvan/ml-intern-codex-plugin.git
cd ml-intern-codex-plugin

Restart Codex so it reloads the repo marketplace.
Use the marketplace entry that ships with this repo. The marketplace file is:

./.agents/plugins/marketplace.json

The marketplace points Codex at the plugin bundle here:

./plugins/ml-intern

After Codex reloads, the plugin should appear in the Codex UI as ML Intern for Codex.

Method 2: Manual Local Install

If you want to install it into your local Codex plugin directory manually:

Clone the repository locally:

git clone https://github.com/razvan/ml-intern-codex-plugin.git
cd ml-intern-codex-plugin

Copy the plugin bundle into your Codex plugins directory:

cp -R plugins/ml-intern ~/.codex/plugins/ml-intern

Make sure Codex can see the plugin via a local marketplace entry or by reloading the Codex UI, depending on how your Codex setup is configured.
Restart Codex.
Look for ML Intern for Codex in the plugin list.

Dependencies

OpenAI Codex (with the hugging-face MCP plugin enabled)
HF_TOKEN environment variable for private/gated resources and Jobs
Python 3.10+ for the helper scripts (dataset inspection, paper research)
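
The two machine-checkable dependencies above (Python version and `HF_TOKEN`) can be tested before a session starts. The following is a small sketch under the assumption that a plain environment-variable check is enough; `check_environment` is a hypothetical helper, not a script shipped with the plugin, and it does not verify that the token is valid.

```python
import os
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the Dependencies list


def check_environment(env=None, version=None) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else tuple(version)
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found; the helper scripts need 3.10+"
        )
    if not env.get("HF_TOKEN"):
        problems.append(
            "HF_TOKEN is not set; private/gated resources and Jobs will be unavailable"
        )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(f"warning: {problem}")
```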

Usage

Once installed, invoke the plugin in Codex by typing:

/mlintern:run "fine-tune Qwen3-4B for code completion on my dataset"

Or use the skill name in your prompt:

Use ml-intern-harness to research DPO training recipes, find a suitable dataset, and implement a training script.

The bundled skills are the main entry points:

ml-intern-harness for the end-to-end research, validation, implementation, and job loop, including plan-only AI/RAG/search/QA system design
ml-intern as the short alias for the main plugin workflow
hf-model-search for model discovery and validation
hf-dataset-search for dataset discovery and schema inspection
hf-paper-search for paper research and recipe extraction
hf-docs for current Hugging Face library docs
github-example-search for finding working example files in GitHub repos
web-search for current web sources and general research outside the Hugging Face Hub
hf-jobs for Hugging Face job submission and monitoring

Skills Reference

| Skill | Purpose | Key Tools |
| --- | --- | --- |
| `ml-intern-harness` | Core autonomous ML workflow | Research, validate, implement, test, run, evaluate, ship |
| `hf-model-search` | Find and validate models | `_model_search`, `_hub_repo_details` |
| `hf-dataset-search` | Find and validate datasets | `_dataset_search`, `_hub_repo_details`, `inspect_dataset.py` |
| `hf-paper-search` | Research papers and extract recipes | `_paper_search`, `papers.py` (details, citations, resources) |
| `hf-docs` | Look up current HF library APIs | `_hf_doc_search`, `_hf_doc_fetch` |
| `github-example-search` | Find working example files in GitHub repos | GitHub repo/file search plus `fetch_file` |
| `web-search` | Find current web sources with filters | Codex web browsing/search tools and citation links |
| `hf-jobs` | Submit and monitor cloud jobs | `_hf_jobs` (run, uv, ps, logs, inspect, cancel) |

Comparison to Original

| Feature | `huggingface/mlintern-plugin` (Claude Code) | This Codex Plugin |
| --- | --- | --- |
| Platform | Claude Code | OpenAI Codex |
| Format | `.claude-plugin` manifest + companion script | `.codex-plugin` manifest + Skills |
| Interaction | Slash commands (`/mlintern:run`) | Slash commands + skill invocation |
| Agent Runtime | Spawns `ml-intern` CLI binary | Uses Codex's native agent loop + Skills |
| Paper Research | Built into `ml-intern` binary | `papers.py` script shim |
| Dataset Inspection | Built into `ml-intern` binary | `inspect_dataset.py` script shim |
| Job Submission | Built into `ml-intern` binary | `_hf_jobs` via Hugging Face Codex plugin |
| Sandbox | HF Space sandboxes | Codex local shell + `_hf_jobs` |

Parity Status

This plugin intentionally mirrors the parts of ML Intern that matter most for HF ML work:

| Area | Status | Notes |
| --- | --- | --- |
| Paper discovery | Done | HF paper search plus the deeper `papers.py` research flow is available. |
| Paper reading | Done | Section reading, citations, recommendations, and linked resources are implemented. |
| Dataset validation | Done | Schema, splits, sample rows, parquet availability, and compatibility notes are covered. |
| HF docs lookup | Done | Search plus fetch are available for current HF library guidance. |
| End-to-end ML workflow | Done | The harness pushes research-first, validate-first behavior. |
| Generic web search | Partial | Best-effort Codex `web-search` guidance exists, but it is not the exact ML Intern DuckDuckGo wrapper. |

For the closest ML Intern feel, use the paper and dataset skills first, then docs, then the harness workflow.

Behavioral Contract

When ML Intern is invoked directly, the plugin should behave like a research harness rather than a generic assistant:

Route through ml-intern -> ml-intern-harness for non-trivial AI/ML/RAG/search/evaluation tasks, even when they are not purely Hugging Face tasks.
Use plan tracking at the beginning, each phase transition, and completion when Codex exposes a planning tool.
Split research into explicit tracks before synthesis, such as platform constraints, technical approaches, and evaluation methodology.
Use hf-paper-search for literature, benchmarks, and evaluation methods.
Use web-search for current platform/API constraints, official docs, policies, pricing, rate limits, SDK behavior, and other non-HF facts.
Cite important architecture and evaluation decisions with papers, official docs, or primary sources.
If the user asks for "plan only", stop after research and do not write implementation code or scaffold files.
If a write, shell, network, or sandbox step fails, fail forward: when possible, return the deliverable inline in the response instead of stopping.
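
The routing rules above can be pictured as a small lookup plus a plan-only switch. This is an illustrative sketch only: the plugin expresses these rules as skill instructions, not code, and `TRACK_TO_SKILL`, the phase names, and `outline_run` are all hypothetical. Mapping "technical approaches" to `hf-paper-search` is also an assumption based on the literature rule above.

```python
# Hypothetical routing table: research track -> skill named in this README.
TRACK_TO_SKILL = {
    "platform constraints": "web-search",
    "technical approaches": "hf-paper-search",  # assumed: literature-driven
    "evaluation methodology": "hf-paper-search",
}

# Phase names are illustrative, not identifiers used by the plugin.
RESEARCH_PHASES = ["plan", "research", "synthesize"]
BUILD_PHASES = ["implement", "test", "run", "evaluate", "ship"]


def outline_run(tracks: list[str], plan_only: bool) -> dict:
    """Describe which skills and phases a request would go through."""
    unknown = [t for t in tracks if t not in TRACK_TO_SKILL]
    if unknown:
        raise ValueError(f"no skill mapped for tracks: {unknown}")
    phases = list(RESEARCH_PHASES)
    if not plan_only:
        # A "plan only" request stops before any implementation work.
        phases += BUILD_PHASES
    return {
        "entry": ["ml-intern", "ml-intern-harness"],
        "skills": {t: TRACK_TO_SKILL[t] for t in tracks},
        "phases": phases,
    }


if __name__ == "__main__":
    run = outline_run(["platform constraints", "evaluation methodology"], plan_only=True)
    print(run["skills"])
    print(run["phases"])
```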

Codex Compatibility Layer

This plugin cannot inject upstream Python tools into Codex directly, so it should reproduce the behavior of the upstream tools using Codex-native primitives:

| Upstream tool | Codex compatibility layer | Required behavior to preserve |
| --- | --- | --- |
| `plan_tool` | `update_plan` | Full-plan replacement, one `in_progress` item, updates at start/phase transition/completion |
| `research` | delegated sub-agent research when explicitly allowed, otherwise focused sequential research | Separate research context when possible, read-only scope, papers-first workflow, compact evidence-backed summary |

Faithful plan_tool semantics to preserve:

Use it for tasks with 3 or more meaningful steps.
Replace the whole visible plan each update.
Keep exactly one item in progress.
Mark completed only after full success.
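
The plan rules above are easy to state as checks. The sketch below is one possible encoding, not the actual `plan_tool` or `update_plan` implementation: `PlanItem`, `validate_plan`, and `replace_plan` are hypothetical names, the `pending` status is an assumed label, and treating an all-completed plan as valid with zero items in progress is an assumption about the final update.

```python
from dataclasses import dataclass

# "in_progress" and "completed" appear in this README; "pending" is an assumed name.
VALID_STATUSES = ("pending", "in_progress", "completed")


@dataclass(frozen=True)
class PlanItem:
    step: str
    status: str = "pending"


def validate_plan(items: list[PlanItem]) -> list[str]:
    """Return rule violations for a full plan; an empty list means valid."""
    errors = []
    for item in items:
        if item.status not in VALID_STATUSES:
            errors.append(f"unknown status {item.status!r} for step {item.step!r}")
    active = sum(item.status == "in_progress" for item in items)
    finished = bool(items) and all(item.status == "completed" for item in items)
    # Assumption: once every step is completed, no item needs to be in progress.
    if not finished and active != 1:
        errors.append(f"expected exactly one in_progress item, found {active}")
    return errors


def replace_plan(current: list[PlanItem], new: list[PlanItem]) -> list[PlanItem]:
    """Full-plan replacement: the new list wins; nothing from `current` is merged."""
    errors = validate_plan(new)
    if errors:
        raise ValueError("; ".join(errors))
    return list(new)
```

For example, a plan with `research` in progress and `implement` and `evaluate` pending passes validation, while a plan with two items in progress is rejected.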

Faithful research semantics to preserve:

Main context stays focused on synthesis and decisions.
Research uses papers, citation graphs, dataset inspection, docs, GitHub examples, and web search.
Research returns compact findings with concrete references and recipe-level claims.
If separate delegation is unavailable, preserve the same research floor directly rather than skipping it.
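
One way to picture "compact findings with concrete references" is a record that is dropped from the summary when it has no reference attached. This is a hypothetical sketch, not how the upstream `research` tool is implemented; `Finding`, `summarize`, and the `max_items` cap are assumptions, and the reference strings in the usage example are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    claim: str
    references: list[str] = field(default_factory=list)


def summarize(findings: list[Finding], max_items: int = 5) -> str:
    """Keep only reference-backed findings and render a short bullet summary."""
    backed = [f for f in findings if f.references]
    lines = [f"- {f.claim} [{'; '.join(f.references)}]" for f in backed[:max_items]]
    return "\n".join(lines)


if __name__ == "__main__":
    notes = [
        Finding("placeholder claim with evidence", ["placeholder-ref-1"]),
        Finding("placeholder claim without evidence"),
    ]
    print(summarize(notes))
```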

Example plan-only trigger:

[@ml-intern](plugin://ml-intern@ml-intern-codex)
i want to query generic Discord servers in natural language.
first figure out constraints and challenges, then research how to build and test quality.
i'm only interested in the plan for now.

Expected behavior: track a plan, use web-search for Discord API constraints, use hf-paper-search for RAG/forum/social QA and evaluation research, synthesize a cited build-and-test plan, and avoid implementation.

Why This Exists

The original huggingface/mlintern-plugin works only with Claude Code: it is a companion script that spawns the ml-intern CLI inside Claude Code sessions. There is no equivalent for Codex. The huggingface/skills repo provides general HF skills but not the full ML Intern harness. This plugin bridges that gap.

Contributing

This is an early version. The most valuable contribution is testing the skill instructions in real Codex sessions and tightening the guardrails wherever the model deviates from them. PRs welcome.

License

Apache-2.0

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Try ML Intern: https://smolagents-ml-intern.hf.space
Source code: https://github.com/huggingface/ml-intern
