razvan/ml-intern-codex-plugin

razvan committed 13 days ago

Commit 4b6f4c9 (verified) · 1 parent: 0ad25c6

Upload README.md with huggingface_hub

Files changed (1): README.md (+97 -16)

@@ -1,26 +1,107 @@
----
-tags:
-- ml-intern
----
-# razvan/ml-intern-codex-plugin
-<!-- ml-intern-provenance -->
-## Generated by ML Intern
-This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.
-- Try ML Intern: https://smolagents-ml-intern.hf.space
-- Source code: https://github.com/huggingface/ml-intern
 ## Usage
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-model_id = 'razvan/ml-intern-codex-plugin'
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(model_id)
 ```
-For non-causal architectures, replace `AutoModelForCausalLM` with the appropriate `AutoModel` class.

+# ML Intern Plugin for OpenAI Codex
+Hugging Face ML Intern reimagined as an OpenAI Codex plugin. Research papers, inspect datasets and models, run training and evaluation on Hugging Face Jobs, and ship ML artifacts — all inside Codex.
+## What This Is
+This repository replicates the core functionality of [`huggingface/mlintern-plugin`](https://github.com/huggingface/mlintern-plugin) (a Claude Code plugin) for the **OpenAI Codex ecosystem**. It is a Codex plugin, not a Claude Code plugin, and it uses the Codex Skills and Plugin format.
+The original `mlintern-plugin` wraps the `ml-intern` CLI binary inside Claude Code with slash commands. This Codex plugin instead uses:
+- **Skills** (`SKILL.md` files) that teach Codex how to use Hugging Face tools
+- **Commands** (`commands/run.md`) for `/mlintern:run` task execution
+- **Scripts** for operations the HF plugin doesn't directly expose (dataset inspection, paper research)
+- The **Hugging Face Codex plugin** (`hugging-face` MCP) for direct API tools (`model_search`, `dataset_search`, `paper_search`, `hub_repo_details`, `hf_doc_search`, `hf_doc_fetch`, `hf_jobs`)
+## Plugin Structure
+```
+plugins/mlintern/
+├── .codex-plugin/
+│   └── plugin.json          ← Plugin manifest
+├── agents/
+│   └── openai.yaml          ← UI metadata for Codex
+├── commands/
+│   └── run.md               ← /mlintern:run command definition
+├── skills/
+│   ├── ml-intern-harness/   ← Core ML Intern behavior (research, validate, implement, ship)
+│   ├── hf-model-search/     ← Model discovery and validation
+│   ├── hf-dataset-search/   ← Dataset discovery and schema inspection
+│   ├── hf-paper-search/     ← Paper research (search, read, citations)
+│   ├── hf-docs/             ← HF library documentation lookup
+│   └── hf-jobs/             ← Hugging Face cloud job submission and monitoring
+└── assets/                  ← Icon, logo, screenshots
+```
+## Installation
+### Method 1: Local Install (Recommended for Development)
+Clone this repo and link it into your Codex plugins directory:
+```bash
+git clone https://github.com/razvan/ml-intern-codex-plugin.git
+cd ml-intern-codex-plugin
+# Link or copy to Codex plugins directory
+mkdir -p ~/.codex/plugins/
+ln -s "$(pwd)/plugins/mlintern" ~/.codex/plugins/mlintern
+```
+### Method 2: Marketplace (Future)
+Once OpenAI launches a public plugin marketplace, this plugin can be registered via `marketplace.json`.
+## Dependencies
+- OpenAI Codex (with the `hugging-face` MCP plugin enabled)
+- `HF_TOKEN` environment variable for private/gated resources and Jobs
+- Python 3.10+ for the helper scripts (dataset inspection, paper research)
 ## Usage
+Once installed, invoke the plugin in Codex by typing:
 ```
+/mlintern:run "fine-tune Qwen3-4B for code completion on my dataset"
+```
+Or use the skill name in your prompt:
+```
+Use ml-intern-harness to research DPO training recipes, find a suitable dataset, and implement a training script.
+```
+## Skills Reference
+| Skill | Purpose | Key Tools |
+|---|---|---|
+| `ml-intern-harness` | Core autonomous ML workflow | Research, validate, implement, test, run, evaluate, ship |
+| `hf-model-search` | Find and validate models | `_model_search`, `_hub_repo_details` |
+| `hf-dataset-search` | Find and validate datasets | `_dataset_search`, `_hub_repo_details`, `inspect_dataset.py` |
+| `hf-paper-search` | Research papers and extract recipes | `_paper_search`, `papers.py` (details, citations, resources) |
+| `hf-docs` | Look up current HF library APIs | `_hf_doc_search`, `_hf_doc_fetch` |
+| `hf-jobs` | Submit and monitor cloud jobs | `_hf_jobs` (run, uv, ps, logs, inspect, cancel) |
+## Comparison to Original
+| Feature | `huggingface/mlintern-plugin` (Claude Code) | This Codex Plugin |
+|---|---|---|
+| Platform | Claude Code | OpenAI Codex |
+| Format | `.claude-plugin` manifest + companion script | `.codex-plugin` manifest + Skills |
+| Interaction | Slash commands (`/mlintern:run`) | Slash commands + skill invocation |
+| Agent Runtime | Spawns `ml-intern` CLI binary | Uses Codex's native agent loop + Skills |
+| Paper Research | Built into `ml-intern` binary | `papers.py` script shim |
+| Dataset Inspection | Built into `ml-intern` binary | `inspect_dataset.py` script shim |
+| Job Submission | Built into `ml-intern` binary | `_hf_jobs` via Hugging Face Codex plugin |
+| Sandbox | HF Space sandboxes | Codex local shell + `_hf_jobs` |
+## Why This Exists
+The original `huggingface/mlintern-plugin` is **Claude Code only** — it's a companion script that spawns the `ml-intern` CLI inside Claude Code sessions. There is no equivalent for Codex. The `huggingface/skills` repo provides general HF skills but not the full ML Intern harness. This plugin bridges the gap.
+## Contributing
+This is an early version. The biggest improvement would be testing the skill instructions in real Codex sessions and tightening the guardrails wherever the model deviates from those instructions. PRs welcome.
+## License
+Apache-2.0
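
Reviewer note (not part of the committed README): the "Plugin Structure" and "Dependencies" sections added in this diff could be checked after a local install with a small script. The following is a minimal sketch, assuming the plugin was linked to `~/.codex/plugins/mlintern` as in Method 1. The function names `missing_paths` and `check_environment` are hypothetical and do not come from the repository; the path list is copied from the directory tree in the diff.

```python
from pathlib import Path

# Entries copied from the "Plugin Structure" tree in the README diff.
EXPECTED_PATHS = [
    ".codex-plugin/plugin.json",
    "agents/openai.yaml",
    "commands/run.md",
    "skills/ml-intern-harness",
    "skills/hf-model-search",
    "skills/hf-dataset-search",
    "skills/hf-paper-search",
    "skills/hf-docs",
    "skills/hf-jobs",
]


def missing_paths(plugin_root: Path) -> list[str]:
    """Return the expected entries that do not exist under plugin_root."""
    return [rel for rel in EXPECTED_PATHS if not (plugin_root / rel).exists()]


def check_environment(env: dict[str, str], version: tuple[int, int]) -> list[str]:
    """Return warnings for the dependencies the README lists.

    Only HF_TOKEN and the Python version are checked here; whether Codex
    and the hugging-face MCP plugin are enabled is not detectable this way.
    """
    warnings = []
    if version < (3, 10):
        warnings.append("Python 3.10+ is required for the helper scripts")
    if not env.get("HF_TOKEN"):
        warnings.append("HF_TOKEN is not set (needed for private/gated resources and Jobs)")
    return warnings
```

A possible invocation, again under the Method 1 assumption, is `missing_paths(Path.home() / ".codex" / "plugins" / "mlintern")` together with `check_environment(dict(os.environ), sys.version_info[:2])`. Both return empty lists when nothing is wrong, so the results can be concatenated and printed.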