---
title: MiniCPM5 Pi Web Agent
sdk: static
app_file: dist/index.html
fullWidth: true
models:
  - Mike0021/MiniCPM5-1B-ONNX-Web
custom_headers:
  cross-origin-embedder-policy: credentialless
  cross-origin-opener-policy: same-origin
  cross-origin-resource-policy: cross-origin
---

# MiniCPM5-1B Pi Web Agent

This workspace converts `openbmb/MiniCPM5-1B` into a browser-loadable Transformers.js model and ships a browser-only pi agent app.

Published artifact: https://huggingface.co/Mike0021/MiniCPM5-1B-ONNX-Web

The required runtime layout is:

- `config.json`, `generation_config.json`, tokenizer files, and `chat_template.jinja` at the repo root
- q4 ONNX weights at `onnx/model_q4.onnx`
- `config.json` includes `transformers.js_config.dtype = "q4"` so the default loader selects the web-sized artifact

The conversion uses an ONNX export with KV cache (`text-generation-with-past`) and then applies ONNX Runtime 4-bit MatMul quantization. A generic ONNX export without KV cache is not enough for normal Transformers.js autoregressive generation.
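
The layout requirements above can be checked mechanically before uploading. The following is a minimal sketch, not part of this repo: `checkRuntimeLayout` and its constants are hypothetical names, and because tokenizer file names vary by model, the check only assumes that at least one root-level file starts with `tokenizer`.

```javascript
// Hypothetical helper, not part of this repo.
const REQUIRED_ROOT_FILES = [
  "config.json",
  "generation_config.json",
  "chat_template.jinja",
];
const REQUIRED_WEIGHTS = "onnx/model_q4.onnx";

// files: repo-relative paths; config: the parsed contents of config.json.
// Returns a list of human-readable problems (empty when the layout looks right).
function checkRuntimeLayout(files, config) {
  const present = new Set(files);
  const problems = [];
  for (const name of [...REQUIRED_ROOT_FILES, REQUIRED_WEIGHTS]) {
    if (!present.has(name)) problems.push(`missing file: ${name}`);
  }
  // Assumption: tokenizer files live at the root and start with "tokenizer".
  const hasTokenizer = files.some(
    (f) => !f.includes("/") && f.startsWith("tokenizer")
  );
  if (!hasTokenizer) problems.push("missing tokenizer files at repo root");
  // The key is literally "transformers.js_config", so use bracket access.
  const dtype = config?.["transformers.js_config"]?.dtype;
  if (dtype !== "q4") {
    problems.push('transformers.js_config.dtype is not "q4"');
  }
  return problems;
}
```

Running it over the file list of `output/MiniCPM5-1B-ONNX-Web` before upload would catch a missing `chat_template.jinja` or a forgotten dtype entry early.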

## Run the Web App

```bash
npm install
npm run dev
```

Open http://localhost:5173/.

The app uses:

- `@earendil-works/pi-agent-core` for the agent loop, transcript state, and tool execution.
- `@huggingface/transformers` with `Mike0021/MiniCPM5-1B-ONNX-Web` for the local browser model.
- `@webcontainer/api` for the client-only sandbox with a virtual filesystem and browser-contained Node.js processes.

Vite serves the app with COOP/COEP headers and boots WebContainers with `coep: "credentialless"`. The deterministic test model is available at `http://localhost:5173/?mode=mock&device=wasm` for fast harness and sandbox smoke tests without downloading the full ONNX model.
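
As an illustration of how the `mode` and `device` query parameters might be read, here is a small sketch. `parseRunOptions` is a hypothetical name, and the app's actual parsing and defaults are not shown in this README, so the sketch returns `null` for a missing `device` instead of guessing a default.

```javascript
// Hypothetical parser for the query string of the app URL.
function parseRunOptions(search) {
  const params = new URLSearchParams(search);
  return {
    // Only the literal value "mock" selects the deterministic test model.
    useMockModel: params.get("mode") === "mock",
    // null when absent; the real app decides its own default device.
    device: params.get("device"),
  };
}

console.log(parseRunOptions("?mode=mock&device=wasm"));
```

In the browser the argument would typically be `window.location.search`.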

The Static Space uses the same isolation policy through `custom_headers` in this README frontmatter. The app is built with `npm run build` and the generated `dist/` directory is uploaded to the Space.
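
The isolation policy can be kept in one place and reused for a quick header check. This is a sketch under assumptions: the header values are copied from the `custom_headers` frontmatter, while `ISOLATION_HEADERS`, `missingIsolationHeaders`, and the idea of passing all three headers to the Vite dev server are illustrative rather than taken from this repo's config.

```javascript
// Values copied from the Space's custom_headers frontmatter.
const ISOLATION_HEADERS = {
  "cross-origin-embedder-policy": "credentialless",
  "cross-origin-opener-policy": "same-origin",
  "cross-origin-resource-policy": "cross-origin",
};

// Hypothetical helper: returns the names of headers that are absent or
// carry a different value. Header names are compared case-insensitively.
function missingIsolationHeaders(responseHeaders) {
  const actual = new Map(
    Object.entries(responseHeaders).map(([name, value]) => [
      name.toLowerCase(),
      String(value).trim().toLowerCase(),
    ])
  );
  return Object.keys(ISOLATION_HEADERS).filter(
    (name) => actual.get(name) !== ISOLATION_HEADERS[name]
  );
}

// In a Vite config the same object could be passed as `server.headers`, e.g.
//   export default { server: { headers: ISOLATION_HEADERS } };
```

An empty result from `missingIsolationHeaders` for the page response is a necessary condition for `crossOriginIsolated` to be true in the browser.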

## Test the Agent App

Start the dev server, then run:

```bash
npm run smoke:web
```

The smoke test opens Chromium, confirms `crossOriginIsolated`, boots the WebContainer sandbox, runs the pi agent in deterministic mode, writes `hello.js`, spawns `node hello.js`, and checks for `pi sandbox result: 42` in the transcript.
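
The sandbox part of that flow (write a file, spawn Node, collect output) can be sketched as follows. This is not the repo's smoke test: `runHelloInSandbox` is a hypothetical name, the contents of `hello.js` are an assumption chosen to print the expected line, and `container` is expected to behave like a booted WebContainer, for example the result of `await WebContainer.boot({ coep: "credentialless" })` from `@webcontainer/api`.

```javascript
// Assumed file contents; the real hello.js is written by the agent.
const HELLO_SOURCE = 'console.log("pi sandbox result: " + 6 * 7);\n';

// container: an object with the WebContainer `fs.writeFile` and `spawn` methods.
async function runHelloInSandbox(container) {
  await container.fs.writeFile("hello.js", HELLO_SOURCE);
  const proc = await container.spawn("node", ["hello.js"]);

  let output = "";
  // Gather output chunks as they stream in; pipe errors are ignored here.
  proc.output
    .pipeTo(
      new WritableStream({
        write(chunk) {
          output += chunk;
        },
      })
    )
    .catch(() => {});

  // `exit` resolves with the process exit code.
  const exitCode = await proc.exit;
  return { exitCode, output };
}
```

Passing the container in as a parameter keeps the function easy to exercise with a stub outside the browser.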

For the heavier end-to-end check with the real MiniCPM5 ONNX model in browser WASM mode:

```bash
npm run smoke:local-model
```

This downloads and loads the q4 ONNX artifact in Chromium, runs the same pi/WebContainer task, and checks that the model reaches `Model ready` before the sandbox result is accepted.
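
The acceptance rule for both smoke tests can be expressed as a small ordering check. This is a sketch, assuming the transcript and status messages are available as an ordered array of strings; `checkTranscript` and its option name are hypothetical and the real harness may read these values from the page differently.

```javascript
const SANDBOX_RESULT = "pi sandbox result: 42";
const MODEL_READY = "Model ready";

// lines: ordered transcript/status strings.
// requireModelReady: set to true for the real-model run.
function checkTranscript(lines, { requireModelReady = false } = {}) {
  const resultIndex = lines.findIndex((line) => line.includes(SANDBOX_RESULT));
  if (resultIndex === -1) {
    return { ok: false, reason: "sandbox result not found" };
  }
  if (requireModelReady) {
    const readyIndex = lines.findIndex((line) => line.includes(MODEL_READY));
    // The result only counts if the model became ready first.
    if (readyIndex === -1 || readyIndex > resultIndex) {
      return { ok: false, reason: "model was not ready before the sandbox result" };
    }
  }
  return { ok: true, reason: "" };
}
```

The deterministic run would call it with the default options, while the real-model run would pass `{ requireModelReady: true }`.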

## Verify the Published Artifact

```bash
npm install
node scripts/verify_tjs_model.mjs Mike0021/MiniCPM5-1B-ONNX-Web
```

The verifier asks Transformers.js for the `text-generation` file plan, checks for `onnx/model_q4.onnx`, then loads the model and generates a short completion.
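
The load-and-generate step can be sketched as below. This is not the contents of `scripts/verify_tjs_model.mjs`, which is not reproduced here, and it leaves out the file-plan check. `verifyModel` is a hypothetical name, and `pipelineFactory` stands in for the `pipeline` function exported by `@huggingface/transformers`, injected so the sketch can run without downloading the model.

```javascript
// Hypothetical sketch of loading the model and generating a short completion.
async function verifyModel(pipelineFactory, modelId) {
  // With dtype "q4", Transformers.js looks for onnx/model_q4.onnx.
  const generator = await pipelineFactory("text-generation", modelId, {
    dtype: "q4",
  });
  // Prompt and token budget are arbitrary illustration values.
  const output = await generator("Hello", { max_new_tokens: 8 });
  const text = output?.[0]?.generated_text;
  if (typeof text !== "string" || text.length === 0) {
    throw new Error(`no completion returned by ${modelId}`);
  }
  return text;
}

// Real usage would look roughly like this (requires network access):
//   import { pipeline } from "@huggingface/transformers";
//   await verifyModel(pipeline, "Mike0021/MiniCPM5-1B-ONNX-Web");
```

Injecting the factory is only a convenience for testing; a standalone script could call `pipeline` directly.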

## Convert and Upload

The published repo was produced locally with a CPU fp16 export followed by q4 ONNX quantization:

```bash
uv run --python 3.12 \
  --with "numpy<2" \
  --with "transformers==4.57.6" \
  --with "optimum[onnx]" \
  --with "onnxruntime==1.20.1" \
  --with onnxslim \
  --with "huggingface_hub>=0.33" \
  --with accelerate \
  --with sentencepiece \
  --with protobuf \
  scripts/convert_minicpm5_tjs.py \
  --source-model openbmb/MiniCPM5-1B \
  --target-repo Mike0021/MiniCPM5-1B-ONNX-Web \
  --output-dir output/MiniCPM5-1B-ONNX-Web \
  --work-dir output/minicpm5-work \
  --device cpu \
  --export-dtype fp16
```

For a clean remote conversion, the same script can be run on Hugging Face Jobs with a configured Hub token:

```bash
hf repos create Mike0021/MiniCPM5-1B-ONNX-Web --repo-type model --exist-ok
hf jobs uv run scripts/convert_minicpm5_tjs.py \
  --flavor l4x1 \
  --timeout 6h \
  --secrets HF_TOKEN \
  --with "numpy<2" \
  --with "transformers==4.57.6" \
  --with "optimum[onnx]" \
  --with "onnxruntime==1.20.1" \
  --with onnxslim \
  --with "huggingface_hub>=0.33" \
  --with accelerate \
  --with sentencepiece \
  --with protobuf \
  --python 3.12 \
  -- \
  --target-repo Mike0021/MiniCPM5-1B-ONNX-Web \
  --export-dtype fp16
```