---
license: mit
base_model: microsoft/Phi-4-mini-reasoning
tags:
- phi-4
- webgpu
- browser-inference
- strix-halo
- amd
- unified-memory
- reasoning
- math
- chain-of-thought
pipeline_tag: text-generation
---

# Phi-4-mini-reasoning on WebGPU

First WebGPU package for Microsoft's Phi-4-mini-reasoning model.

This is the reasoning variant (not the instruct variant), trained on chain-of-thought data distilled from DeepSeek-R1. It has 3.8B parameters and is 2.4 GB as a Q4_K_M GGUF. It runs entirely in the browser via WebGPU and wllama.
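
As a quick sanity check on those two figures, the sketch below works out the average number of bits stored per weight. It assumes "2.4 GB" means decimal gigabytes and that "3.8B" is the full parameter count; the function name `effectiveBitsPerWeight` is made up for this illustration.

```javascript
// Average bits stored per parameter, given file size in bytes and parameter count.
function effectiveBitsPerWeight(fileBytes, paramCount) {
  return (fileBytes * 8) / paramCount;
}

const PARAMS = 3.8e9;     // "3.8B params" from the description above
const FILE_BYTES = 2.4e9; // "2.4 GB", assumed to be decimal gigabytes

const bits = effectiveBitsPerWeight(FILE_BYTES, PARAMS);
console.log(bits.toFixed(2)); // prints "5.05"
```

The result is an average over the whole file, so it comes out above 4 bits: it includes file metadata and any tensors stored at higher precision than the 4-bit blocks.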

## Quick Start

1. Download the Q4_K_M GGUF from bartowski on Hugging Face.
2. Place it in `model_splits/` (it is a single file, so no splitting is needed).
3. Run `node serve.js` (serves on port 8190).
4. Open http://localhost:8190 in Chrome.
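
Step 3 relies on the project's `serve.js`, whose contents are not shown here. Purely as an illustration of what such a script might do, the sketch below builds a small static file server with Node's built-in modules. The names `resolvePath` and `createStaticServer`, the MIME table, and the choice to serve `index.html` at `/` are all assumptions. The two cross-origin isolation headers are also an assumption: they are what browsers require before exposing `SharedArrayBuffer`, which multi-threaded WebAssembly builds depend on.

```javascript
// Hypothetical stand-in for serve.js; the project's real script may differ.
const http = require("node:http");
const fs = require("node:fs");
const path = require("node:path");

const MIME = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".wasm": "application/wasm",
  ".gguf": "application/octet-stream",
};

// Map a request URL to an absolute file path inside `root`.
// Returns null for malformed URLs or paths that escape `root`.
function resolvePath(root, url) {
  const base = path.resolve(root);
  let pathname;
  try {
    pathname = decodeURIComponent(new URL(url, "http://localhost").pathname);
  } catch {
    return null; // malformed percent-encoding
  }
  const relative = pathname === "/" ? "index.html" : pathname.slice(1);
  const full = path.resolve(base, relative);
  return full.startsWith(base + path.sep) ? full : null;
}

// Build (but do not start) an HTTP server that serves files under `root`.
function createStaticServer(root) {
  return http.createServer((req, res) => {
    const file = resolvePath(root, req.url);
    if (!file) {
      res.writeHead(403).end("Forbidden");
      return;
    }
    const stream = fs.createReadStream(file);
    stream.on("open", () => {
      res.writeHead(200, {
        "Content-Type": MIME[path.extname(file)] || "application/octet-stream",
        // Assumed requirement: cross-origin isolation for SharedArrayBuffer.
        "Cross-Origin-Opener-Policy": "same-origin",
        "Cross-Origin-Embedder-Policy": "require-corp",
      });
      stream.pipe(res);
    });
    stream.on("error", () => {
      if (!res.headersSent) res.writeHead(404);
      res.end("Not found");
    });
  });
}
```

To try the sketch, append `createStaticServer(process.cwd()).listen(8190);` and run the file with `node`. The server is built separately from the `listen` call so the path handling can be exercised without opening a port.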

## Hardware

Tested on a GMKtec EVO-X2 (AMD Strix Halo). It works on any WebGPU-capable device with at least 3 GB of available memory.
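
Whether a given device is "WebGPU-capable" can be probed from the page before loading the model. The following is a minimal sketch using the standard `navigator.gpu.requestAdapter()` call; the function name `checkWebGPU` and the shape of its return value are invented for this example. WebGPU does not report free memory, so the sketch only surfaces the adapter's `maxBufferSize` limit as a rough hint and cannot confirm the 3 GB requirement.

```javascript
// Probe for WebGPU support. `gpu` defaults to navigator.gpu and is a
// parameter only so the check can be exercised outside a browser.
async function checkWebGPU(gpu = globalThis.navigator?.gpu) {
  if (!gpu) {
    return { ok: false, reason: "WebGPU is not exposed in this environment" };
  }
  const adapter = await gpu.requestAdapter();
  if (!adapter) {
    return { ok: false, reason: "No suitable GPU adapter was found" };
  }
  // maxBufferSize is a per-buffer limit, not a measure of available memory.
  return { ok: true, reason: "", maxBufferSize: adapter.limits.maxBufferSize };
}
```

In a page this could be used as `checkWebGPU().then((r) => { if (!r.ok) alert(r.reason); });` before starting the model download.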

## Credits

Built by Joshua (LJTSG) and Claude.

Co-Authored-By: Claude <noreply@anthropic.com>