metadata
license: mit
base_model: microsoft/Phi-4-mini-reasoning
tags:
- phi-4
- webgpu
- browser-inference
- strix-halo
- amd
- unified-memory
- reasoning
- math
- chain-of-thought
pipeline_tag: text-generation
Phi-4-mini-reasoning on WebGPU
First WebGPU package for Microsoft's Phi-4-mini-reasoning model.
This is the reasoning variant (not the instruct variant), trained on chain-of-thought data distilled from DeepSeek-R1. It has 3.8B parameters and is 2.4 GB at Q4_K_M quantization. It runs entirely in the browser via WebGPU and wllama.
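Because this is a reasoning model, its raw output usually mixes a chain of thought with the final answer. The sketch below shows one way a page could separate the two before display. It assumes the model wraps its reasoning in `<think>...</think>` tags, as R1-style distilled models commonly do; that tag format and the helper name `splitReasoning` are assumptions for illustration, not something this package is confirmed to ship.

```javascript
// Hypothetical helper: split raw model output into reasoning and final answer.
// Assumes the reasoning is wrapped in <think>...</think> (an assumption).
function splitReasoning(text) {
  const open = "<think>";
  const close = "</think>";
  const start = text.indexOf(open);
  const end = text.indexOf(close);

  // No complete, well-ordered tag pair: treat everything as the answer.
  if (start === -1 || end === -1 || end < start) {
    return { reasoning: "", answer: text.trim() };
  }

  return {
    reasoning: text.slice(start + open.length, end).trim(),
    answer: text.slice(end + close.length).trim(),
  };
}
```

If the output contains no such tags, the helper returns the whole text as the answer, so it is safe to call on every response.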
Quick Start
- Download the Q4_K_M GGUF file from bartowski's Hugging Face uploads.
- Place it in model_splits/ (a single file; no splitting is needed).
- Run node serve.js (serves on port 8190).
- Open http://localhost:8190 in Chrome.
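For readers curious what a local server for these steps has to do, here is a minimal sketch of a static file server in Node. It is not the actual serve.js from this package; the function names (`contentTypeFor`, `resolveSafePath`, `createStaticServer`) and the MIME table are hypothetical. It streams files rather than buffering them, which matters for a 2.4 GB model file. It also sends the cross-origin isolation headers that browsers require before enabling SharedArrayBuffer. Whether this package needs those headers is an assumption.

```javascript
// Hypothetical stand-in for serve.js: a minimal static file server sketch.
const http = require("node:http");
const fs = require("node:fs");
const path = require("node:path");

const MIME_TYPES = {
  ".html": "text/html; charset=utf-8",
  ".js": "text/javascript; charset=utf-8",
  ".wasm": "application/wasm",
  ".json": "application/json",
  ".gguf": "application/octet-stream",
};

function contentTypeFor(filePath) {
  return MIME_TYPES[path.extname(filePath).toLowerCase()] || "application/octet-stream";
}

// Map a request URL to a file inside rootDir, or null if it would escape rootDir.
function resolveSafePath(rootDir, requestUrl) {
  let pathname;
  try {
    pathname = decodeURIComponent(new URL(requestUrl, "http://localhost").pathname);
  } catch {
    return null; // malformed URL or percent-encoding
  }
  const relative = pathname === "/" ? "index.html" : pathname.replace(/^\/+/, "");
  const root = path.resolve(rootDir);
  const resolved = path.resolve(root, relative);
  return resolved === root || resolved.startsWith(root + path.sep) ? resolved : null;
}

function createStaticServer(rootDir) {
  return http.createServer((req, res) => {
    const filePath = resolveSafePath(rootDir, req.url);
    if (!filePath) {
      res.writeHead(403).end("Forbidden");
      return;
    }
    fs.stat(filePath, (err, stats) => {
      if (err || !stats.isFile()) {
        res.writeHead(404).end("Not found");
        return;
      }
      res.writeHead(200, {
        "Content-Type": contentTypeFor(filePath),
        "Content-Length": stats.size,
        // Cross-origin isolation headers; browsers require these for SharedArrayBuffer.
        "Cross-Origin-Opener-Policy": "same-origin",
        "Cross-Origin-Embedder-Policy": "require-corp",
      });
      // Stream the file so a multi-gigabyte model is never held in memory at once.
      fs.createReadStream(filePath).pipe(res);
    });
  });
}

// Usage (not run here): createStaticServer(__dirname).listen(8190);
```

The server is returned without calling listen so the port stays a caller decision; 8190 matches the port named in the steps above.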
Hardware
Tested on a GMKTEC EVO-X2 (AMD Strix Halo). Works on any WebGPU-capable device with at least 3 GB of available memory.
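To check whether a given browser and device meet the WebGPU requirement, a page can probe for an adapter before downloading the model. The sketch below uses the standard `navigator.gpu.requestAdapter()` call and reads two standard adapter limits. The function name `checkWebGPU` and the shape of its return value are hypothetical. WebGPU does not report total available memory, so the limits shown are per-buffer limits and only a rough indicator.

```javascript
// Hypothetical helper: probe WebGPU support before fetching the model.
// `nav` is a navigator-like object; in a browser, pass the global `navigator`.
async function checkWebGPU(nav) {
  if (!nav || !nav.gpu) {
    return { supported: false, reason: "navigator.gpu is not available" };
  }

  // requestAdapter() resolves to null when no suitable adapter exists.
  const adapter = await nav.gpu.requestAdapter();
  if (!adapter) {
    return { supported: false, reason: "no suitable GPU adapter found" };
  }

  return {
    supported: true,
    // Per-buffer limits in bytes; not a measure of total free memory.
    maxBufferSize: adapter.limits.maxBufferSize,
    maxStorageBufferBindingSize: adapter.limits.maxStorageBufferBindingSize,
  };
}
```

In a page this would be called as `checkWebGPU(navigator)`; taking the navigator as a parameter keeps the helper usable outside a browser, for example with a stub object.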
Credits
Built by Joshua (LJTSG) and Claude.
Co-Authored-By: Claude <noreply@anthropic.com>