Spaces:
Running
Running
| title: WebGPU Benchmark | |
| emoji: 🧪 | |
| colorFrom: indigo | |
| colorTo: gray | |
| sdk: static | |
| app_file: index.html | |
| pinned: false | |
| short_description: One-click browser benchmark for GGUF models on WebGPU + WASM | |
| hf_oauth: true | |
| hf_oauth_scopes: | |
| - write-discussions | |
| # WebGPU Benchmark — One Click | |
| Open this page in any WebGPU-capable browser (Chrome, Safari Technology Preview, Firefox with WebGPU flags), pick the GGUF variants that fit on your device, click **Download**, then **Run**. | |
| Results are cached locally in OPFS (no re-downloads on reload). When signed into Hugging Face, you can submit your results as a PR to the leaderboard dataset. | |
| **Source**: this Space is auto-synced from the [main repo](https://github.com/abhijitramesh/webgpu-bench) on every push to `main`. | |
| ## What's measured | |
| Per-variant: prefill tokens/sec, decode tokens/sec, wall clock, optional CPU-vs-GPU token agreement. | |
| ## Feature caveats | |
| - **Granite 4.0 h-1b** variants need `SSM_SCAN` support in the vendored llama.cpp build — the UI tags the family with a warning badge. | |
| - **Bonsai-1.7B Q1_0** needs `Q1_0` quantization support. The base (non-Q1_0) variant loads regardless. | |
| ## Privacy | |
| No data is sent anywhere unless you click **Submit to leaderboard dataset**, which pushes to the dataset configured in `site/js/run/config.js` (`js/run/config.js` on the Space). Models and logs stay in your browser. | |