---
title: WebGPU Benchmark
emoji: 🧪
colorFrom: indigo
colorTo: gray
sdk: static
app_file: index.html
pinned: false
short_description: One-click browser benchmark for GGUF models on WebGPU + WASM
hf_oauth: true
hf_oauth_scopes:
  - write-discussions
---

# WebGPU Benchmark: One Click

Open this page in any WebGPU-capable browser (Chrome, Safari Technology Preview, Firefox with WebGPU flags), pick the GGUF variants that fit on your device, click **Download**, then **Run**. Results are cached locally in OPFS (no re-downloads on reload). When signed into Hugging Face, you can submit your results as a PR to the leaderboard dataset.

**Source**: this Space is auto-synced from the [main repo](https://github.com/abhijitramesh/webgpu-bench) on every push to `main`.

## What's measured

Per-variant: prefill tokens/sec, decode tokens/sec, wall-clock time, optional CPU-vs-GPU token agreement.

## Feature caveats

- **Granite 4.0 h-1b** variants need `SSM_SCAN` support in the vendored llama.cpp build; the UI tags the family with a warning badge.
- **Bonsai-1.7B Q1_0** needs `Q1_0` quantization support. The base (non-Q1_0) variant loads regardless.

## Privacy

No data is sent anywhere unless you click **Submit to leaderboard dataset**, which pushes to the dataset configured in `site/js/run/config.js` (`js/run/config.js` on the Space). Models and logs stay in your browser.
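
## Appendix: how the metrics could be derived

As a rough illustration of the per-variant metrics listed under "What's measured", the sketch below shows one way throughput and CPU-vs-GPU token agreement might be computed from raw timings. It is not taken from the Space's source: the `RunTiming` shape, the function names, and the definition of agreement (fraction of matching token IDs by position, with any length mismatch counted as disagreement) are all assumptions made for this example.

```typescript
// Hypothetical shape of one timed run; field names are illustrative
// and are not the Space's actual result schema.
interface RunTiming {
  promptTokens: number;    // tokens processed during prefill
  prefillMs: number;       // time spent on prefill, in milliseconds
  generatedTokens: number; // tokens produced during decode
  decodeMs: number;        // time spent on decode, in milliseconds
}

// Convert a token count and an elapsed time in ms to tokens per second.
function tokensPerSecond(tokens: number, elapsedMs: number): number {
  if (elapsedMs <= 0) return 0; // avoid division by zero on degenerate timings
  return (tokens * 1000) / elapsedMs;
}

// Assumed definition of agreement: fraction of positions where the CPU and
// GPU token IDs match. Positions present in only one sequence count as
// mismatches, so the denominator is the longer length.
function tokenAgreement(cpu: number[], gpu: number[]): number {
  const longest = Math.max(cpu.length, gpu.length);
  if (longest === 0) return 0;
  const shared = Math.min(cpu.length, gpu.length);
  let matches = 0;
  for (let i = 0; i < shared; i++) {
    if (cpu[i] === gpu[i]) matches++;
  }
  return matches / longest;
}

// Summarize one run into the two throughput figures.
function summarize(t: RunTiming): { prefillTps: number; decodeTps: number } {
  return {
    prefillTps: tokensPerSecond(t.promptTokens, t.prefillMs),
    decodeTps: tokensPerSecond(t.generatedTokens, t.decodeMs),
  };
}
```

Under these assumptions, a run that prefills 512 prompt tokens in 2000 ms and decodes 128 tokens in 4000 ms would report 256 prefill tokens/sec and 32 decode tokens/sec. The benchmark itself may define these quantities differently, for example by excluding warm-up tokens.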