metadata
title: WebGPU Benchmark
emoji: 🧪
colorFrom: indigo
colorTo: gray
sdk: static
app_file: index.html
pinned: false
short_description: One-click browser benchmark for GGUF models on WebGPU + WASM
hf_oauth: true
hf_oauth_scopes:
- write-discussions
WebGPU Benchmark — One Click
Open this page in any WebGPU-capable browser (Chrome, Safari Technology Preview, Firefox with WebGPU flags), pick the GGUF variants that fit on your device, click Download, then Run.
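If you are unsure whether your browser counts as WebGPU-capable, a check along the following lines can be pasted into the developer console. This is a minimal sketch, not code from this Space: `detectWebGPU` is a hypothetical helper name, and it relies only on the standard `navigator.gpu.requestAdapter()` call, which resolves to `null` when no adapter is available.

```javascript
// Sketch only: `detectWebGPU` is a hypothetical helper, not part of this Space.
// `nav` is a navigator-like object; in a browser, pass `navigator`.
async function detectWebGPU(nav) {
  // navigator.gpu is undefined when WebGPU is missing or disabled by a flag.
  if (!nav || !nav.gpu) {
    return { supported: false, reason: "navigator.gpu is missing" };
  }
  try {
    // requestAdapter() resolves to null when no suitable adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    if (!adapter) {
      return { supported: false, reason: "no GPU adapter available" };
    }
    return { supported: true, reason: "" };
  } catch (err) {
    return { supported: false, reason: String(err) };
  }
}
```

In a browser console, `detectWebGPU(navigator).then(console.log)` prints the result object.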
Results are cached locally in OPFS (the browser's Origin Private File System), so reloading the page does not trigger re-downloads. When signed in to Hugging Face, you can submit your results as a PR to the leaderboard dataset.
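To illustrate how an OPFS-backed cache avoids repeat downloads, here is a minimal sketch. It is not this Space's actual implementation: `loadCached`, `root`, and `download` are hypothetical names, and the sketch assumes the browser supports `createWritable()` on OPFS file handles. In a browser, the root directory comes from `navigator.storage.getDirectory()`.

```javascript
// Sketch only: `loadCached` and its parameters are hypothetical names.
// `root` is a FileSystemDirectoryHandle; in a browser, obtain it with
//   const root = await navigator.storage.getDirectory();
// `download` is any async function that returns a Blob or ArrayBuffer.
async function loadCached(root, name, download) {
  try {
    // Without { create: true }, getFileHandle rejects with NotFoundError
    // when the file is not in the cache yet.
    const cached = await root.getFileHandle(name);
    return { file: await cached.getFile(), fromCache: true };
  } catch (err) {
    if (!err || err.name !== "NotFoundError") throw err;
  }
  // Cache miss: download once, then persist the bytes to OPFS.
  const data = await download();
  const handle = await root.getFileHandle(name, { create: true });
  const writable = await handle.createWritable();
  await writable.write(data);
  await writable.close(); // contents are committed on close
  return { file: await handle.getFile(), fromCache: false };
}
```

A real cache would likely also verify the stored file's size or hash, since an interrupted download could otherwise leave an empty entry behind.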
Source: this Space is auto-synced from the main repo on every push to main.
What's measured
Per variant: prefill tokens/sec, decode tokens/sec, wall-clock time, and (optionally) CPU-vs-GPU token agreement.
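As a rough illustration of how these numbers can be derived from raw timings, consider the sketch below. The function names are hypothetical, and the definition of agreement used here (the fraction of positions where the CPU and GPU runs emit the same token id, with length mismatches counted as disagreements) is an assumption, not necessarily the metric the benchmark reports.

```javascript
// Sketch only: function names and the agreement definition are assumptions.

// Throughput in tokens per second from a token count and elapsed milliseconds.
function tokensPerSecond(tokenCount, elapsedMs) {
  if (elapsedMs <= 0) return 0; // avoid division by zero on degenerate timings
  return tokenCount / (elapsedMs / 1000);
}

// Fraction of positions where both runs produced the same token id.
// Positions present in only one sequence count as mismatches.
function tokenAgreement(cpuTokens, gpuTokens) {
  const n = Math.max(cpuTokens.length, gpuTokens.length);
  if (n === 0) return 0;
  let same = 0;
  for (let i = 0; i < n; i++) {
    if (i < cpuTokens.length && i < gpuTokens.length &&
        cpuTokens[i] === gpuTokens[i]) {
      same++;
    }
  }
  return same / n;
}
```

For example, `tokensPerSecond(512, 2000)` gives 256, and `tokenAgreement([1, 2, 3, 4], [1, 2, 9, 4])` gives 0.75.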
Feature caveats
- Granite 4.0 h-1b variants need SSM_SCAN support in the vendored llama.cpp build; the UI tags the family with a warning badge.
- Bonsai-1.7B Q1_0 needs Q1_0 quantization support. The base (non-Q1_0) variant loads regardless.
Privacy
No data is sent anywhere unless you click Submit to leaderboard dataset, which pushes to the dataset configured in site/js/run/config.js (js/run/config.js on the Space). Models and logs stay in your browser.