webgpu-bench / README.md
GitHub Actions
sync from abhijitramesh/webgpu-bench@35f91a3bac
4aa88a2
metadata
title: WebGPU Benchmark
emoji: 🧪
colorFrom: indigo
colorTo: gray
sdk: static
app_file: index.html
pinned: false
short_description: One-click browser benchmark for GGUF models on WebGPU + WASM
hf_oauth: true
hf_oauth_scopes:
  - write-discussions

WebGPU Benchmark — One Click

Open this page in any WebGPU-capable browser (Chrome, Safari Technology Preview, Firefox with WebGPU flags), pick the GGUF variants that fit on your device, click Download, then Run.

Results are cached locally in OPFS (no re-downloads on reload). When signed into Hugging Face, you can submit your results as a PR to the leaderboard dataset.

Source: this Space is auto-synced from the main repo on every push to main.

What's measured

Per-variant: prefill tokens/sec, decode tokens/sec, wall clock, optional CPU-vs-GPU token agreement.

Feature caveats

  • Granite 4.0 h-1b variants need SSM_SCAN support in the vendored llama.cpp build — the UI tags the family with a warning badge.
  • Bonsai-1.7B Q1_0 needs Q1_0 quantization support. The base (non-Q1_0) variant loads regardless.

Privacy

No data is sent anywhere unless you click Submit to leaderboard dataset, which pushes to the dataset configured in site/js/run/config.js (js/run/config.js on the Space). Models and logs stay in your browser.