	---
	title: WebGPU Benchmark
	emoji: 🧪
	colorFrom: indigo
	colorTo: gray
	sdk: static
	app_file: index.html
	pinned: false
	short_description: One-click browser benchmark for GGUF models on WebGPU + WASM
	hf_oauth: true
	hf_oauth_scopes:
	- write-discussions
	---

	# WebGPU Benchmark — One Click

	Open this page in any WebGPU-capable browser (Chrome, Safari Technology Preview, Firefox with WebGPU flags), pick the GGUF variants that fit on your device, click Download, then Run.

	Results are cached locally in the browser's Origin Private File System (OPFS), so nothing is re-downloaded on reload. When signed in to Hugging Face, you can submit your results as a PR to the leaderboard dataset.

	Source: this Space is auto-synced from the [main repo](https://github.com/abhijitramesh/webgpu-bench) on every push to `main`.

	## What's measured
	For each variant: prefill tokens/sec, decode tokens/sec, wall-clock time, and (optionally) CPU-vs-GPU token agreement.
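
As a rough illustration of how numbers like these can be derived, the sketch below computes throughput from a token count and elapsed milliseconds, and treats agreement as the fraction of positions where two token-ID sequences match. The helper names (`tokensPerSecond`, `tokenAgreement`) and the exact agreement definition are assumptions for illustration, not the benchmark's actual code.

```typescript
// Hypothetical helpers; names and definitions are illustrative assumptions.

/** Throughput in tokens per second from a token count and elapsed milliseconds. */
function tokensPerSecond(tokenCount: number, elapsedMs: number): number {
  if (elapsedMs <= 0) return 0; // avoid division by zero on degenerate timings
  return tokenCount / (elapsedMs / 1000);
}

/**
 * Fraction of positions at which two token-ID sequences agree.
 * Positions present in only one sequence count as disagreement.
 */
function tokenAgreement(cpu: number[], gpu: number[]): number {
  const longest = Math.max(cpu.length, gpu.length);
  if (longest === 0) return 1; // two empty sequences agree trivially
  const shared = Math.min(cpu.length, gpu.length);
  let matches = 0;
  for (let i = 0; i < shared; i++) {
    if (cpu[i] === gpu[i]) matches++;
  }
  return matches / longest;
}
```

Under these definitions, prefill and decode throughput would each call `tokensPerSecond` with their own token count and timing window.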

	## Feature caveats
	- Granite 4.0 h-1b variants need `SSM_SCAN` support in the vendored llama.cpp build — the UI tags the family with a warning badge.
	- Bonsai-1.7B Q1_0 needs `Q1_0` quantization support. The base (non-Q1_0) variant loads regardless.
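
One way such caveats could be surfaced is to list the features each variant requires and compare them against what the build supports. This is a minimal sketch under that assumption: the `Variant` shape, the `supported` set, and the badge text are all hypothetical, and the actual UI may tag families differently.

```typescript
// Hypothetical shapes; the real UI and build metadata may look different.
interface Variant {
  name: string;
  requires: string[]; // e.g. ["SSM_SCAN"] or ["Q1_0"]
}

/** Returns the required features that the build does not report as supported. */
function missingFeatures(variant: Variant, supported: Set<string>): string[] {
  return variant.requires.filter((feature) => !supported.has(feature));
}

/** Builds warning-badge text, or null when nothing is missing. */
function warningBadge(variant: Variant, supported: Set<string>): string | null {
  const missing = missingFeatures(variant, supported);
  return missing.length === 0 ? null : `Needs ${missing.join(", ")}`;
}
```

A variant with an empty `requires` list, such as a base variant, never gets a badge in this sketch.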

	## Privacy
	No data is sent anywhere unless you click Submit to leaderboard dataset, which pushes to the dataset configured in `site/js/run/config.js` (`js/run/config.js` on the Space). Models and logs stay in your browser.