---
title: Falcon-Perception-0.6B WebGPU
emoji: 🦅
colorFrom: indigo
colorTo: pink
sdk: static
pinned: false
license: apache-2.0
short_description: Open-vocab detection + segmentation, all in the browser
models:
- tiiuae/Falcon-Perception
- onnx-community/falcon-perception-onnx-webgpu
---
# 🦅 Falcon-Perception-0.6B WebGPU
A browser demo for **[tiiuae/Falcon-Perception](https://huggingface.co/tiiuae/Falcon-Perception)**, a 0.6B-parameter open-vocabulary VLM that turns natural-language queries into bounding boxes and pixel-accurate segmentation masks. It runs fully client-side via WebGPU and ONNX Runtime Web.
[tiiuae/Falcon-Perception](https://huggingface.co/tiiuae/Falcon-Perception)
[onnx-community/falcon-perception-onnx-webgpu](https://huggingface.co/onnx-community/falcon-perception-onnx-webgpu)
## What's inside
- **Detection**: draws bounding boxes for any natural-language query ("athletes", "the runner in front", "mangoes").
- **Segmentation**: pixel-accurate masks via the AnyUp upsampler, all in-browser.
- **Tracker (preview)**: HUD-style reticles on video. Limited by VLM latency between detections; see the in-space disclaimer.
## How it runs
The 2.4 GB of ONNX weights are fetched once on first visit, then cached by your browser. There is no backend, no API keys, and no network round-trip after load. Multi-threaded WASM is enabled via `coi-serviceworker`.
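
The fetch-once-then-cache behaviour can be sketched roughly as below. This is a minimal illustration, not the code this Space actually ships: `WeightStore`, `loadWeights`, and `pickThreadCount` are hypothetical names, and the cap of 4 threads is an arbitrary assumption. `WeightStore` is a small structural subset of the browser Cache API, so a `Cache` obtained from `caches.open(...)` should satisfy it.

```typescript
// Hypothetical minimal subset of the browser Cache API used by this sketch.
interface WeightStore {
  match(url: string): Promise<Response | undefined>;
  put(url: string, response: Response): Promise<void>;
}

// Return the bytes of one weight file, touching the network only when
// the store has no copy yet (assumed behaviour, not the Space's code).
async function loadWeights(
  url: string,
  store: WeightStore,
  fetchFn: (url: string) => Promise<Response> = fetch,
): Promise<ArrayBuffer> {
  const cached = await store.match(url);
  if (cached) {
    return cached.arrayBuffer();
  }

  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Failed to fetch ${url}: HTTP ${response.status}`);
  }
  // Store a clone so the original body can still be read below.
  await store.put(url, response.clone());
  return response.arrayBuffer();
}

// WASM threads rely on SharedArrayBuffer, which browsers expose only on
// cross-origin isolated pages; otherwise fall back to a single thread.
// The upper bound of 4 is an illustrative assumption.
function pickThreadCount(crossOriginIsolated: boolean, cores: number): number {
  return crossOriginIsolated ? Math.max(1, Math.min(cores, 4)) : 1;
}
```

In a browser, one might call `loadWeights(url, await caches.open("weights-v1"))`, where the cache name is made up for this example, and assign `pickThreadCount(self.crossOriginIsolated, navigator.hardwareConcurrency)` to `ort.env.wasm.numThreads` before creating an ONNX Runtime Web session. Injecting the store and fetch function keeps the logic testable outside a browser.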