---
title: Falcon-Perception-0.6B WebGPU
emoji: 🦅
colorFrom: indigo
colorTo: pink
sdk: static
pinned: false
license: apache-2.0
short_description: Open-vocab detection + segmentation, all in the browser
models:
  - tiiuae/Falcon-Perception
  - onnx-community/falcon-perception-onnx-webgpu
---
# 🦅 Falcon-Perception-0.6B WebGPU

A browser demo for **[tiiuae/Falcon-Perception](https://huggingface.co/tiiuae/Falcon-Perception)**, a 0.6B open-vocabulary VLM that turns natural-language queries into bounding boxes and pixel-accurate segmentation masks, running fully client-side via WebGPU + ONNX Runtime Web.
[Model: tiiuae/Falcon-Perception](https://huggingface.co/tiiuae/Falcon-Perception)
[ONNX weights: onnx-community/falcon-perception-onnx-webgpu](https://huggingface.co/onnx-community/falcon-perception-onnx-webgpu)
## What's inside

- **Detection** – draw bounding boxes for any natural-language query ("athletes", "the runner in front", "mangoes").
- **Segmentation** – pixel-accurate masks via the AnyUp upsampler, all in-browser.
- **Tracker (preview)** – HUD-style reticles on video. Limited by VLM latency between detections; see the in-space disclaimer.
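Detectors like this typically emit boxes normalized to the unit square; a minimal sketch of mapping them back to pixel coordinates for drawing on a canvas (the exact output schema of Falcon-Perception is an assumption here, not taken from the demo source):

```javascript
// Scale normalized [x1, y1, x2, y2] boxes (values in [0, 1]) to pixel
// coordinates for a given image size. Assumes corner-format boxes.
function toPixelBoxes(normBoxes, imgWidth, imgHeight) {
  return normBoxes.map(([x1, y1, x2, y2]) => [
    Math.round(x1 * imgWidth),
    Math.round(y1 * imgHeight),
    Math.round(x2 * imgWidth),
    Math.round(y2 * imgHeight),
  ]);
}
```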
## How it runs

2.4 GB of ONNX weights are fetched once on first visit, then cached by your browser: no backend, no API keys, no network round-trip after load. Multi-threaded WASM is enabled via `coi-serviceworker`.
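A minimal sketch of how a page like this can pick its runtime configuration (the helper name and fallback order are illustrative, not from the demo source). Multi-threaded WASM needs `SharedArrayBuffer`, which browsers only expose on cross-origin-isolated pages; `coi-serviceworker` provides that isolation by injecting COOP/COEP headers via a service worker:

```javascript
// Choose ONNX Runtime Web execution providers and WASM thread count from
// browser capabilities. In a real page the inputs would come from
// 'gpu' in navigator, globalThis.crossOriginIsolated, and
// navigator.hardwareConcurrency.
function pickRuntimeConfig(hasWebGPU, isCrossOriginIsolated, cores) {
  // Prefer WebGPU, fall back to plain WASM on unsupported browsers.
  const executionProviders = hasWebGPU ? ['webgpu', 'wasm'] : ['wasm'];
  // Without cross-origin isolation there is no SharedArrayBuffer,
  // so WASM must run single-threaded.
  const wasmThreads = isCrossOriginIsolated ? Math.min(cores, 4) : 1;
  return { executionProviders, wasmThreads };
}
```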