---
title: Falcon-Perception-0.6B WebGPU
emoji: 🦅
colorFrom: indigo
colorTo: pink
sdk: static
pinned: false
license: apache-2.0
short_description: Open-vocab detection + segmentation, all in the browser
models:
  - tiiuae/Falcon-Perception
  - onnx-community/falcon-perception-onnx-webgpu
---

# 🦅 Falcon-Perception-0.6B WebGPU

A browser demo for **[tiiuae/Falcon-Perception](https://huggingface.co/tiiuae/Falcon-Perception)** — a 0.6B open-vocabulary VLM that turns natural-language queries into bounding boxes and pixel-accurate segmentation masks, running fully client-side via WebGPU + ONNX Runtime Web.

[![Model](https://img.shields.io/badge/🤗%20Model-tiiuae%2FFalcon--Perception-yellow)](https://huggingface.co/tiiuae/Falcon-Perception)
[![Weights](https://img.shields.io/badge/🤗%20ONNX%20Weights-onnx--community%2Ffalcon--perception--onnx--webgpu-blue)](https://huggingface.co/onnx-community/falcon-perception-onnx-webgpu)

## What's inside

- **Detection** — draw bounding boxes for any natural-language query ("athletes", "the runner in front", "mangoes").
- **Segmentation** — pixel-accurate masks via the AnyUp upsampler, all in-browser.
- **Tracker (preview)** — HUD-style reticles on video. Limited by VLM latency between detections; see the in-space disclaimer.

## How it runs

2.4 GB of ONNX weights are fetched once on first visit, then cached by your browser — no backend, no API keys, no network round-trip after load. Multi-threaded WASM is enabled via `coi-serviceworker`.
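
The client-side flow above can be sketched as follows. This is a minimal, hypothetical example of creating an ONNX Runtime Web session with the WebGPU execution provider — the model file name, the `pickExecutionProviders` helper, and the fallback policy are assumptions for illustration, not taken from the Space's actual code:

```javascript
// Sketch (assumed names): configure ONNX Runtime Web to prefer WebGPU,
// falling back to the portable WASM backend when WebGPU is unavailable.
const MODEL_URL =
  "https://huggingface.co/onnx-community/falcon-perception-onnx-webgpu/resolve/main/model.onnx"; // hypothetical file name

function pickExecutionProviders(hasWebGPU) {
  // ONNX Runtime Web tries providers in order, so list the fallback last.
  return hasWebGPU ? ["webgpu", "wasm"] : ["wasm"];
}

const sessionOptions = {
  executionProviders: pickExecutionProviders(
    typeof navigator !== "undefined" && "gpu" in navigator
  ),
};

// In the browser (assumes `onnxruntime-web` is loaded as `ort`):
// const session = await ort.InferenceSession.create(MODEL_URL, sessionOptions);
// const results = await session.run(feeds);
```

Because the weights are served as static files, the browser's HTTP cache handles the "fetched once" behavior with no extra code.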
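
Multi-threaded WASM depends on `SharedArrayBuffer`, which browsers only expose on cross-origin-isolated pages — that isolation is what `coi-serviceworker` provides on a static host. A hedged sketch of picking a thread count (the `wasmThreadCount` helper is mine, not from the Space):

```javascript
// Sketch (assumed helper name): choose a WASM thread count only when the
// page is cross-origin isolated, since SharedArrayBuffer is gated behind
// the COOP/COEP headers that coi-serviceworker injects.
function wasmThreadCount(crossOriginIsolated, hardwareConcurrency) {
  if (!crossOriginIsolated) return 1; // no SharedArrayBuffer -> single-threaded
  // Leave one core for the UI thread, but never drop below 1.
  return Math.max(1, hardwareConcurrency - 1);
}

// In the browser (assumes `onnxruntime-web` is loaded as `ort`):
// ort.env.wasm.numThreads = wasmThreadCount(
//   self.crossOriginIsolated === true,
//   navigator.hardwareConcurrency ?? 1
// );
```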