---
title: Falcon-Perception-0.6B WebGPU
emoji: πŸ¦…
colorFrom: indigo
colorTo: pink
sdk: static
pinned: false
license: apache-2.0
short_description: Open-vocab detection + segmentation, all in the browser
models:
- tiiuae/Falcon-Perception
- onnx-community/falcon-perception-onnx-webgpu
---
# πŸ¦… Falcon-Perception-0.6B WebGPU
A browser demo for **[tiiuae/Falcon-Perception](https://huggingface.co/tiiuae/Falcon-Perception)** β€” a 0.6B open-vocabulary VLM that turns natural-language queries into bounding boxes and pixel-accurate segmentation masks, running fully client-side via WebGPU + ONNX Runtime Web.
[![Model](https://img.shields.io/badge/πŸ€—%20Model-tiiuae%2FFalcon--Perception-yellow)](https://huggingface.co/tiiuae/Falcon-Perception)
[![Weights](https://img.shields.io/badge/πŸ€—%20ONNX%20Weights-onnx--community%2Ffalcon--perception--onnx--webgpu-blue)](https://huggingface.co/onnx-community/falcon-perception-onnx-webgpu)
## What's inside
- **Detection** β€” draw bounding boxes for any natural-language query ("athletes", "the runner in front", "mangoes").
- **Segmentation** β€” pixel-accurate masks via the AnyUp upsampler, all in-browser.
- **Tracker (preview)** β€” HUD-style reticles on video. Limited by VLM latency between detections; see the in-space disclaimer.
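As a rough sketch of how detection results reach the screen: the model returns boxes for a query, and the demo overlays them on a `<canvas>`. The box layout below (normalized `[0,1]` coordinates) is an assumption for illustration, not the model's documented output format; only the Canvas 2D calls in the comment are real browser APIs.

```typescript
// Hypothetical box shape: normalized [0,1] coords, an assumption for
// illustration -- the actual model output layout may differ.
interface Box { x: number; y: number; w: number; h: number }

// Map a normalized box to pixel coordinates for a canvas of the given size.
function toPixels(b: Box, width: number, height: number): Box {
  return { x: b.x * width, y: b.y * height, w: b.w * width, h: b.h * height };
}

// In the browser, drawing would then use the (real) Canvas 2D API, roughly:
//   const ctx = canvas.getContext("2d")!;
//   const p = toPixels(box, canvas.width, canvas.height);
//   ctx.strokeRect(p.x, p.y, p.w, p.h);
```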
## How it runs
2.4 GB of ONNX weights are fetched once on first visit, then cached by your browser — no backend, no API keys, no network round-trips after the initial load. Multi-threaded WASM needs `SharedArrayBuffer`, which browsers only expose under cross-origin isolation; `coi-serviceworker` provides those isolation headers on this static host.
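A minimal sketch of the load path, assuming ONNX Runtime Web's real `InferenceSession.create` API and a single-file model URL (the actual Space may shard or name its weights differently):

```typescript
// Pure helper: prefer WebGPU when available, otherwise fall back to WASM
// (which coi-serviceworker makes multi-threaded via cross-origin isolation).
function pickExecutionProviders(hasWebGPU: boolean): string[] {
  return hasWebGPU ? ["webgpu", "wasm"] : ["wasm"];
}

// Browser-side usage would look roughly like (not run here):
//
//   import * as ort from "onnxruntime-web";
//   const providers = pickExecutionProviders("gpu" in navigator);
//   const session = await ort.InferenceSession.create(modelUrl, {
//     executionProviders: providers,
//   });
```

After the first visit, the browser's HTTP cache serves the weight files locally, so session creation no longer touches the network.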