
vision-edge / README.md

WolfDavid

Initial deploy: MobileNetV3 Faster R-CNN object detection

844ee22 about 2 months ago


	---
	title: Vision Edge
	emoji: 👁
	colorFrom: green
	colorTo: blue
	sdk: gradio
	sdk_version: 5.9.1
	python_version: "3.11"
	app_file: app.py
	pinned: false
	license: mit
	tags:
	- object-detection
	- computer-vision
	- mobilenetv3
	- faster-rcnn
	- edge-deployment
	short_description: Object detection with MobileNetV3 Faster R-CNN
	---

	# Vision Edge — Object Detection

	Object detection using **torchvision's Faster R-CNN with a
	MobileNetV3-Large FPN backbone**, pre-trained on COCO.

	## What This Demonstrates

	- Edge-friendly architecture: MobileNetV3 is designed for mobile and
	edge inference; the MobileNetV3-Large backbone has roughly 5× fewer
	parameters than ResNet-50 (about 5.5M vs. 25.6M)
	- Pre-trained on COCO: 80 object categories (torchvision's label map has
	91 entries, counting background and unused IDs), including people,
	vehicles, animals, furniture, food, sports equipment
	- CPU-only inference: runs on HF's free tier without any GPU
	- Production export pipeline: the full source repo supports TFLite,
	ONNX, INT8 quantization, and Edge TPU deployment

	## How to Use

	1. Upload an image or pick an example
	2. Adjust the confidence threshold (default 0.5)
	3. Click "Run Detection"
	4. See annotated output with bounding boxes and per-detection confidence

	Inference latency on HF's CPU tier: ~0.5–2 seconds per image.
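
The confidence threshold in step 2 can be pictured as a simple filter over the model's raw output. The sketch below is a minimal illustration, not the code in `app.py`: it assumes detections arrive as parallel lists of boxes, labels, and scores (mirroring the `boxes` / `labels` / `scores` keys that torchvision detection models return), and `Detection` and `filter_detections` are hypothetical names.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels
    label: str
    score: float


def filter_detections(boxes, labels, scores, threshold=0.5):
    """Keep detections scoring at least `threshold`, highest score first."""
    kept = [
        Detection(tuple(box), label, float(score))
        for box, label, score in zip(boxes, labels, scores)
        if score >= threshold
    ]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Made-up values for illustration (not real model output).
boxes = [(10, 20, 110, 220), (5, 5, 50, 50), (30, 40, 90, 160)]
labels = ["person", "dog", "bicycle"]
scores = [0.92, 0.31, 0.58]

for det in filter_detections(boxes, labels, scores, threshold=0.5):
    print(f"{det.label}: {det.score:.2f} at {det.box}")
```

Raising the threshold trades recall for precision: fewer boxes are drawn, but the ones that remain are the detections the model is most sure about.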

	## Architecture

	```
	┌─────────────────────────────────────┐
	│  Image Upload (PIL)                 │
	└──────────────┬──────────────────────┘
	               │
	               ▼
	┌─────────────────────────────────────┐
	│  torchvision Transform              │
	│  (resize, normalize)                │
	└──────────────┬──────────────────────┘
	               │
	               ▼
	┌─────────────────────────────────────┐
	│  MobileNetV3-Large FPN Backbone     │
	│  (feature extraction)               │
	└──────────────┬──────────────────────┘
	               │
	               ▼
	┌─────────────────────────────────────┐
	│  Faster R-CNN Detection Head        │
	│  (region proposals + classifier)    │
	└──────────────┬──────────────────────┘
	               │
	               ▼
	┌─────────────────────────────────────┐
	│  Annotated Image + Detections List  │
	└─────────────────────────────────────┘
	```
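
The stages in the diagram map fairly directly onto torchvision's detection API. The following is a hedged sketch of one way to wire them together, not the contents of `app.py`: `load_detector`, `to_records`, and `detect` are hypothetical helper names, and `example.jpg` is a placeholder path. The torch imports are deferred into the functions so the helpers can be imported without the heavy dependencies installed.

```python
def load_detector():
    """Load the COCO-pretrained detector, its preprocessing, and its label map."""
    from torchvision.models.detection import (
        FasterRCNN_MobileNet_V3_Large_FPN_Weights,
        fasterrcnn_mobilenet_v3_large_fpn,
    )

    weights = FasterRCNN_MobileNet_V3_Large_FPN_Weights.DEFAULT
    model = fasterrcnn_mobilenet_v3_large_fpn(weights=weights)
    model.eval()  # eval mode: the model returns detections instead of losses
    return model, weights.transforms(), weights.meta["categories"]


def to_records(boxes, label_ids, scores, categories):
    """Turn raw output (as plain lists) into readable dicts with class names."""
    return [
        {
            "label": categories[label_id],
            "score": round(score, 3),
            "box": [round(v, 1) for v in box],
        }
        for box, label_id, score in zip(boxes, label_ids, scores)
    ]


def detect(image, model, preprocess, categories):
    """Run one PIL image through the stages shown in the diagram."""
    import torch

    batch = [preprocess(image)]  # PIL image -> float tensor
    with torch.no_grad():  # gradients are not needed for inference
        # The model resizes and normalizes the batch internally before the
        # backbone and detection head run.
        output = model(batch)[0]  # dict with "boxes", "labels", "scores"
    return to_records(
        output["boxes"].tolist(),
        output["labels"].tolist(),
        output["scores"].tolist(),
        categories,
    )


# Usage (requires torch, torchvision, and Pillow; downloads weights on first run):
#   from PIL import Image
#   model, preprocess, categories = load_detector()
#   records = detect(Image.open("example.jpg").convert("RGB"), model, preprocess, categories)
```

Loading the model once and reusing it across requests keeps per-image latency down, since weight loading is far slower than a single forward pass.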

	## Edge Deployment Path

	The full `vision-edge` pipeline in the source repo additionally supports:

	- TFLite export for Android / iOS
	- INT8 quantization with post-training calibration
	- Edge TPU compilation for Google Coral boards
	- ONNX export for any ML runtime
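
To make "INT8 quantization with post-training calibration" more concrete, here is a conceptual sketch of affine (asymmetric) quantization in plain Python. It is not the source repo's export code: the function names are hypothetical, and calibration here simply records the minimum and maximum of a few sample values, whereas real toolchains typically estimate ranges per tensor or per channel over a representative dataset.

```python
def calibrate(samples):
    """Derive (scale, zero_point) for int8 from observed float values."""
    # The range is widened to include 0 so that 0.0 maps exactly to an integer.
    lo = min(min(samples), 0.0)
    hi = max(max(samples), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 if all samples are 0
    zero_point = round(-128 - lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point):
    """Map a float to an int8 value, clamping to [-128, 127]."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to an approximate float."""
    return (q - zero_point) * scale


# Made-up calibration values for illustration.
samples = [-1.0, 0.5, 3.0]
scale, zero_point = calibrate(samples)
for x in samples:
    q = quantize(x, scale, zero_point)
    print(x, "->", q, "->", dequantize(q, scale, zero_point))
```

The round trip loses at most about one quantization step (`scale`) per value, which is why the calibration data should cover the value ranges the model sees in practice.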

	## License

	MIT