vision-edge / README.md

WolfDavid

Initial deploy: MobileNetV3 Faster R-CNN object detection

844ee22 about 2 months ago

metadata

title: Vision Edge
emoji: 👁
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.9.1
python_version: '3.11'
app_file: app.py
pinned: false
license: mit
tags:
  - object-detection
  - computer-vision
  - mobilenetv3
  - faster-rcnn
  - edge-deployment
short_description: Object detection with MobileNetV3 Faster R-CNN

Vision Edge — Object Detection

Object detection using torchvision's Faster R-CNN with a MobileNetV3-Large FPN backbone, pre-trained on COCO.

What This Demonstrates

Edge-friendly architecture: MobileNetV3 is designed for mobile and edge inference; the MobileNetV3-Large backbone has roughly 4-5× fewer parameters than ResNet-50 (about 5.5M vs. 25.6M in torchvision's classification models)
Pre-trained on COCO: 80 object categories (torchvision exposes 91 label indices, which include background and unused IDs), covering people, vehicles, animals, furniture, food, and sports equipment
CPU-only inference: runs on HF's free tier without any GPU
Production export pipeline: the full source repo supports TFLite, ONNX, INT8 quantization, and Edge TPU deployment

How to Use

Upload an image or pick an example
Adjust the confidence threshold (default 0.5)
Click "Run Detection"
See annotated output with bounding boxes and per-detection confidence
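
The confidence threshold acts as a cutoff on detection scores. A minimal sketch in plain Python, assuming detections arrive as parallel lists of boxes, labels, and scores (the shape torchvision detection outputs take once tensors are converted to lists). The name `filter_detections` is hypothetical and not taken from app.py, and whether the app compares with `>=` or `>` is also an assumption:

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def filter_detections(
    boxes: List[Box],
    labels: List[str],
    scores: List[float],
    threshold: float = 0.5,
) -> List[Tuple[Box, str, float]]:
    """Keep detections with score >= threshold, highest score first."""
    kept = [
        (box, label, score)
        for box, label, score in zip(boxes, labels, scores)
        if score >= threshold
    ]
    return sorted(kept, key=lambda det: det[2], reverse=True)


# Illustrative values only; real scores come from the model.
detections = filter_detections(
    boxes=[(10, 20, 110, 220), (30, 40, 90, 100), (0, 0, 50, 50)],
    labels=["person", "dog", "chair"],
    scores=[0.92, 0.41, 0.67],
)
for box, label, score in detections:
    print(f"{label}: {score:.2f} at {box}")
```

Raising the threshold trades recall for precision: fewer boxes are drawn, but the ones that remain are more likely to be correct.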

Inference latency on HF's CPU tier: ~0.5–2 seconds per image.
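
To check the latency figure on other hardware, one approach is to time repeated calls after a warm-up run. A minimal sketch, assuming a hypothetical `run_detection` callable that wraps the model; a stand-in function is used here so the snippet runs on its own:

```python
import statistics
import time
from typing import Callable, List


def measure_latency(fn: Callable[[], object], warmup: int = 1, runs: int = 5) -> List[float]:
    """Return per-call wall-clock latencies in seconds, excluding warm-up calls."""
    for _ in range(warmup):
        fn()  # the first call often pays one-off initialisation costs
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


def run_detection() -> int:
    """Stand-in for a real model call (hypothetical); replace with actual inference."""
    return sum(i * i for i in range(10_000))


timings = measure_latency(run_detection)
print(f"median {statistics.median(timings) * 1000:.2f} ms over {len(timings)} runs")
```

Reporting the median rather than the mean keeps a single slow call from skewing the result.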

Architecture

┌─────────────────────────────────────┐
│      Image Upload (PIL)             │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  torchvision Transform              │
│  (resize, normalize)                │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  MobileNetV3-Large FPN Backbone     │
│  (feature extraction)               │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Faster R-CNN Detection Head        │
│  (region proposals + classifier)    │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Annotated Image + Detections List  │
└─────────────────────────────────────┘
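
The stages in the diagram map onto a few torchvision calls. The sketch below is an assumption about how such a pipeline could be wired, not a copy of app.py: it assumes the standard `fasterrcnn_mobilenet_v3_large_fpn` variant with its default COCO weights, and the helper names `load_detector`, `detect`, and `to_records` are hypothetical. torch and torchvision are imported lazily so the pure-Python helper can be used without them.

```python
def load_detector():
    """Build the pre-trained detector, its input transform, and the label names."""
    from torchvision.models.detection import (
        FasterRCNN_MobileNet_V3_Large_FPN_Weights,
        fasterrcnn_mobilenet_v3_large_fpn,
    )

    weights = FasterRCNN_MobileNet_V3_Large_FPN_Weights.DEFAULT
    model = fasterrcnn_mobilenet_v3_large_fpn(weights=weights)
    model.eval()  # in eval mode the model returns detections instead of losses
    # weights.transforms() converts a PIL image to a float tensor; resizing and
    # normalisation are applied inside the model's own transform.
    return model, weights.transforms(), weights.meta["categories"]


def to_records(boxes, label_ids, scores, categories, threshold=0.5):
    """Convert raw output lists into dicts, dropping low-confidence detections."""
    return [
        {"label": categories[label_id], "score": score, "box": list(box)}
        for box, label_id, score in zip(boxes, label_ids, scores)
        if score >= threshold
    ]


def detect(image, model, preprocess, categories, threshold=0.5):
    """Run one PIL image through the detector and return filtered records."""
    import torch

    with torch.no_grad():  # no gradients are needed for inference
        output = model([preprocess(image)])[0]
    return to_records(
        output["boxes"].tolist(),
        output["labels"].tolist(),
        output["scores"].tolist(),
        categories,
        threshold,
    )
```

A typical call would be `model, preprocess, categories = load_detector()` followed by `detect(Image.open(path).convert("RGB"), model, preprocess, categories)`, where `path` is any local image file; the first call downloads the pre-trained weights.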

Edge Deployment Path

The full vision-edge pipeline in the source repo additionally supports:

TFLite export for Android / iOS
INT8 quantization with post-training calibration
Edge TPU compilation for Google Coral boards
ONNX export for any ML runtime
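
As a rough illustration of what post-training INT8 calibration computes, the sketch below derives an affine scale and zero point from the observed range of some calibration values, then quantizes and dequantizes them. This is the generic asymmetric scheme in plain Python, offered as an assumption about the idea rather than the source repo's implementation; all function names and sample values are hypothetical.

```python
from typing import List, Sequence, Tuple

QMIN, QMAX = -128, 127  # signed 8-bit integer range


def calibrate(values: Sequence[float]) -> Tuple[float, int]:
    """Derive (scale, zero_point) from the min/max of calibration data."""
    lo = min(min(values), 0.0)  # include 0 so that 0.0 maps to an exact integer
    hi = max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0  # degenerate case: every value is zero
    scale = (hi - lo) / (QMAX - QMIN)
    zero_point = round(QMIN - lo / scale)
    return scale, max(QMIN, min(QMAX, zero_point))


def quantize(values: Sequence[float], scale: float, zero_point: int) -> List[int]:
    """Map floats to int8 codes, clamping to the representable range."""
    return [max(QMIN, min(QMAX, round(v / scale) + zero_point)) for v in values]


def dequantize(codes: Sequence[int], scale: float, zero_point: int) -> List[float]:
    """Map int8 codes back to approximate float values."""
    return [(q - zero_point) * scale for q in codes]


calibration = [-1.0, -0.25, 0.0, 0.5, 3.0]
scale, zero_point = calibrate(calibration)
codes = quantize(calibration, scale, zero_point)
restored = dequantize(codes, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(calibration, restored))
print(f"scale={scale:.5f} zero_point={zero_point} max_error={max_error:.5f}")
```

Real toolchains collect these ranges per tensor (or per channel) by running representative images through the model, which is why the quality of the calibration set affects accuracy after quantization.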

License

MIT