---
title: Vision Edge
emoji: 👁
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.9.1
python_version: "3.11"
app_file: app.py
pinned: false
license: mit
tags:
- object-detection
- computer-vision
- mobilenetv3
- faster-rcnn
- edge-deployment
short_description: Object detection with MobileNetV3 Faster R-CNN
---
# Vision Edge: Object Detection
Interactive object detection using **torchvision's Faster R-CNN with a
MobileNetV3-Large FPN backbone**, pre-trained on COCO.
## What This Demonstrates
- **Edge-friendly architecture**: MobileNetV3 is designed for mobile and
edge inference; MobileNetV3-Large has roughly 5× fewer parameters than
ResNet-50 (about 5.4M vs. 25.6M)
- **Pre-trained on COCO**: 80 object categories (91 label IDs in
torchvision's list, counting the background slot and unused placeholders),
including people, vehicles, animals, furniture, food, and sports equipment
- **CPU-only inference**: runs on the Hugging Face free tier without a GPU
- **Production export pipeline**: the full source repo supports TFLite,
ONNX, INT8 quantization, and Edge TPU deployment
## How to Use
1. Upload an image or pick an example
2. Adjust the confidence threshold (default 0.5)
3. Click "Run Detection"
4. See annotated output with bounding boxes and per-detection confidence

Inference latency on the Hugging Face CPU tier: roughly 0.5–2 seconds per image.
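
The confidence threshold in step 2 can be thought of as a simple filter over the detector's scored outputs. The sketch below is illustrative only: `filter_detections` and `format_detection` are hypothetical helper names, the dict layout is an assumption, and whether the app keeps scores exactly equal to the threshold is also assumed.

```python
def filter_detections(detections, threshold=0.5):
    """Keep detections scoring at or above the threshold, highest score first.

    Each detection is assumed to be a dict with "label", "score", and "box"
    keys; this layout is an illustration, not the app's actual data model.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    kept = [d for d in detections if d["score"] >= threshold]
    return sorted(kept, key=lambda d: d["score"], reverse=True)


def format_detection(detection):
    """Render one detection as a short line for the detections list."""
    x1, y1, x2, y2 = detection["box"]
    return (
        f'{detection["label"]}: {detection["score"]:.0%} '
        f"at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})"
    )


if __name__ == "__main__":
    sample = [
        {"label": "dog", "score": 0.42, "box": (10, 20, 110, 220)},
        {"label": "person", "score": 0.91, "box": (0, 0, 50, 100)},
    ]
    for det in filter_detections(sample, threshold=0.5):
        print(format_detection(det))
```

Raising the threshold trades recall for precision: fewer boxes are drawn, but the ones that remain are those the model is more confident about.
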
## Architecture
```
┌─────────────────────────────────────┐
│  Image Upload (PIL)                 │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  torchvision Transform              │
│  (resize, normalize)                │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  MobileNetV3-Large FPN Backbone     │
│  (feature extraction)               │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Faster R-CNN Detection Head        │
│  (region proposals + classifier)    │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Annotated Image + Detections List  │
└─────────────────────────────────────┘
```
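
The pipeline in the diagram can be approximated in a few lines of torchvision. This is a minimal sketch, not the Space's actual `app.py`: the function names `load_detector`, `to_records`, and `detect` are hypothetical, and it assumes torchvision 0.13 or newer (for the `weights=` API). The resize and normalize stage is handled by the model's own built-in transform, so only the PIL-to-tensor conversion is done by hand.

```python
def load_detector():
    """Load the COCO-pretrained detector and its class-name list.

    Imports are local so this module can be imported without torch installed.
    """
    from torchvision.models.detection import (
        FasterRCNN_MobileNet_V3_Large_FPN_Weights,
        fasterrcnn_mobilenet_v3_large_fpn,
    )

    weights = FasterRCNN_MobileNet_V3_Large_FPN_Weights.DEFAULT
    model = fasterrcnn_mobilenet_v3_large_fpn(weights=weights)
    model.eval()  # eval mode: the model returns detections, not training losses
    return model, weights.meta["categories"]


def to_records(boxes, labels, scores, categories, threshold=0.5):
    """Turn raw model output (plain lists) into readable detection records."""
    return [
        {"label": categories[label], "score": score, "box": box}
        for box, label, score in zip(boxes, labels, scores)
        if score >= threshold
    ]


def detect(image, model, categories, threshold=0.5):
    """Run one PIL image through the detector and return filtered records."""
    import torch
    from torchvision.transforms.functional import to_tensor

    # to_tensor scales pixels to [0, 1]; the model's internal transform then
    # resizes and normalizes the batch before the backbone runs.
    tensor = to_tensor(image.convert("RGB"))
    with torch.inference_mode():
        output = model([tensor])[0]
    return to_records(
        output["boxes"].tolist(),
        output["labels"].tolist(),
        output["scores"].tolist(),
        categories,
        threshold,
    )
```

With torch, torchvision, and Pillow installed, a call would look like `model, categories = load_detector()` followed by `detect(some_pil_image, model, categories)`. Loading the model once and reusing it across requests avoids paying the weight-loading cost on every image.
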
## Edge Deployment Path
The full `vision-edge` pipeline in the source repo additionally supports:
- **TFLite export** for Android / iOS
- **INT8 quantization** with post-training calibration
- **Edge TPU** compilation for Google Coral boards
- **ONNX export** for any ML runtime
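
To give a feel for what post-training INT8 calibration computes, the sketch below shows the standard affine quantization arithmetic in plain Python. It is not the source repo's export code, which is not shown here; `calibrate`, `quantize`, and `dequantize` are hypothetical names, and real toolchains apply this per tensor or per channel with their own calibration strategies.

```python
def calibrate(values, qmin=-128, qmax=127):
    """Derive an affine scale and zero point from observed calibration values."""
    # The range is widened to include 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero data
    zero_point = round(qmin - lo / scale)
    return scale, max(qmin, min(qmax, zero_point))


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to its INT8 code, clamping values outside the range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an INT8 code back to an approximate float."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    activations = [-1.0, 0.0, 2.0, 4.1]  # stand-in for calibration samples
    scale, zero_point = calibrate(activations)
    for x in activations:
        q = quantize(x, scale, zero_point)
        print(x, q, dequantize(q, scale, zero_point))
```

Within the calibrated range, the round-trip error is bounded by half the scale, which is why a representative calibration set matters: outliers stretch the range and coarsen every other value.
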
## License
MIT