---
title: Vision Edge
emoji: π
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.9.1
python_version: "3.11"
app_file: app.py
pinned: false
license: mit
tags:
  - object-detection
  - computer-vision
  - mobilenetv3
  - faster-rcnn
  - edge-deployment
short_description: Object detection with MobileNetV3 Faster R-CNN
---
# Vision Edge: Object Detection

Object detection on uploaded images using **torchvision's Faster R-CNN with a
MobileNetV3-Large FPN backbone**, pre-trained on COCO.
## What This Demonstrates

- **Edge-friendly architecture**: MobileNetV3 is designed for mobile and
  edge inference. The MobileNetV3-Large classifier has roughly 5.5M
  parameters, about 4-5× fewer than ResNet-50's roughly 25.6M
- **Pre-trained on COCO**: 80 object categories (torchvision's label map
  has 91 entries, counting background and unused IDs), covering people,
  vehicles, animals, furniture, food, and sports equipment
- **CPU-only inference**: runs on HF's free tier without any GPU
- **Production export pipeline**: the full source repo supports TFLite,
  ONNX, INT8 quantization, and Edge TPU deployment
## How to Use

1. Upload an image or pick an example
2. Adjust the confidence threshold (default 0.5)
3. Click "Run Detection"
4. See annotated output with bounding boxes and per-detection confidence

Inference latency on HF's CPU tier: ~0.5–2 seconds per image.
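
The confidence threshold in step 2 amounts to a score filter over the detector's output. The sketch below is an illustration, not the Space's actual `app.py`: `Detection` and `filter_detections` are hypothetical names, and it assumes detections arrive as parallel lists of boxes, labels, and scores (the same three fields torchvision's detection models return as tensors).

```python
from typing import NamedTuple


class Detection(NamedTuple):
    box: tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
    label: str
    score: float


def filter_detections(boxes, labels, scores, threshold=0.5):
    """Keep detections whose score is at least `threshold`, highest score first."""
    kept = [
        Detection(tuple(box), label, score)
        for box, label, score in zip(boxes, labels, scores)
        if score >= threshold
    ]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Hypothetical raw output for one image (values are made up for illustration).
boxes = [(10, 20, 110, 220), (50, 60, 90, 100), (0, 0, 30, 30)]
labels = ["person", "dog", "cup"]
scores = [0.92, 0.41, 0.67]

for det in filter_detections(boxes, labels, scores, threshold=0.5):
    print(f"{det.label}: {det.score:.2f} at {det.box}")
```

Raising the threshold trades recall for precision: at 0.5 the low-confidence `dog` box in this made-up input is dropped, while lowering it toward 0 keeps every candidate.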
## Architecture

```
┌─────────────────────────────────────┐
│  Image Upload (PIL)                 │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  torchvision Transform              │
│  (resize, normalize)                │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  MobileNetV3-Large FPN Backbone     │
│  (feature extraction)               │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Faster R-CNN Detection Head        │
│  (region proposals + classifier)    │
└──────────────┬──────────────────────┘
               │
               ▼
┌─────────────────────────────────────┐
│  Annotated Image + Detections List  │
└─────────────────────────────────────┘
```
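
The stages in the diagram map onto a handful of torchvision calls. The following is a minimal sketch under stated assumptions, not the Space's actual `app.py`: the function names `load_model`, `to_records`, and `detect` are hypothetical, and it assumes torchvision 0.13 or later (for the `weights=` API). Resizing and normalization happen inside the model's own transform, so the preprocessing preset only converts the PIL image to a float tensor. The torch imports are deferred into the functions that need them so the pure-Python helper can be used on its own.

```python
def load_model():
    """Load the COCO-pretrained detector, its preprocessing, and its label names."""
    from torchvision.models.detection import (
        FasterRCNN_MobileNet_V3_Large_FPN_Weights,
        fasterrcnn_mobilenet_v3_large_fpn,
    )

    weights = FasterRCNN_MobileNet_V3_Large_FPN_Weights.DEFAULT
    model = fasterrcnn_mobilenet_v3_large_fpn(weights=weights)
    model.eval()  # inference mode: the model returns detections, not losses
    return model, weights.transforms(), weights.meta["categories"]


def to_records(boxes, label_ids, scores, categories):
    """Turn parallel raw outputs into readable dicts with class names."""
    return [
        {
            "label": categories[label_id],
            "score": round(score, 3),
            "box": [round(v, 1) for v in box],  # [x_min, y_min, x_max, y_max]
        }
        for box, label_id, score in zip(boxes, label_ids, scores)
    ]


def detect(pil_image, model, preprocess, categories):
    """Run one RGB PIL image through the detector on CPU."""
    import torch

    tensor = preprocess(pil_image)  # PIL image -> float tensor in [0, 1]
    with torch.no_grad():  # no gradients needed for inference
        output = model([tensor])[0]  # dict with "boxes", "labels", "scores"
    return to_records(
        output["boxes"].tolist(),
        output["labels"].tolist(),
        output["scores"].tolist(),
        categories,
    )
```

A call would look like `model, preprocess, categories = load_model()` followed by `detect(Image.open("street.jpg").convert("RGB"), model, preprocess, categories)`, where `street.jpg` is a placeholder path and `Image` comes from Pillow.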
## Edge Deployment Path

The full `vision-edge` pipeline in the source repo additionally supports:

- **TFLite export** for Android / iOS
- **INT8 quantization** with post-training calibration
- **Edge TPU** compilation for Google Coral boards
- **ONNX export** for any ML runtime
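
To make the INT8 bullet concrete, the sketch below works through the arithmetic of affine (asymmetric) post-training quantization in plain Python. It is a generic illustration, not the `vision-edge` repo's implementation: `calibrate`, `quantize`, and `dequantize` are hypothetical names, and the calibration values are made up.

```python
def calibrate(values, qmin=-128, qmax=127):
    """Derive an affine scale and zero point from observed calibration values."""
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to an int8 code, clamping values outside the calibrated range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an int8 code back to an approximate float."""
    return (q - zero_point) * scale


# Made-up activations standing in for a calibration pass.
calibration = [-1.0, 0.0, 0.5, 3.0]
scale, zero_point = calibrate(calibration)

for x in calibration:
    q = quantize(x, scale, zero_point)
    print(f"{x:+.3f} -> {q:4d} -> {dequantize(q, scale, zero_point):+.4f}")
```

For values inside the calibrated range, the round-trip error is bounded by half the scale, which is why the calibration data should cover the range of activations seen at inference time.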
## License

MIT