diff --git "a/index.html" "b/index.html"
--- "a/index.html"
+++ "b/index.html"
@@ -1,693 +1,1377 @@
- - -Model Zoo — Edge AI Perception Models
+ +- Pre-trained models optimized for edge deployment, validated on real hardware with full-dataset accuracy metrics and per-platform timing breakdowns. Each model repo contains all sizes (nano through x-large) with ONNX FP32, TFLite INT8, and platform-specific compiled formats. -
+ + - + +publisher/hf_publisher/config.py
+ | Repo | Family | Task | Sizes | Nano mAP50 (ONNX) | Nano mAP50-95 (INT8) | Last modified |
|---|---|---|---|---|---|---|
- - Every artifact in the Model Zoo is measured on the same dataset on the same hardware users deploy on. EdgeFirst Studio manages datasets, training, multi-format export, and reference validation; on-target runs happen on a board farm of i.MX 8M Plus, i.MX 95, Ara-240, Hailo, and Jetson devices. Accuracy numbers and per-stage timing are pushed back to Studio session metrics and consumed by this Model Zoo when generating each model card. -
+| Repo | Type | Task | Downloads (30d) | Downloads (all-time) | Likes | Files | Last modified | Created |
|---|---|---|---|---|---|---|---|---|
+
+ - Each training session produces a single set of weights. The export pipeline emits ONNX FP32, INT8 TFLite, and platform-specific compiled formats (i.MX 95 Neutron, NXP Ara-240 DVM, Hailo HEF, Jetson TensorRT). Every output is paired with an on-target validation run that captures both accuracy (COCO/LVIS mAP against the validation set) and full-pipeline timing. The ONNX FP32 run from each training session serves as the reference baseline; quantization and runtime loss are measured relative to it, not relative to externally-published numbers. -
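The baseline comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline's actual code: the `relative_loss` helper and both mAP values are hypothetical and chosen only to show the arithmetic.

```python
def relative_loss(reference_map, candidate_map):
    """Return (absolute mAP drop, drop as a fraction of the reference).

    Hypothetical helper: the reference is the ONNX FP32 run from the same
    training session, not an externally published number.
    """
    drop = reference_map - candidate_map
    return drop, drop / reference_map


# Hypothetical values for illustration only; not measured results.
onnx_fp32_map = 0.500    # reference baseline (ONNX FP32)
tflite_int8_map = 0.480  # quantized export from the same session

drop, fraction = relative_loss(onnx_fp32_map, tflite_int8_map)
print(f"absolute drop: {drop:.3f} mAP, relative drop: {fraction:.1%}")
```

Because both numbers come from the same weights and the same validation set, the difference isolates the effect of quantization and runtime rather than differences in training.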
+
+ - The EdgeFirst Profiler is the on-target agent that drives every validation. Given a model and a dataset, it runs full inference on the target device, captures per-image predictions in the EdgeFirst Arrow/Parquet format, and emits a Perfetto trace alongside the predictions. The Profiler is hardware-aware: it loads each runtime through its native delegate — Verisilicon VX Delegate on i.MX 8M Plus, eIQ Neutron Delegate on i.MX 95, Kinara SDK on Ara-240, HailoRT on RPi5 + Hailo, TensorRT on Jetson — so every stage timed by the trace corresponds to what a deployed application would experience. -
+
- The EdgeFirst Validator is the off-target post-processor. It consumes the Profiler's predictions and Perfetto trace, computes the full 12-metric COCO accuracy tuple via pycocotools (or lvis-api for large-vocabulary datasets), and rebuilds per-stage timing summaries from the trace. Results land in a structured YAML payload attached to the Studio validation session — the same payload the Model Zoo reads to render this page. Accuracy and timing are computed independently of the runtime that produced the predictions, so toolchain regressions surface as cross-platform divergence rather than silent failures.
+
publisher/benchmarks/
+
+ On-target latency / FPS data is pending. Run
+ python -m hf_publisher.dashboard_snapshot
+ locally to pull Studio metrics, or populate
+ publisher/benchmarks/{version}-{task}.json,
+ then re-open this dashboard. The Ultralytics-style accuracy-vs-latency Pareto view will render here.
- The EdgeFirst Hardware Abstraction Layer (HAL) provides the hardware-accelerated primitives used at both validation and deployment time. The Profiler uses HAL for letterbox resize, color-space conversion, normalization, layout conversion, and post-decode (YOLO/ModelPack output decoding, NMS, mask materialisation). HAL automatically selects DMA-BUF, OpenGL ES, NXP G2D, or CPU paths depending on the platform — so the timing measured during validation reflects the same accelerated path a production runtime would take. HAL ships as a Rust crate, a Python package, and a C library under Apache 2.0.
+
+ Studio Validation Sessions
+ snapshot pending · run python -m hf_publisher.dashboard_snapshot
+
- Two complementary timing surfaces are reported per validation: -
-| Surface | What it captures | When it's present |
|---|---|---|
| timing.inline | Per-image preprocess_ms, inference_ms, postprocess_ms with min / mean / median / p95 / p99 / max | Always; this is the universal contract every producer fills in |
| timing.trace | Full per-stage breakdown from the Perfetto trace (typically 25–33 stages including delegate work, tensor moves, decode passes, NMS, parquet flush), plus end-to-end FPS distribution | When the Profiler emits a sidecar trace (almost all runs) |
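As an illustration of the six statistics listed for timing.inline, here is a minimal Python sketch. The `summarize` and `percentile` helpers, the sample latencies, and the nearest-rank percentile rule are all assumptions made for this example; the source does not say how the Profiler computes its percentiles.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile (an assumed method, not confirmed by the source)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples_ms):
    """Return min / mean / median / p95 / p99 / max for one timing stage."""
    return {
        "min": min(samples_ms),
        "mean": statistics.fmean(samples_ms),
        "median": statistics.median(samples_ms),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
        "max": max(samples_ms),
    }


# Hypothetical per-image inference latencies in milliseconds.
inference_ms = [12.0, 12.1, 12.3, 11.9, 12.2, 12.4, 12.0, 12.1, 18.5, 12.2]
summary = summarize(inference_ms)
print(summary)
```

With a single slow outlier in the sample, the mean and the tail percentiles move while the median stays near the typical value, which is why the table reports all six rather than a single average.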
- Measured throughput exceeds the reciprocal of the summed stage latencies because the runtime pipelines I/O, host preprocessing, NPU inference, and decode across frames. Reporting 1000 / (preprocess + inference + postprocess) therefore understates real throughput; the Model Zoo uses the measured end-to-end FPS from the trace (trace.fps.median) as the headline number in every per-target table on every model card. As a concrete example, YOLOv5n on i.MX 95 Neutron has per-stage means of 21.7 ms preprocess + 12.2 ms inference + 15.8 ms postprocess (49.7 ms total, a naive estimate of ~20 FPS), but the measured pipelined throughput is 56 FPS median. The 2.8× gap is the gain that pipelining delivers.
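The YOLOv5n arithmetic can be reproduced with a short Python sketch. The stage means and the 56 FPS median are the figures quoted above; the `naive_fps` helper is a hypothetical name used only for this illustration.

```python
def naive_fps(preprocess_ms, inference_ms, postprocess_ms):
    """Serial estimate: assumes each frame finishes before the next one starts."""
    return 1000.0 / (preprocess_ms + inference_ms + postprocess_ms)


# Per-stage means quoted for YOLOv5n on i.MX 95 Neutron.
naive = naive_fps(21.7, 12.2, 15.8)

# Measured end-to-end median throughput quoted from the trace.
measured_fps = 56.0

speedup = measured_fps / naive
print(f"naive: {naive:.1f} FPS, measured: {measured_fps:.0f} FPS, ratio: {speedup:.1f}x")
```

The serial estimate treats the three stages as strictly sequential for every frame, so it can only be a lower bound once stages for different frames overlap.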
-
research/modelzoo-roadmap.md
+ | Family | ONNX | i.MX 8M Plus | i.MX 95 | Ara240 | Hailo | Jetson |
|---|---|---|---|---|---|---|
| YOLO26 | ✓ | ✓ [1] | ✓ [1] | ✓ | ✓ | ✓ |
| YOLO11 | ✓ | ✓ | ✓ [1] | ✓ | ✓ | ✓ |
| YOLOv8 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| YOLOv5 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
+ Currently shows Nano mAP for the universal ONNX FP32 / TFLite INT8 baselines.
+ Per-accelerator leaderboards will populate as publisher/benchmarks/
+ gains timing entries and target-compiled validation sessions land in Studio.
+
- ✓ indicates a compiled model artifact is published in the corresponding repo. [1] Some (size, platform) combinations are work in progress and are not yet showing a number in the per-repo model card's On-target validation results table. Two reasons cover most cases: timing — larger sizes on slower NPUs are not yet a validation priority and will roll in as bandwidth allows; and accuracy or performance investigations — for example, YOLO11 and YOLO26 on the i.MX 95 Neutron currently show a quantization regression that we are tracking with NXP. In every case the underlying Studio validation session (v-XXXX) remains linked in the model card so its current status can be inspected.
-
ONNX FP32 mAP@0.5 on COCO val2017 (5000 images, 80 classes). Nano size for each family. Source: EdgeFirst Studio validation sessions cited in each model card.
- + + +