---
license: apache-2.0
base_model: facebook/dinov2-small
tags:
- vision
- onnx
- int8
- mobile
- flutter
- retrieval
- site-recognition
---
# WakeUp DINOv2-Small INT8 (ONNX)

ONNX INT8 export of [`facebook/dinov2-small`](https://huggingface.co/facebook/dinov2-small) for the **WakeUp** Flutter alarm app's "Travel Mode" → Site Recognition feature.
## Why DINOv2?

DINOv2 is trained with self-supervision, and its features are well suited to **instance-level retrieval**: recognizing the same object across viewpoints and lighting conditions. On the instance-recognition benchmarks reported in the DINOv2 paper, it outperforms CLIP-style models by a wide margin. It is used here for the Site Recognition mode, where the user captures 3-5 photos of a specific object as anchors.
## Files

| File | Size | Purpose |
|---|---|---|
| `dinov2_small_int8.onnx` | ~24 MB | Image feature extraction (CLS token) |
| `model_metadata.json` | n/a | Normalization params, embedding dim |
## Inference

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("dinov2_small_int8.onnx")

# pixel_values: float32 array of shape (1, 3, 224, 224), normalized with the
# DINOv2 ImageNet mean/std (see model_metadata.json)
pixel_values = np.zeros((1, 3, 224, 224), dtype=np.float32)  # placeholder input
embedding = sess.run(None, {"pixel_values": pixel_values})[0]  # shape (1, 384), L2-normalized
```
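
The `pixel_values` input can be prepared roughly as follows. This is a minimal sketch, not the app's actual pipeline: it assumes the image has already been resized and cropped to 224x224 RGB, and it hard-codes the standard ImageNet mean/std. The helper name `preprocess` is hypothetical, and the values in `model_metadata.json` should take precedence if they differ.

```python
import numpy as np

# Standard ImageNet statistics (an assumption here; model_metadata.json is authoritative).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(image_rgb: np.ndarray) -> np.ndarray:
    """Convert a 224x224 RGB uint8 image (H, W, 3) into a (1, 3, 224, 224) float32 tensor."""
    if image_rgb.shape != (224, 224, 3):
        raise ValueError(f"expected shape (224, 224, 3), got {image_rgb.shape}")
    x = image_rgb.astype(np.float32) / 255.0   # scale pixel values to [0, 1]
    x = (x - IMAGENET_MEAN) / IMAGENET_STD     # per-channel normalization
    x = np.transpose(x, (2, 0, 1))             # HWC -> CHW
    return np.ascontiguousarray(x[np.newaxis, ...])  # add the batch dimension
```

The returned array can be passed directly as the `pixel_values` entry of the input dictionary given to `sess.run`.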

Anchors and scan results are compared via plain cosine similarity (dot product, since both are unit-norm).
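
The comparison described above can be sketched as a single matrix-vector product. The function name `best_anchor_match` and the `threshold` value of 0.6 are hypothetical and chosen only for illustration; the acceptance threshold used by the app is not documented here and would need to be tuned on real data.

```python
import numpy as np


def best_anchor_match(query, anchors, threshold=0.6):
    """Return (index, score, accepted) for the anchor most similar to the query.

    query:     unit-norm embedding, shape (D,) or (1, D)
    anchors:   unit-norm anchor embeddings, shape (N, D)
    threshold: illustrative cut-off, not a value taken from the app
    """
    query = np.asarray(query, dtype=np.float32).reshape(-1)
    anchors = np.asarray(anchors, dtype=np.float32)
    scores = anchors @ query          # dot product equals cosine similarity for unit-norm vectors
    best = int(np.argmax(scores))
    score = float(scores[best])
    return best, score, score >= threshold


# Toy example with 4-dimensional unit vectors standing in for 384-dimensional embeddings.
anchors = np.eye(4, dtype=np.float32)[:3]            # three "anchor photos"
query = np.array([0.0, 1.0, 0.0, 0.0], dtype=np.float32)
index, score, accepted = best_anchor_match(query, anchors)
```

Taking the maximum over anchors means a scan matches if it resembles any one of the captured views, which suits a small set of 3-5 photos taken from different angles.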

## License

Apache 2.0 (inherited from the base model).