Upload AnyCalib ONNX exports (FP32 + FP16 + INT8)
Browse files- README.md +103 -0
- config.json +56 -0
- model_fp16.onnx +3 -0
- model_fp32.onnx +3 -0
- model_int8.onnx +3 -0
README.md
ADDED
|
@@ -0,0 +1,103 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
tags:
|
| 3 |
+
- onnx
|
| 4 |
+
- camera-calibration
|
| 5 |
+
- anycalib
|
| 6 |
+
- computer-vision
|
| 7 |
+
- lens-correction
|
| 8 |
+
- wasm
|
| 9 |
+
- onnxruntime-web
|
| 10 |
+
library_name: onnxruntime
|
| 11 |
+
pipeline_tag: image-to-image
|
| 12 |
+
license: apache-2.0
|
| 13 |
+
---
|
| 14 |
+
|
| 15 |
+
# AnyCalib ONNX — Ray Prediction Head
|
| 16 |
+
|
| 17 |
+
ONNX export of the [AnyCalib](https://github.com/javrtg/AnyCalib) ray prediction neural network,
|
| 18 |
+
ready for deployment with ONNX Runtime (Python, C++, Web/WASM, Mobile).
|
| 19 |
+
|
| 20 |
+
## Variants
|
| 21 |
+
|
| 22 |
+
| Variant | File | Size | Use Case |
|
| 23 |
+
|---------|------|------|----------|
|
| 24 |
+
| FP32 | `model_fp32.onnx` | 1222 MB | Maximum accuracy |
|
| 25 |
+
| FP16 | `model_fp16.onnx` | 611 MB | Good accuracy, half memory |
|
| 26 |
+
| INT8 | `model_int8.onnx` | 311 MB | Fastest, smallest, quantized |
|
| 27 |
+
|
| 28 |
+
## Architecture
|
| 29 |
+
|
| 30 |
+
- **Backbone**: DINOv2 ViT-L/14 (304M params)
|
| 31 |
+
- **Decoder**: LightDPT (15.2M params)
|
| 32 |
+
- **Head**: ConvexTangentDecoder (0.6M params)
|
| 33 |
+
- **Source model**: `anycalib_gen`
|
| 34 |
+
|
| 35 |
+
## Usage — Python
|
| 36 |
+
|
| 37 |
+
```python
|
| 38 |
+
import onnxruntime as ort
|
| 39 |
+
import numpy as np
|
| 40 |
+
|
| 41 |
+
sess = ort.InferenceSession("model_fp16.onnx")
|
| 42 |
+
|
| 43 |
+
# RGB [0,1], size must be divisible by 14
|
| 44 |
+
image = np.random.rand(1, 3, 518, 518).astype(np.float32)
|
| 45 |
+
|
| 46 |
+
rays, tangent_coords = sess.run(None, {"image": image})
|
| 47 |
+
# rays: (1, 3, 518, 518) — unit rays per pixel
|
| 48 |
+
# tangent_coords: (1, 2, 518, 518) — tangent space coords
|
| 49 |
+
```
|
| 50 |
+
|
| 51 |
+
## Usage — ONNX Runtime Web (WASM)
|
| 52 |
+
|
| 53 |
+
```javascript
|
| 54 |
+
import * as ort from 'onnxruntime-web';
|
| 55 |
+
|
| 56 |
+
// Use WASM backend
|
| 57 |
+
ort.env.wasm.wasmPaths = 'https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/';
|
| 58 |
+
|
| 59 |
+
const session = await ort.InferenceSession.create('./model_int8.onnx', {
|
| 60 |
+
executionProviders: ['wasm'],
|
| 61 |
+
});
|
| 62 |
+
|
| 63 |
+
// Prepare input: (1, 3, 518, 518) RGB float32
|
| 64 |
+
const imageData = new Float32Array(1 * 3 * 518 * 518);
|
| 65 |
+
// ... fill with normalized RGB data ...
|
| 66 |
+
|
| 67 |
+
const inputTensor = new ort.Tensor('float32', imageData, [1, 3, 518, 518]);
|
| 68 |
+
const results = await session.run({ image: inputTensor });
|
| 69 |
+
|
| 70 |
+
const rays = results.rays; // (1, 3, 518, 518)
|
| 71 |
+
const tangentCoords = results.tangent_coords; // (1, 2, 518, 518)
|
| 72 |
+
```
|
| 73 |
+
|
| 74 |
+
## Usage — Transformers.js
|
| 75 |
+
|
| 76 |
+
```javascript
|
| 77 |
+
import * as ort from 'onnxruntime-web';
import { env } from '@huggingface/transformers';
|
| 78 |
+
|
| 79 |
+
// Disable local model resolution so files are fetched from the Hugging Face Hub
|
| 80 |
+
env.allowLocalModels = false;
|
| 81 |
+
|
| 82 |
+
// Load ONNX model directly
|
| 83 |
+
const session = await ort.InferenceSession.create(
|
| 84 |
+
'https://huggingface.co/SebRincon/anycalib-onnx/resolve/main/model_int8.onnx'
|
| 85 |
+
);
|
| 86 |
+
```
|
| 87 |
+
|
| 88 |
+
## Input/Output Spec
|
| 89 |
+
|
| 90 |
+
- **Input**: `image` — `(B, 3, H, W)` RGB float32 in `[0, 1]`, H and W divisible by 14
|
| 91 |
+
- **Output**: `rays` — `(B, 3, H, W)` unit rays on S^2 manifold
|
| 92 |
+
- **Output**: `tangent_coords` — `(B, 2, H, W)` tangent space coordinates
|
| 93 |
+
|
| 94 |
+
## Note
|
| 95 |
+
|
| 96 |
+
The **Calibrator** (RANSAC + Gauss-Newton camera fitting) is NOT included in the ONNX model.
|
| 97 |
+
It must run as a lightweight CPU post-processing step. See the
|
| 98 |
+
[calibrator implementation](https://github.com/javrtg/AnyCalib) for details.
|
| 99 |
+
|
| 100 |
+
## Related
|
| 101 |
+
|
| 102 |
+
- [AnyCalib Raw](https://huggingface.co/SebRincon/anycalib) — Raw PyTorch weights (safetensors)
|
| 103 |
+
- [AnyCalib Source](https://github.com/javrtg/AnyCalib) — Original repository
|
config.json
ADDED
|
@@ -0,0 +1,56 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
{
|
| 2 |
+
"model_type": "anycalib_ray_head",
|
| 3 |
+
"source_model": "anycalib_gen",
|
| 4 |
+
"description": "AnyCalib ray prediction head (backbone + decoder + head). Calibrator (RANSAC + Gauss-Newton) must run in post-processing.",
|
| 5 |
+
"input": {
|
| 6 |
+
"name": "image",
|
| 7 |
+
"shape": [
|
| 8 |
+
"batch",
|
| 9 |
+
3,
|
| 10 |
+
518,
|
| 11 |
+
518
|
| 12 |
+
],
|
| 13 |
+
"dtype": "float32",
|
| 14 |
+
"range": [
|
| 15 |
+
0.0,
|
| 16 |
+
1.0
|
| 17 |
+
],
|
| 18 |
+
"color": "RGB"
|
| 19 |
+
},
|
| 20 |
+
"outputs": [
|
| 21 |
+
{
|
| 22 |
+
"name": "rays",
|
| 23 |
+
"shape": [
|
| 24 |
+
"batch",
|
| 25 |
+
3,
|
| 26 |
+
518,
|
| 27 |
+
518
|
| 28 |
+
],
|
| 29 |
+
"dtype": "float32"
|
| 30 |
+
},
|
| 31 |
+
{
|
| 32 |
+
"name": "tangent_coords",
|
| 33 |
+
"shape": [
|
| 34 |
+
"batch",
|
| 35 |
+
2,
|
| 36 |
+
518,
|
| 37 |
+
518
|
| 38 |
+
],
|
| 39 |
+
"dtype": "float32"
|
| 40 |
+
}
|
| 41 |
+
],
|
| 42 |
+
"architecture": {
|
| 43 |
+
"backbone": "DINOv2 ViT-L/14 (304M params)",
|
| 44 |
+
"decoder": "LightDPT (15.2M params)",
|
| 45 |
+
"head": "ConvexTangentDecoder (0.6M params)",
|
| 46 |
+
"total_params": "~320M"
|
| 47 |
+
},
|
| 48 |
+
"variants": {
|
| 49 |
+
"fp32": "1222.0 MB",
|
| 50 |
+
"fp16": "611.3 MB",
|
| 51 |
+
"int8": "311.1 MB"
|
| 52 |
+
},
|
| 53 |
+
"opset_version": 17,
|
| 54 |
+
"edge_divisible_by": 14,
|
| 55 |
+
"recommended_input_size": 518
|
| 56 |
+
}
|
model_fp16.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:69aa3a57d91f1726d9daf5b0b2a93a1d28560ffdd68d02b12a3181e1a80f58dc
|
| 3 |
+
size 640943850
|
model_fp32.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:21e64466e38cb0092bc64adee6fc06aebb9529fb22a069ca244c883cd0f751a7
|
| 3 |
+
size 1281329264
|
model_int8.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:06be6e93ca5647c8140a67db87ce42bae335f2913187a9acfb9b83c1bdfe4b4b
|
| 3 |
+
size 326196551
|