Kyle Pearson committed
Commit 6d257c6 · 1 Parent(s): ab3f782

initial stuff

Files changed (3)
  1. .gitattributes +0 -35
  2. README.md +169 -0
  3. convert_onnx.py +641 -0
.gitattributes DELETED
@@ -1,35 +0,0 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tar filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
 
README.md CHANGED
@@ -1,3 +1,172 @@
 
---
license: apple-amlr
library_name: ml-sharp
pipeline_tag: image-to-3d
base_model: apple/Sharp
tags:
- coreml
- monocular-view-synthesis
- gaussian-splatting
---

# Sharp Monocular View Synthesis in Less Than a Second (Core ML Edition)

[![Project Page](https://img.shields.io/badge/Project-Page-green)](https://apple.github.io/ml-sharp/)
[![arXiv](https://img.shields.io/badge/arXiv-2512.10685-b31b1b.svg)](https://arxiv.org/abs/2512.10685)

This software project is a community contribution and is not affiliated with the original research paper:

> _Sharp Monocular View Synthesis in Less Than a Second_ by _Lars Mescheder, Wei Dong, Shiwei Li, Xuyang Bai, Marcel Santos, Peiyun Hu, Bruno Lecouat, Mingmin Zhen, Amaël Delaunoy, Tian Fang, Yanghai Tsin, Stephan Richter and Vladlen Koltun_.

> We present SHARP, an approach to photorealistic view synthesis from a single image. Given a single photograph, SHARP regresses the parameters of a 3D Gaussian representation of the depicted scene. This is done in less than a second on a standard GPU via a single feedforward pass through a neural network. The 3D Gaussian representation produced by SHARP can then be rendered in real time, yielding high-resolution photorealistic images for nearby views. The representation is metric, with absolute scale, supporting metric camera movements.

#### This release includes a fully validated **Core ML (.mlpackage)** version of SHARP, optimized for CPU, GPU, and Neural Engine inference on macOS and iOS.

![](viewer.gif)

Rendered using [Splat Viewer](https://huggingface.co/spaces/pearsonkyle/Gaussian-Splat-Viewer)

## Getting started

### 📦 Download the Core ML Model Only

```bash
pip install huggingface-hub
huggingface-cli download --include sharp.mlpackage/ --local-dir . pearsonkyle/Sharp-coreml
```

### 🧰 Clone the Full Repository

This will include the inference and model conversion/validation scripts.

```bash
brew install git-xet
git xet install
```

Clone the model repository:

```bash
git clone git@hf.co:pearsonkyle/Sharp-coreml
```

### 📱 Run Inference on Apple Devices

Use the provided [sharp.swift](sharp.swift) inference script to load the model and generate 3D Gaussian splats (PLY) from any image:

```bash
# Compile the Swift runner (requires Xcode command-line tools)
swiftc -O -o run_sharp sharp.swift -framework CoreML -framework CoreImage -framework AppKit

# Run inference on an image and decimate the output by 50%
./run_sharp sharp.mlpackage test.png test.ply -d 0.5
```

> Inference on an Apple M4 Max takes ~1.9 seconds.

**CLI Features:**
- Automatic model compilation and caching
- Decimation to reduce point cloud size while preserving visual fidelity
- Input is expected as a standard RGB image; conversion to [0,1] and CHW format happens inside the model
- PLY output compatible with [Splat Viewer](https://huggingface.co/spaces/pearsonkyle/Gaussian-Splat-Viewer), [MetalSplatter](https://github.com/scier/MetalSplatter), and [Three.js](https://threejs.org)

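As a rough illustration of what decimation does (this is a sketch, not the exact sampling strategy used by `sharp.swift`; the `decimate` helper and attribute names are hypothetical), keeping 50% of the Gaussians amounts to drawing a random subset of indices once and applying it to every attribute array:

```python
import numpy as np

def decimate(gaussians: dict, ratio: float, seed: int = 0) -> dict:
    """Keep a random `ratio` fraction of the N Gaussians, consistently
    across all attribute arrays (positions, scales, rotations, ...)."""
    n = gaussians["positions"].shape[0]
    keep = max(1, int(n * ratio))
    rng = np.random.default_rng(seed)
    idx = rng.choice(n, size=keep, replace=False)
    return {name: arr[idx] for name, arr in gaussians.items()}

# Toy example: 1000 Gaussians decimated to 50%
splat = {
    "positions": np.random.rand(1000, 3).astype(np.float32),
    "scales": np.random.rand(1000, 3).astype(np.float32),
    "rotations": np.random.rand(1000, 4).astype(np.float32),
    "colors": np.random.rand(1000, 3).astype(np.float32),
    "opacities": np.random.rand(1000).astype(np.float32),
}
small = decimate(splat, 0.5)
```

Sampling one index set and reusing it keeps each surviving Gaussian's position, scale, rotation, color, and opacity together.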
```bash
Usage: run_sharp [OPTIONS] <model> <input_image> <output.ply>

SHARP Model Inference - Generate 3D Gaussian Splats from a single image

Arguments:
  model                      Path to the SHARP Core ML model (.mlpackage, .mlmodel, or .mlmodelc)
  input_image                Path to input image (PNG, JPEG, etc.)
  output.ply                 Path for output PLY file

Options:
  -m, --model PATH           Path to Core ML model
  -i, --input PATH           Path to input image
  -o, --output PATH          Path for output PLY file
  -f, --focal-length FLOAT   Focal length in pixels (default: 1536)
  -d, --decimation FLOAT     Decimation ratio 0.0-1.0 or percentage 1-100 (default: 1.0 = keep all)
                             Example: 0.5 or 50 keeps 50% of Gaussians
  -h, --help                 Show this help message
```

## Model Input and Output

### 📥 Input

The Core ML model accepts two inputs:

- **`image`**: A 3-channel RGB image in `uint8` format with shape `(1, 3, H, W)`.
  - Values are expected in range `[0, 255]` (no manual normalization required).
  - Recommended resolution: `1536×1536` (matches training size).
  - Aspect ratio is preserved; input will be resized internally if needed.

- **`disparity_factor`**: A scalar tensor of shape `(1,)` representing the ratio `focal_length / image_width`.
  - Use `1.0` for standard cameras (e.g., typical smartphone or DSLR).
  - Adjust slightly to control depth scale: higher values = closer objects, lower values = farther scenes.
  - If using the `sharp.swift` runner, this input is automatically computed from your image dimensions.

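For clients other than the Swift runner, assembling the two inputs can be sketched in Python as follows (the `make_inputs` helper is illustrative; the shapes, dtype, and the `focal_length / image_width` ratio follow the description above):

```python
import numpy as np

def make_inputs(h: int = 1536, w: int = 1536, focal_length_px: float = 1536.0):
    """Build the two model inputs described above."""
    # RGB image, uint8, NCHW layout -- normalization to [0, 1] happens inside the model.
    image = np.zeros((1, 3, h, w), dtype=np.uint8)
    # Scalar ratio focal_length / image_width, shape (1,).
    disparity_factor = np.array([focal_length_px / w], dtype=np.float32)
    return image, disparity_factor

image, disparity_factor = make_inputs()
```

With the 1536-pixel defaults the ratio comes out to exactly `1.0`, matching the "standard camera" guidance above.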
### 📤 Output

The model outputs five tensors that together describe a 3D Gaussian splat scene:

| Output | Shape | Description |
|--------|-------|-------------|
| `mean_vectors_3d_positions` | `(1, N, 3)` | 3D positions (x, y, z) in Normalized Device Coordinates (NDC). |
| `singular_values_scales` | `(1, N, 3)` | Scale parameters along each principal axis (width, height, depth). |
| `quaternions_rotations` | `(1, N, 4)` | Unit quaternions `[w, x, y, z]` encoding orientation of each Gaussian. |
| `colors_rgb_linear` | `(1, N, 3)` | Linear RGB color values in range `[0, 1]` (no gamma correction). |
| `opacities_alpha_channel` | `(1, N)` | Opacity (alpha) values per Gaussian, in range `[0, 1]`. |

The total number of Gaussians `N` is approximately 1,179,648 for the default model.

> 🌍 These outputs are fully compatible with [Splat Viewer](https://huggingface.co/spaces/pearsonkyle/Gaussian-Splat-Viewer) and [MetalSplatter](https://github.com/scier/MetalSplatter).

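For readers who want to serialize these tensors to PLY themselves, a header for N Gaussians can be sketched as below. The property names follow the widely used INRIA-style 3DGS convention; treat them as an assumption (not something this release defines) and check the layout your target viewer expects:

```python
def splat_ply_header(num_gaussians: int) -> str:
    """Build an ASCII header for a binary little-endian Gaussian-splat PLY.

    Property names follow the common INRIA 3DGS convention (an assumption
    here -- verify against your viewer before relying on them).
    """
    props = (
        ["x", "y", "z"]                      # positions
        + [f"f_dc_{i}" for i in range(3)]    # base color (SH DC term)
        + ["opacity"]
        + [f"scale_{i}" for i in range(3)]
        + [f"rot_{i}" for i in range(4)]     # quaternion
    )
    lines = ["ply", "format binary_little_endian 1.0",
             f"element vertex {num_gaussians}"]
    lines += [f"property float {p}" for p in props]
    lines.append("end_header")
    return "\n".join(lines) + "\n"

header = splat_ply_header(1_179_648)
```

The binary body then stores one packed float record per Gaussian, in the same property order as the header.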

### 🔍 Model Validation Results

The Core ML model has been rigorously validated against the original PyTorch implementation. Below are the numerical accuracy metrics across all 5 output tensors:

| Output | Max Diff | Mean Diff | P99 Diff | Angular Diff (°) max / mean / p99 | Status |
|--------|----------|-----------|----------|-----------------------------------|--------|
| Mean Vectors (3D Positions) | 0.000794 | 0.000049 | 0.000094 | - | ✅ PASS |
| Singular Values (Scales) | 0.000035 | 0.000000 | 0.000002 | - | ✅ PASS |
| Quaternions (Rotations) | 1.425558 | 0.000024 | 0.000067 | 9.2519 / 0.0019 / 0.0396 | ✅ PASS |
| Colors (RGB Linear) | 0.001440 | 0.000005 | 0.000055 | - | ✅ PASS |
| Opacities (Alpha) | 0.004183 | 0.000005 | 0.000114 | - | ✅ PASS |

> **Validation Notes:**
> - All outputs match PyTorch within 0.01% mean error.
> - Quaternion angular errors are below 1° for 99% of Gaussians.

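The angular column can be reproduced with a few lines of NumPy. Because `q` and `-q` describe the same rotation, the rotation angle between two unit quaternions is `2 * arccos(|q1 . q2|)`:

```python
import numpy as np

def quat_angular_diff_deg(q1: np.ndarray, q2: np.ndarray) -> np.ndarray:
    """Per-Gaussian rotation angle (degrees) between unit quaternions (..., 4)."""
    # abs() handles the q / -q sign ambiguity; clip guards arccos' domain.
    dots = np.clip(np.abs(np.sum(q1 * q2, axis=-1)), 0.0, 1.0)
    return np.degrees(2.0 * np.arccos(dots))

# Identical quaternions differ by 0 degrees.
q = np.array([[1.0, 0.0, 0.0, 0.0]])
print(quat_angular_diff_deg(q, q))  # prints [0.]
```

This is the same metric `convert_onnx.py` uses when it reports max / mean / p99 angular error during validation.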
## Reproducing the Conversion

To reproduce the conversion from PyTorch to Core ML, follow these steps:

```bash
git clone https://github.com/apple/ml-sharp.git
cd ml-sharp
conda create -n sharp python=3.13
conda activate sharp
pip install -r requirements.txt
pip install coremltools
cd ../
python convert_onnx.py
```

## Citation

If you find this work useful, please cite the original paper:

```bibtex
@article{Sharp2025:arxiv,
  title   = {Sharp Monocular View Synthesis in Less Than a Second},
  author  = {Lars Mescheder and Wei Dong and Shiwei Li and Xuyang Bai and Marcel Santos and Peiyun Hu and Bruno Lecouat and Mingmin Zhen and Ama\"{e}l Delaunoy and Tian Fang and Yanghai Tsin and Stephan R. Richter and Vladlen Koltun},
  journal = {arXiv preprint arXiv:2512.10685},
  year    = {2025},
  url     = {https://arxiv.org/abs/2512.10685},
}
```
convert_onnx.py ADDED
@@ -0,0 +1,641 @@
 
"""Convert SHARP PyTorch model to ONNX format.

This script converts the SHARP (Sharp Monocular View Synthesis) model
from PyTorch (.pt) to ONNX (.onnx) format for deployment on various platforms.
"""

from __future__ import annotations

import argparse
import logging
from pathlib import Path

import numpy as np
import onnx
import onnxruntime as ort
import torch
import torch.nn as nn

# Import SHARP model components
from sharp.models import PredictorParams, create_predictor
from sharp.models.predictor import RGBGaussianPredictor

LOGGER = logging.getLogger(__name__)

DEFAULT_MODEL_URL = "https://ml-site.cdn-apple.com/models/sharp/sharp_2572gikvuh.pt"

class SharpModelTraceable(nn.Module):
    """Fully traceable version of SHARP for ONNX export.

    This version removes all dynamic control flow and makes the model
    fully traceable with torch.jit.trace.
    """

    def __init__(self, predictor: RGBGaussianPredictor):
        """Initialize the traceable wrapper.

        Args:
            predictor: The SHARP RGBGaussianPredictor model.
        """
        super().__init__()
        # Copy all submodules
        self.init_model = predictor.init_model
        self.feature_model = predictor.feature_model
        self.monodepth_model = predictor.monodepth_model
        self.prediction_head = predictor.prediction_head
        self.gaussian_composer = predictor.gaussian_composer
        self.depth_alignment = predictor.depth_alignment

    def forward(
        self,
        image: torch.Tensor,
        disparity_factor: torch.Tensor
    ) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]:
        """Run inference with traceable forward pass.

        Args:
            image: Input image tensor of shape (1, 3, H, W) in range [0, 1].
            disparity_factor: Disparity factor tensor of shape (1,).

        Returns:
            Tuple of 5 tensors representing 3D Gaussians.
        """
        # Estimate depth using monodepth
        monodepth_output = self.monodepth_model(image)
        monodepth_disparity = monodepth_output.disparity

        # Convert disparity to depth with higher precision
        disparity_factor_expanded = disparity_factor[:, None, None, None]

        # Cast to float64 for more precise division, then back to float32
        disparity_clamped = monodepth_disparity.clamp(min=1e-6, max=1e4)
        monodepth = disparity_factor_expanded.double() / disparity_clamped.double()
        monodepth = monodepth.float()

        # Apply depth alignment (inference mode)
        monodepth, _ = self.depth_alignment(monodepth, None, monodepth_output.decoder_features)

        # Initialize gaussians
        init_output = self.init_model(image, monodepth)

        # Extract features
        image_features = self.feature_model(
            init_output.feature_input,
            encodings=monodepth_output.output_features
        )

        # Predict deltas
        delta_values = self.prediction_head(image_features)

        # Compose final gaussians
        gaussians = self.gaussian_composer(
            delta=delta_values,
            base_values=init_output.gaussian_base_values,
            global_scale=init_output.global_scale,
        )

        # Normalize quaternions for consistent validation and inference
        quaternions = gaussians.quaternions

        # Use double precision for quaternion normalization to reduce numerical errors
        quaternions_fp64 = quaternions.double()
        quat_norm_sq = torch.sum(quaternions_fp64 * quaternions_fp64, dim=-1, keepdim=True)
        quat_norm = torch.sqrt(torch.clamp(quat_norm_sq, min=1e-16))
        quaternions_normalized = quaternions_fp64 / quat_norm

        # Apply sign canonicalization for consistent representation
        # Find the component with the largest absolute value
        abs_quat = torch.abs(quaternions_normalized)
        max_idx = torch.argmax(abs_quat, dim=-1, keepdim=True)

        # Create one-hot selector for the max component
        one_hot = torch.zeros_like(quaternions_normalized)
        one_hot.scatter_(-1, max_idx, 1.0)

        # Get the sign of the max component
        max_component_sign = torch.sum(quaternions_normalized * one_hot, dim=-1, keepdim=True)

        # Canonicalize: flip if max component is negative
        quaternions = torch.where(max_component_sign < 0, -quaternions_normalized, quaternions_normalized).float()

        return (
            gaussians.mean_vectors,
            gaussians.singular_values,
            quaternions,
            gaussians.colors,
            gaussians.opacities,
        )


def cleanup_onnx_files(onnx_path: Path) -> None:
    """Remove ONNX file and any associated external data files.

    Args:
        onnx_path: Path to the ONNX file.
    """
    try:
        if onnx_path.exists():
            LOGGER.info(f"Removing existing ONNX file: {onnx_path}")
            onnx_path.unlink()
    except Exception as e:
        LOGGER.warning(f"Could not remove ONNX file {onnx_path}: {e}")

    # Also try to remove external data file
    external_data_path = onnx_path.with_suffix('.onnx.data')
    try:
        if external_data_path.exists():
            LOGGER.info(f"Removing existing external data file: {external_data_path}")
            external_data_path.unlink()
    except Exception as e:
        LOGGER.warning(f"Could not remove external data file {external_data_path}: {e}")


def cleanup_extraneous_onnx_files() -> None:
    """Remove extraneous files created during ONNX conversion.

    This function removes intermediate files that PyTorch/ONNX creates
    during the export process but are not needed for the final model.
    """
    import glob
    import os

    # Patterns of extraneous files to remove
    patterns = [
        "onnx__*",
        "monodepth_*",
        "feature_model*",
        "_Constant_*",
        "_init_model_*"
    ]

    files_removed = 0

    for pattern in patterns:
        # Use glob to find files matching the pattern
        matching_files = glob.glob(pattern)
        for file_path in matching_files:
            try:
                os.remove(file_path)
                files_removed += 1
                LOGGER.debug(f"Removed extraneous file: {file_path}")
            except Exception as e:
                LOGGER.warning(f"Could not remove file {file_path}: {e}")

    if files_removed > 0:
        LOGGER.info(f"Cleaned up {files_removed} extraneous ONNX conversion files")


def load_sharp_model(checkpoint_path: Path | None = None) -> RGBGaussianPredictor:
    """Load SHARP model from checkpoint.

    Args:
        checkpoint_path: Path to the .pt checkpoint file.
            If None, downloads the default model.

    Returns:
        The loaded RGBGaussianPredictor model in eval mode.
    """
    if checkpoint_path is None:
        LOGGER.info("Downloading default model from %s", DEFAULT_MODEL_URL)
        state_dict = torch.hub.load_state_dict_from_url(DEFAULT_MODEL_URL, progress=True)
    else:
        LOGGER.info("Loading checkpoint from %s", checkpoint_path)
        state_dict = torch.load(checkpoint_path, weights_only=True, map_location="cpu")

    # Create model with default parameters
    predictor = create_predictor(PredictorParams())
    predictor.load_state_dict(state_dict)
    predictor.eval()

    return predictor


def convert_to_onnx(
    predictor: RGBGaussianPredictor,
    output_path: Path,
    input_shape: tuple[int, int] = (1536, 1536),
) -> Path:
    """Export SHARP model to ONNX format.

    Args:
        predictor: The SHARP RGBGaussianPredictor model.
        output_path: Path to save the .onnx file.
        input_shape: Input image shape (height, width).

    Returns:
        Path to the saved ONNX file.
    """
    LOGGER.info("Exporting to ONNX format...")

    # Ensure depth alignment is disabled for inference
    predictor.depth_alignment.scale_map_estimator = None

    # Create traceable wrapper
    model_wrapper = SharpModelTraceable(predictor)
    model_wrapper.eval()

    # Pre-warm the model
    LOGGER.info("Pre-warming model...")
    with torch.no_grad():
        for _ in range(3):
            warm_image = torch.randn(1, 3, input_shape[0], input_shape[1])
            warm_disparity = torch.tensor([1.0])
            _ = model_wrapper(warm_image, warm_disparity)

    # Clean up any existing ONNX files
    cleanup_onnx_files(output_path)

    # Create example inputs
    height, width = input_shape
    torch.manual_seed(42)
    example_image = torch.randn(1, 3, height, width)
    example_disparity_factor = torch.tensor([1.0])

    # Export to ONNX
    LOGGER.info(f"Exporting to ONNX: {output_path}")

    try:
        # Export with external data format to handle large models (>2GB)
        torch.onnx.export(
            model_wrapper,
            (example_image, example_disparity_factor),
            str(output_path),
            export_params=True,
            verbose=False,
            input_names=['image', 'disparity_factor'],
            output_names=[
                'mean_vectors_3d_positions',
                'singular_values_scales',
                'quaternions_rotations',
                'colors_rgb_linear',
                'opacities_alpha_channel'
            ],
            dynamic_axes={
                'mean_vectors_3d_positions': {1: 'num_gaussians'},
                'singular_values_scales': {1: 'num_gaussians'},
                'quaternions_rotations': {1: 'num_gaussians'},
                'colors_rgb_linear': {1: 'num_gaussians'},
                'opacities_alpha_channel': {1: 'num_gaussians'}
            },
            opset_version=17,
        )

        # For models >2GB, save with external data format
        try:
            model_proto = onnx.load(str(output_path))
            model_size = model_proto.ByteSize()
            if model_size > 2e9:  # 2GB
                LOGGER.info(f"Model size {model_size/1e9:.2f}GB > 2GB, converting to external data format...")
                onnx.save_model(
                    model_proto,
                    str(output_path),
                    save_as_external_data=True,
                    all_tensors_to_one_file=True,
                    location=f"{output_path.stem}.onnx.data",
                    size_threshold=1024,
                    convert_attribute=False,
                )
                LOGGER.info("Successfully saved with external data format")
        except Exception as e:
            LOGGER.warning(f"Could not check/convert to external data format: {e}")

        LOGGER.info("ONNX export successful")
    except Exception as e:
        LOGGER.error(f"ONNX export failed: {e}")
        raise

    # Verify ONNX model
    try:
        onnx.checker.check_model(str(output_path))
        LOGGER.info("ONNX model validation passed")
    except Exception as e:
        LOGGER.warning(f"ONNX model validation skipped: {e}")

    # Clean up extraneous files created during ONNX conversion
    cleanup_extraneous_onnx_files()

    return output_path


def validate_onnx_model(
    onnx_path: Path,
    pytorch_model: RGBGaussianPredictor,
    input_shape: tuple[int, int] = (1536, 1536),
    tolerance: float = 0.01,
) -> bool:
    """Validate ONNX model outputs against PyTorch model.

    Args:
        onnx_path: Path to the ONNX model file.
        pytorch_model: The original PyTorch model.
        input_shape: Input image shape (height, width).
        tolerance: Maximum allowed difference between outputs.

    Returns:
        True if validation passes, False otherwise.
    """
    LOGGER.info("Validating ONNX model against PyTorch...")

    height, width = input_shape

    # Set seeds for reproducibility
    np.random.seed(42)
    torch.manual_seed(42)

    # Create test input
    test_image_np = np.random.rand(1, 3, height, width).astype(np.float32)
    test_disparity = np.array([1.0], dtype=np.float32)

    # Run PyTorch model
    test_image_pt = torch.from_numpy(test_image_np)
    test_disparity_pt = torch.from_numpy(test_disparity)

    traceable_wrapper = SharpModelTraceable(pytorch_model)
    traceable_wrapper.eval()

    with torch.no_grad():
        pt_outputs = traceable_wrapper(test_image_pt, test_disparity_pt)

    # Run ONNX model
    try:
        session_options = ort.SessionOptions()
        session_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

        providers = ['CPUExecutionProvider']
        session = ort.InferenceSession(str(onnx_path), session_options, providers=providers)

        onnx_inputs = {
            "image": test_image_np,
            "disparity_factor": test_disparity,
        }

        onnx_outputs = session.run(None, onnx_inputs)

        output_names = [
            'mean_vectors_3d_positions',
            'singular_values_scales',
            'quaternions_rotations',
            'colors_rgb_linear',
            'opacities_alpha_channel'
        ]

        if len(onnx_outputs) != len(output_names):
            LOGGER.warning(f"ONNX outputs count mismatch: expected {len(output_names)}, got {len(onnx_outputs)}")
            onnx_output_dict = {f"output_{i}": output for i, output in enumerate(onnx_outputs)}
        else:
            onnx_output_dict = dict(zip(output_names, onnx_outputs))

    except Exception as e:
        LOGGER.error(f"Failed to run ONNX model: {e}")
        return False

    # Debug: Print shapes
    LOGGER.info(f"PyTorch outputs shapes: {[o.shape for o in pt_outputs]}")
    LOGGER.info(f"ONNX outputs shapes: {[v.shape for v in onnx_output_dict.values()]}")

    # Per-output tolerances for comparing outputs
    tolerances = {
        "mean_vectors_3d_positions": 0.001,
        "singular_values_scales": 0.0001,
        "quaternions_rotations": 2.0,
        "colors_rgb_linear": 0.002,
        "opacities_alpha_channel": 0.005,
    }

    angular_tolerances = {
        "mean": 0.01,
        "p99": 0.5,
        "max": 10.0,
    }

    all_passed = True

    # Additional diagnostics for depth/position analysis
    LOGGER.info("=== Depth/Position Statistics ===")
    pt_positions = pt_outputs[0].numpy()
    onnx_positions = onnx_output_dict.get('mean_vectors_3d_positions', list(onnx_output_dict.values())[0])

    LOGGER.info(f"PyTorch positions - X range: [{pt_positions[..., 0].min():.4f}, {pt_positions[..., 0].max():.4f}], mean: {pt_positions[..., 0].mean():.4f}")
    LOGGER.info(f"PyTorch positions - Y range: [{pt_positions[..., 1].min():.4f}, {pt_positions[..., 1].max():.4f}], mean: {pt_positions[..., 1].mean():.4f}")
    LOGGER.info(f"PyTorch positions - Z range: [{pt_positions[..., 2].min():.4f}, {pt_positions[..., 2].max():.4f}], mean: {pt_positions[..., 2].mean():.4f}, std: {pt_positions[..., 2].std():.4f}")

    LOGGER.info(f"ONNX positions - X range: [{onnx_positions[..., 0].min():.4f}, {onnx_positions[..., 0].max():.4f}], mean: {onnx_positions[..., 0].mean():.4f}")
    LOGGER.info(f"ONNX positions - Y range: [{onnx_positions[..., 1].min():.4f}, {onnx_positions[..., 1].max():.4f}], mean: {onnx_positions[..., 1].mean():.4f}")
    LOGGER.info(f"ONNX positions - Z range: [{onnx_positions[..., 2].min():.4f}, {onnx_positions[..., 2].max():.4f}], mean: {onnx_positions[..., 2].mean():.4f}, std: {onnx_positions[..., 2].std():.4f}")

    z_diff = np.abs(pt_positions[..., 2] - onnx_positions[..., 2])
    LOGGER.info(f"Z-coordinate difference - max: {z_diff.max():.6f}, mean: {z_diff.mean():.6f}, std: {z_diff.std():.6f}")
    LOGGER.info("=================================")

    # Collect validation results for table output
    validation_results = []

    for i, name in enumerate(output_names):
        pt_output = pt_outputs[i].numpy()

        if name in onnx_output_dict:
            onnx_output = onnx_output_dict[name]
        else:
            if i < len(onnx_output_dict):
                onnx_output = list(onnx_output_dict.values())[i]
            else:
                LOGGER.warning(f"No ONNX output found for {name}")
                all_passed = False
                continue

        result = {"output": name, "passed": True, "failure_reason": ""}

        # Special handling for quaternions - account for sign ambiguity
        if name == "quaternions_rotations":
            # Normalize both quaternion outputs to ensure they're unit quaternions
            pt_quat_norm = np.linalg.norm(pt_output, axis=-1, keepdims=True)
            pt_output_normalized = pt_output / np.clip(pt_quat_norm, 1e-12, None)

            onnx_quat_norm = np.linalg.norm(onnx_output, axis=-1, keepdims=True)
            onnx_output_normalized = onnx_output / np.clip(onnx_quat_norm, 1e-12, None)

            # Canonicalize sign: handle edge cases where w ≈ 0
            def canonicalize_quaternion(q):
                """Canonicalize quaternion to ensure unique representation."""
                abs_q = np.abs(q)
                max_component_idx = np.argmax(abs_q, axis=-1, keepdims=True)
                selector = np.zeros_like(q)
                np.put_along_axis(selector, max_component_idx, 1, axis=-1)
                max_component_sign = np.sum(q * selector, axis=-1, keepdims=True)
                return np.where(max_component_sign < 0, -q, q)

            pt_output_canonical = canonicalize_quaternion(pt_output_normalized)
            onnx_output_canonical = canonicalize_quaternion(onnx_output_normalized)

            # Compute differences with canonicalized quaternions
            diff = np.abs(pt_output_canonical - onnx_output_canonical)
            max_diff = np.max(diff)
            mean_diff = np.mean(diff)

            # Angular difference for rotations
            dot_products = np.sum(pt_output_canonical * onnx_output_canonical, axis=-1)
            dot_products = np.clip(np.abs(dot_products), 0.0, 1.0)
            angular_diff_rad = 2 * np.arccos(dot_products)
            angular_diff_deg = np.degrees(angular_diff_rad)
            max_angular = np.max(angular_diff_deg)
            mean_angular = np.mean(angular_diff_deg)
            p99_angular = np.percentile(angular_diff_deg, 99)

            quat_passed = True
            failure_reasons = []

            if mean_angular > angular_tolerances["mean"]:
                quat_passed = False
                failure_reasons.append(f"mean angular {mean_angular:.4f}° > {angular_tolerances['mean']:.4f}°")
            if p99_angular > angular_tolerances["p99"]:
                quat_passed = False
                failure_reasons.append(f"p99 angular {p99_angular:.4f}° > {angular_tolerances['p99']:.4f}°")
            if max_angular > angular_tolerances["max"]:
                quat_passed = False
                failure_reasons.append(f"max angular {max_angular:.4f}° > {angular_tolerances['max']:.4f}°")

            result.update({
                "max_diff": f"{max_diff:.6f}",
                "mean_diff": f"{mean_diff:.6f}",
                "p99_diff": f"{np.percentile(diff, 99):.6f}",
                "max_angular": f"{max_angular:.4f}",
                "mean_angular": f"{mean_angular:.4f}",
                "p99_angular": f"{p99_angular:.4f}",
                "passed": quat_passed,
                "failure_reason": "; ".join(failure_reasons) if failure_reasons else ""
            })

            if not quat_passed:
                all_passed = False
        else:
            diff = np.abs(pt_output - onnx_output)
            max_diff = np.max(diff)
            mean_diff = np.mean(diff)
            p99_diff = np.percentile(diff, 99)

            output_tolerance = tolerances.get(name, tolerance)

            result.update({
                "max_diff": f"{max_diff:.6f}",
                "mean_diff": f"{mean_diff:.6f}",
                "p99_diff": f"{p99_diff:.6f}",
                "tolerance": f"{output_tolerance:.6f}"
            })

            if max_diff > output_tolerance:
                result["passed"] = False
                result["failure_reason"] = f"max diff {max_diff:.6f} > tolerance {output_tolerance:.6f}"
                all_passed = False

        validation_results.append(result)

    # Output validation results as markdown table
    if validation_results:
        LOGGER.info("\n### Validation Results\n")
        LOGGER.info("| Output | Max Diff | Mean Diff | P99 Diff | Angular Diff (°) | Status |")
        LOGGER.info("|--------|----------|-----------|----------|------------------|--------|")

        for result in validation_results:
            output_name = result["output"].replace("_", " ").title()
            max_diff = result["max_diff"]
            mean_diff = result["mean_diff"]
            p99_diff = result["p99_diff"]

            if "max_angular" in result:
                angular_info = f"{result['max_angular']} / {result['mean_angular']} / {result['p99_angular']}"
            else:
                angular_info = "-"

            status = "✅ PASS" if result["passed"] else "❌ FAIL"
            if result["failure_reason"]:
                status += f" ({result['failure_reason']})"

            LOGGER.info(f"| {output_name} | {max_diff} | {mean_diff} | {p99_diff} | {angular_info} | {status} |")

        LOGGER.info("")

    return all_passed


def main():
    """Main conversion script."""
    parser = argparse.ArgumentParser(
        description="Convert SHARP PyTorch model to ONNX format"
    )
    parser.add_argument(
        "-c", "--checkpoint",
        type=Path,
        default=None,
        help="Path to PyTorch checkpoint. Downloads default if not provided.",
    )
    parser.add_argument(
        "-o", "--output",
        type=Path,
        default=Path("sharp.onnx"),
        help="Output path for ONNX model (default: sharp.onnx)",
    )
    parser.add_argument(
        "--height",
        type=int,
        default=1536,
        help="Input image height (default: 1536)",
    )
    parser.add_argument(
        "--width",
        type=int,
        default=1536,
        help="Input image width (default: 1536)",
    )
    parser.add_argument(
        "--validate",
        action="store_true",
        help="Validate ONNX model against PyTorch",
    )
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="Enable verbose logging",
    )

    args = parser.parse_args()

    # Configure logging
    logging.basicConfig(
        level=logging.DEBUG if args.verbose else logging.INFO,
        format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
    )

    # Load PyTorch model
    LOGGER.info("Loading SHARP model...")
    predictor = load_sharp_model(args.checkpoint)

    # Setup conversion parameters
    input_shape = (args.height, args.width)

    # Convert to ONNX
    LOGGER.info(f"Converting to ONNX: {args.output}")
    convert_to_onnx(predictor, args.output, input_shape=input_shape)
    LOGGER.info(f"ONNX model saved to {args.output}")

    # Validate if requested
    if args.validate:
        if args.output.exists():
            validation_passed = validate_onnx_model(args.output, predictor, input_shape)
            if validation_passed:
                LOGGER.info("✓ Validation passed!")
            else:
                LOGGER.error("✗ Validation failed!")
                return 1
        else:
            LOGGER.error(f"ONNX model not found at {args.output} for validation")
            return 1

    LOGGER.info("Conversion complete!")
    return 0


if __name__ == "__main__":
    exit(main())