---
license: mit
tags:
  - onnx
  - clip
  - lpips
  - image-similarity
  - computer-vision
---

# ONNX Models for Vidupe.Net

This repository contains ONNX-exported models used by Vidupe.Net for visual similarity and perceptual comparison tasks.

## Models

### `vidupe.net/models/clip_visual_vit_b32.onnx`

CLIP visual encoder (ViT-B/32) exported to ONNX. This model encodes images into a 512-dimensional embedding space, enabling semantic image similarity comparisons.
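Embeddings from this encoder are typically compared with cosine similarity. A minimal sketch in NumPy (the embeddings below are random placeholders standing in for real encoder outputs):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder 512-d embeddings in place of real model outputs
emb_a = np.random.randn(512).astype(np.float32)
emb_b = np.random.randn(512).astype(np.float32)

score = cosine_similarity(emb_a, emb_b)  # higher = more semantically similar
```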

### `vidupe.net/models/lpips_alexnet.onnx`

LPIPS (Learned Perceptual Image Patch Similarity) model with an AlexNet backbone exported to ONNX. Computes perceptual distance between two image patches.
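LPIPS implementations conventionally expect inputs scaled to [-1, 1]. Whether this export bakes that scaling into the graph is not documented here, so treat the following normalization as an assumption:

```python
import numpy as np

def to_lpips_range(img_uint8: np.ndarray) -> np.ndarray:
    """Scale an HxWx3 uint8 RGB image to NCHW float32 in [-1, 1]
    (assumed LPIPS input convention)."""
    x = img_uint8.astype(np.float32) / 255.0      # [0, 1]
    x = x * 2.0 - 1.0                             # [-1, 1]
    return np.transpose(x, (2, 0, 1))[None, ...]  # (1, 3, H, W)

img = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)
batch = to_lpips_range(img)  # shape (1, 3, 64, 64)
```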

## Usage

```python
import onnxruntime as ort
import numpy as np

# CLIP visual encoder: (1, 3, 224, 224) image -> (1, 512) embedding
clip_session = ort.InferenceSession("vidupe.net/models/clip_visual_vit_b32.onnx")
image = np.random.randn(1, 3, 224, 224).astype(np.float32)
embeddings = clip_session.run(None, {"input": image})[0]

# LPIPS perceptual similarity: two image patches -> perceptual distance
lpips_session = ort.InferenceSession("vidupe.net/models/lpips_alexnet.onnx")
img0 = np.random.randn(1, 3, 64, 64).astype(np.float32)
img1 = np.random.randn(1, 3, 64, 64).astype(np.float32)
distance = lpips_session.run(None, {"input0": img0, "input1": img1})[0]
```
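The random arrays above stand in for real images. For the CLIP encoder, images are typically resized to 224x224 and normalized with CLIP's channel statistics before inference; a preprocessing sketch assuming that convention (the mean/std constants are the standard CLIP normalization values, not something read from this export):

```python
import numpy as np

# Standard CLIP normalization statistics (RGB order)
CLIP_MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
CLIP_STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)

def preprocess_clip(img_uint8: np.ndarray) -> np.ndarray:
    """Normalize a 224x224x3 uint8 RGB image to the NCHW float32 layout
    the encoder expects. Resizing/cropping to 224x224 is assumed to have
    happened beforehand."""
    x = img_uint8.astype(np.float32) / 255.0
    x = (x - CLIP_MEAN) / CLIP_STD
    return np.transpose(x, (2, 0, 1))[None, ...]  # (1, 3, 224, 224)

img = (np.random.rand(224, 224, 3) * 255).astype(np.uint8)
batch = preprocess_clip(img)
```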

## Requirements

```
onnxruntime>=1.16.0
numpy
```