license: openrail++
library_name: coreai
pipeline_tag: image-to-image
tags:
  - super-resolution
  - diffusion
  - core-ai
  - apple
  - on-device
  - adcsr
  - stable-diffusion

Mirror of mlboydaisuke/AdcSR-CoreAI β€” the canonical repo (CoreAI Model Zoo). Updates land there first.

AdcSR ×4 Super-Resolution (Core AI)

On-device ×4 super-resolution with AdcSR (Adversarial Diffusion Compression, CVPR 2025), converted for Apple's Core AI stack. AdcSR compresses the one-step diffusion model OSEDiff into a small diffusion-GAN: a pruned Stable Diffusion 2.1 UNet plus a half-size VAE decoder, run in a single forward pass with no iterative denoising, no prompt, and no noise input. That makes it fast and small enough to run fully on-device, including on iPhone.

Use it

▶️ Run it (source) with the UpscaleDemo runner (pick a photo, upscale it ×4 on-device):

git clone https://github.com/john-rocky/coreai-kit
open coreai-kit/Examples/UpscaleDemo/UpscaleDemo.xcodeproj
# → Run, pick a photo; the app loads AdcSR ×4 (the catalog's superResolution entry) automatically

# agents / headless (macOS):
cd coreai-kit/Examples/UpscaleDemo
swift run upscale-cli --model adcsr-x4 --image sample_small.png --output big.png

💻 Build with it. This snippet is complete (the glue is all kit API) and runs as pasted:

import CoreAIKitVision

let resolver = try await SuperResolver(catalog: "adcsr-x4")
let image = try ImageFile.load(imageURL)  // any image file → CGImage + EXIF orientation
let upscaled = try await resolver.upscale(image.cgImage)
// upscaled: CGImage, 4× the (capped) input's width and height

The take-home is Examples/UpscaleDemo/Sources/QuickStart.swift: this exact code as one typed function, with no UI. The CLI is an argument shell over it, and the GUI runs the same resolver on the photo you pick. Big photos? Inputs are tiled and feather-blended internally, and maxInputSide (default 512) caps the input first so a full-resolution phone photo can't produce a gigapixel result.
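To make the effect of the cap concrete, here is a minimal sketch of the size arithmetic. It assumes the cap scales the longest side down to maxInputSide, preserves aspect ratio, and never enlarges a small input. The README does not spell out those details, and `cappedSize` and `upscaledSize` are hypothetical helpers, not kit API.

```swift
import Foundation

/// Hypothetical helper (not kit API): the size an input is reduced to before
/// upscaling, assuming the cap applies to the longest side, keeps the aspect
/// ratio, and leaves inputs that are already small enough untouched.
func cappedSize(width: Int, height: Int, maxInputSide: Int = 512) -> (width: Int, height: Int) {
    let longest = max(width, height)
    guard longest > maxInputSide else { return (width, height) }
    let scale = Double(maxInputSide) / Double(longest)
    // Round to the nearest pixel and keep each side at least 1 px.
    let w = max(1, Int((Double(width) * scale).rounded()))
    let h = max(1, Int((Double(height) * scale).rounded()))
    return (w, h)
}

/// Output size of a ×4 model applied to the capped input.
func upscaledSize(width: Int, height: Int, maxInputSide: Int = 512, factor: Int = 4) -> (width: Int, height: Int) {
    let capped = cappedSize(width: width, height: height, maxInputSide: maxInputSide)
    return (capped.width * factor, capped.height * factor)
}

let result = upscaledSize(width: 4032, height: 3024)
print(result.width, result.height)
```

Under these assumptions a 4032×3024 photo is capped to 512×384 and comes out at 2048×1536, instead of the 16128×12096 that an uncapped ×4 pass would produce.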

Integration checklist

  • SPM: https://github.com/john-rocky/coreai-kit → product CoreAIKitVision
  • Info.plist: none needed
  • Entitlements: none needed
  • First run downloads the model (1.7 GB on Mac, 1.7 GB on iPhone); after that it loads from the local cache (Application Support; progress via the downloadProgress callback)
  • Measure in Release: Debug is ~3× slower on per-tile host work

What it is

  • fp32, ~1.7 GB. Output matches the PyTorch reference (cosine similarity 1.000012, i.e. 1.0 up to floating-point rounding). fp32 is needed because the pruned SD-2.1 UNet's attention and group-norm layers overflow in fp16 (NaN on smooth tiles).
  • Image → image, one step. Input a low-resolution tile, get a 4× tile back. No text, no noise.
  • 456 M parameters (pruned SD-2.1 UNet + half VAE decoder).
  • The graph outputs the raw SR; AdcSR's per-image color-match is applied host-side by SuperResolver after tiling (baking it per-tile blows up uniform tiles).
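The last point can be illustrated with a small sketch. It assumes the color-match is a per-channel mean and standard-deviation match against the low-resolution input. That is an assumption made for illustration; the README does not give the formula, and `stats` and `matchColor` are hypothetical helpers rather than kit API.

```swift
/// Mean and population standard deviation of one channel's values.
func stats(_ x: [Float]) -> (mean: Float, std: Float) {
    let mean = x.reduce(0, +) / Float(x.count)
    let variance = x.reduce(0) { $0 + ($1 - mean) * ($1 - mean) } / Float(x.count)
    return (mean, variance.squareRoot())
}

/// Assumed form of the color-match (not confirmed by the source):
/// shift and scale `sr` so its mean and std match those of `reference`.
func matchColor(sr: [Float], reference: [Float], epsilon: Float = 1e-5) -> [Float] {
    let s = stats(sr), r = stats(reference)
    // On a nearly uniform tile s.std is close to zero, so this gain becomes
    // a ratio of two tiny numbers and can amplify small differences.
    let gain = r.std / (s.std + epsilon)
    return sr.map { ($0 - s.mean) * gain + r.mean }
}

let matched = matchColor(sr: [0, 1, 2, 3], reference: [10, 12, 14, 16])
print(matched)
```

One plausible reading of "blows up uniform tiles" follows from the gain term: computed per tile, a flat patch of sky or wall has almost no variance, so the ratio is unstable, whereas statistics taken over the whole image rarely have that problem.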

I/O contract (per tile)

  • input: lr [1,3,128,128] in [-1,1] (a low-resolution tile).
  • output: sr [1,3,512,512] in [-1,1] (×4), the raw SR; the reference's per-image color-match is not baked into the graph and is applied host-side by SuperResolver.
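For callers who feed the graph directly instead of going through SuperResolver, the value range in the contract implies a conversion like the sketch below. The planar channel-first layout follows from the [1,3,128,128] shape; the RGB channel order and the helper names (`normalize`, `denormalize`, `packTile`) are assumptions for illustration, not kit API.

```swift
/// Map an 8-bit channel value (0...255) to the model's [-1, 1] range.
func normalize(_ value: UInt8) -> Float {
    Float(value) / 127.5 - 1.0
}

/// Map a model output in [-1, 1] back to 8 bits, clamping out-of-range values.
func denormalize(_ value: Float) -> UInt8 {
    guard value.isFinite else { return 0 }   // NaN or infinity would trap in the UInt8 conversion
    let scaled = ((value + 1.0) * 127.5).rounded()
    return UInt8(min(255, max(0, scaled)))
}

/// Pack interleaved RGB bytes (height × width × 3) into a planar
/// channel-first Float array matching a [1,3,side,side] tensor.
/// RGB channel order is assumed here.
func packTile(rgb: [UInt8], side: Int = 128) -> [Float] {
    precondition(rgb.count == side * side * 3, "expected side*side RGB pixels")
    let plane = side * side
    var out = [Float](repeating: 0, count: 3 * plane)
    for p in 0..<plane {
        for c in 0..<3 {
            out[c * plane + p] = normalize(rgb[p * 3 + c])
        }
    }
    return out
}
```

The same `denormalize` step applies to the 512×512 output before it is written back into an 8-bit image.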

Usage (CoreAIKit)

import CoreAIKitVision

let sr = try await SuperResolver(model: .adcsrX4)   // downloads this repo on first use
let big = try await sr.upscale(cgImage)             // ×4; tiles any-size input + feather-blends

SuperResolver splits any-size input into overlapping 128-px LR windows, runs each, and blends (and caps very large inputs so the result stays a reasonable size).
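The windowing and blending can be sketched in one dimension as follows. The 128-px window comes from the I/O contract; the 16-px overlap, the linear ramp, and the function names are illustrative guesses, not the kit's actual values or API.

```swift
/// Start offsets of overlapping windows covering `length` pixels along one axis.
/// `overlap` = 16 is an illustrative guess, not the kit's setting.
func tileOrigins(length: Int, tile: Int = 128, overlap: Int = 16) -> [Int] {
    precondition(overlap >= 0 && tile > overlap)
    guard length > tile else { return [0] }   // smaller inputs would need padding (not shown)
    let step = tile - overlap
    var origins = Array(stride(from: 0, to: length - tile, by: step))
    origins.append(length - tile)             // last window sits flush with the far edge
    return origins
}

/// 1-D feather weights for one window: a linear ramp near each edge, 1 in the middle.
func featherWeights(tile: Int = 128, overlap: Int = 16) -> [Float] {
    (0..<tile).map { i -> Float in
        let edgeDistance = min(i, tile - 1 - i)   // distance to the nearest window edge
        return min(1, Float(edgeDistance + 1) / Float(overlap + 1))
    }
}

/// Blend per-window values along one axis: weighted sum divided by total weight.
func blend(windows: [[Float]], origins: [Int], length: Int, overlap: Int = 16) -> [Float] {
    var sum = [Float](repeating: 0, count: length)
    var weight = [Float](repeating: 0, count: length)
    for (window, origin) in zip(windows, origins) {
        let w = featherWeights(tile: window.count, overlap: overlap)
        for i in window.indices {
            sum[origin + i] += w[i] * window[i]
            weight[origin + i] += w[i]
        }
    }
    return zip(sum, weight).map { $0 / $1 }
}

print(tileOrigins(length: 300))
```

A real implementation would apply the same idea in two dimensions and blend in output space, where each 128-px window has become a 512-px tile. Because the weights are normalized by their sum, regions covered by a single window pass through unchanged and overlaps cross-fade rather than showing seams.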

License & attribution

  • AdcSR (method + the pruning/training code): Apache-2.0. Bin Chen et al., Adversarial Diffusion Compression for Real-World Image Super-Resolution, CVPR 2025.
  • Weights are derived from Stable Diffusion 2.1 (via OSEDiff) and therefore carry the CreativeML Open RAIL++-M license. Commercial use is permitted under its use-based restrictions; this is the same license under which Apple distributes Stable Diffusion for Core ML.

This Core AI conversion inherits both. See LICENSE (Apache-2.0, AdcSR) and the SD-2.1 OpenRAIL++-M terms.