---
library_name: coreml
pipeline_tag: image-to-image
tags:
- super-resolution
- apple-silicon
- neural-engine
- ane
- coreml
- real-time
- video-upscaling
- macos
license: cc-by-4.0
datasets:
- eugenesiow/Div2k
metrics:
- psnr
- ssim
model-index:
- name: PiperSR-2x
  results:
  - task:
      type: image-super-resolution
      name: Image Super-Resolution
    dataset:
      type: Set5
      name: Set5
    metrics:
    - type: psnr
      value: 37.54
      name: PSNR
  - task:
      type: image-super-resolution
      name: Image Super-Resolution
    dataset:
      type: Set14
      name: Set14
    metrics:
    - type: psnr
      value: 33.21
      name: PSNR
  - task:
      type: image-super-resolution
      name: Image Super-Resolution
    dataset:
      type: BSD100
      name: BSD100
    metrics:
    - type: psnr
      value: 31.98
      name: PSNR
  - task:
      type: image-super-resolution
      name: Image Super-Resolution
    dataset:
      type: Urban100
      name: Urban100
    metrics:
    - type: psnr
      value: 31.38
      name: PSNR
---
# PiperSR-2x: ANE-Native Super Resolution for Apple Silicon
Real-time 2x AI upscaling on Apple's Neural Engine. 44.4 FPS at 720p on M2 Max, 928 KB model, every op runs natively on ANE with zero CPU/GPU fallback.
Not a converted PyTorch model, but an architecture designed from ANE hardware measurements. Every dimension, operation, and data type is dictated by Neural Engine characteristics.
## Key Results
| Model | Params | Set5 | Set14 | BSD100 | Urban100 |
|-------|--------|------|-------|--------|----------|
| Bicubic | - | 33.66 | 30.24 | 29.56 | 26.88 |
| FSRCNN | 13K | 37.05 | 32.66 | 31.53 | 29.88 |
| **PiperSR** | **453K** | **37.54** | **33.21** | **31.98** | **31.38** |
| SAFMN | 228K | 38.00 | ~33.7 | ~32.2 | - |

All values are PSNR in dB. PiperSR beats FSRCNN on all four benchmarks and is within 0.46 dB of SAFMN on Set5, a gap below the perceptual threshold for most content.
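
For readers unfamiliar with the metric, here is a minimal sketch of how PSNR is computed from the mean squared error. It is only an illustration of the formula: the evaluation protocol behind the table (color space, border cropping) is not stated here, so this function should not be expected to reproduce the reported numbers exactly. The function name `psnr` and the toy pixel values are made up for the example.

```python
import math

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Toy example: a constant error of 0.01 on a [0, 1] scale gives an MSE of about 1e-4,
# which corresponds to roughly 40 dB.
reference = [0.5, 0.5, 0.5, 0.5]
estimate = [0.51, 0.51, 0.51, 0.51]
print(round(psnr(reference, estimate), 2))
```

Because the scale is logarithmic, a 0.46 dB gap corresponds to an MSE ratio of about 10^(0.046), roughly 1.11.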
## Performance
| Configuration | FPS | Hardware | Notes |
|--------------|-----|----------|-------|
| Full-frame 640×360 → 1280×720 | 44.4 | M2 Max | ANE predict 22.5 ms |
| 128×128 tiles (static weights) | 125.6 | M2 | Baked weights, 2.82× vs dynamic |
| 128×128 tiles (dynamic weights) | 44.5 | M2 | CoreML default |

Real-time 2× upscaling at 30+ FPS on any Mac with Apple Silicon. The ANE sits idle during video playback, and PiperSR puts it to work.
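
The figures in the table are related by simple arithmetic, sketched below. The helper names are illustrative. Note that 22.5 ms is the ANE prediction time only; any capture, copy, or display overhead in a real pipeline would reduce the remaining budget.

```python
def fps_from_latency_ms(latency_ms):
    """Frames per second sustained if each frame takes latency_ms."""
    return 1000.0 / latency_ms

def frame_budget_ms(target_fps):
    """Time available per frame at a given target frame rate."""
    return 1000.0 / target_fps

full_frame_fps = fps_from_latency_ms(22.5)   # about 44.4 FPS
static_speedup = 125.6 / 44.5                # about 2.82x, static vs dynamic weights
headroom_ms = frame_budget_ms(30) - 22.5     # about 10.8 ms left per frame at 30 FPS

print(round(full_frame_fps, 1), round(static_speedup, 2), round(headroom_ms, 1))
```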
## Architecture
453K-parameter network: 6 residual blocks at 64 channels with BatchNorm and SiLU activations, upscaling via PixelShuffle.
```
Input (128×128×3 FP16)
→ Head: Conv 3×3 (3 → 64)
→ Body: 6× ResBlock [Conv 3×3 → BatchNorm → SiLU → Conv 3×3 → BatchNorm → Residual Add]
→ Tail: Conv 3×3 (64 → 12) → PixelShuffle(2)
Output (256×256×3)
```
Compiles to 5 MIL ops: `conv`, `add`, `silu`, `pixel_shuffle`, `const`. All verified ANE-native.
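
As a sanity check on the 453K figure, the layer list above can be tallied by hand. The sketch below assumes every convolution carries a bias and every BatchNorm has a learnable scale and shift; neither detail is stated here, so treat the exact total as an estimate. Under those assumptions the count comes to 453,388, which is consistent with the stated 453K.

```python
def conv_params(c_in, c_out, kernel=3, bias=True):
    """Weights (plus optional biases) of a square 2D convolution."""
    return c_in * c_out * kernel * kernel + (c_out if bias else 0)

def batchnorm_params(channels):
    """Learnable scale and shift only; running statistics are not counted."""
    return 2 * channels

CHANNELS, BLOCKS, SCALE = 64, 6, 2

head = conv_params(3, CHANNELS)                                    # 3 -> 64
block = 2 * conv_params(CHANNELS, CHANNELS) + 2 * batchnorm_params(CHANNELS)
tail = conv_params(CHANNELS, 3 * SCALE * SCALE)                    # 64 -> 12, feeds PixelShuffle(2)
total = head + BLOCKS * block + tail

print(head, block, tail, total)  # 1792 74112 6924 453388
print(total * 2)                 # 906776 bytes if every parameter is stored as FP16
```

At two bytes per parameter this is roughly 907 KB of weights, in the same range as the 928 KB model size reported for the compiled model. Since BatchNorm is fused at inference, the deployed parameter count may differ slightly.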
### Why ANE-native matters
Off-the-shelf super resolution models (SPAN, Real-ESRGAN) were designed for CUDA GPUs and converted to CoreML after the fact. They waste the ANE:
- **Misaligned channels** (48 instead of 64) waste 25%+ of each ANE tile
- **Monolithic full-frame** tensors serialize the ANE's parallel compute lanes
- **Silent CPU fallback** from unsupported ops can increase latency 5-10×
- **No batched tiles** means 60× dispatch overhead
PiperSR addresses every one of these by designing around ANE constraints.
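
Two of the figures in the list above can be illustrated with simple arithmetic. This is a sketch under assumptions: it treats 64 as the channel granularity (inferred from the "48 instead of 64" bullet), and it shows that a 1280×720 frame splits into 60 tiles of 128×128, which is one possible reading of the 60× figure. How that number was actually derived is not stated here. All function names are illustrative.

```python
import math

def padded_channels(channels, lane_width=64):
    """Channel count rounded up to the next multiple of lane_width (assumed granularity)."""
    return math.ceil(channels / lane_width) * lane_width

def wasted_fraction(channels, lane_width=64):
    """Share of the padded channel slots that carry no data."""
    padded = padded_channels(channels, lane_width)
    return (padded - channels) / padded

def tile_count(width, height, tile=128):
    """Number of tile-sized dispatches needed to cover a frame, one dispatch per tile."""
    return math.ceil(width / tile) * math.ceil(height / tile)

print(wasted_fraction(48))    # 0.25
print(wasted_fraction(64))    # 0.0
print(tile_count(1280, 720))  # 60
```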
## Model Variants
| File | Use Case | Input → Output |
|------|----------|----------------|
| `PiperSR_2x.mlpackage` | Static images (128px tiles) | 128×128 → 256×256 |
| `PiperSR_2x_video_720p.mlpackage` | Video (full-frame, BN-fused) | 640×360 → 1280×720 |
| `PiperSR_2x_256.mlpackage` | Static images (256px tiles) | 256×256 → 512×512 |
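
The tile variants take a fixed input size, so larger images have to be split before inference. The following is a minimal sketch of one way to lay out the tiles, not necessarily how ToolPiper does it: the last tile in each row and column is shifted back so it stays inside the image, which produces some overlap instead of padding. The helper names are hypothetical.

```python
def tile_origins(length, tile):
    """Start offsets of fixed-size tiles covering [0, length).

    The final tile is shifted back to end exactly at the image edge.
    Images smaller than one tile get a single origin and must be padded by the caller.
    """
    if length <= tile:
        return [0]
    origins = list(range(0, length - tile, tile))
    origins.append(length - tile)
    return origins

def tile_grid(width, height, tile=128):
    """Top-left (x, y) corners of every input tile, in row-major order."""
    return [(x, y) for y in tile_origins(height, tile) for x in tile_origins(width, tile)]

def output_origin(origin, scale=2):
    """Where an upscaled tile lands in the output image."""
    x, y = origin
    return (x * scale, y * scale)

grid = tile_grid(300, 200)
print(grid)                      # six tiles for a 300x200 image
print(output_origin(grid[-1]))   # position of the last tile in the 600x400 output
```

Overlapping edge tiles keep every input at full tile size; where tiles overlap, the later tile simply overwrites the earlier one unless blending is added.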
## Usage
### With ToolPiper (recommended)
PiperSR is integrated into [ToolPiper](https://modelpiper.com), a local macOS AI toolkit. Install ToolPiper, enable the MediaPiper browser extension, and every 720p video on the web is upscaled to 1440p in real time.
```bash
# Via MCP tool
mcp__toolpiper__image_upscale image=/path/to/image.png
# Via REST API
curl -X POST http://127.0.0.1:9998/v1/images/upscale \
-F "image=@input.png" \
-o upscaled.png
```
### With CoreML (Swift)
```swift
import CoreML
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine // NOT .all; .all is 23.6% slower
let model = try PiperSR_2x(configuration: config)
let input = try PiperSR_2xInput(x: pixelBuffer)
let output = try model.prediction(input: input)
// output.var_185 contains the 2× upscaled image
```
> **Important:** Use `.cpuAndNeuralEngine`, not `.all`. CoreML's `.all` silently misroutes pure-ANE ops onto the GPU, causing a 23.6% slowdown for this model.
### With coremltools (Python)
```python
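import coremltools as ct
from PIL import Image
import numpy as np
# Load the 128×128 tile model; CPU_AND_NE mirrors the Swift setting above.
# Prediction through coremltools requires macOS.
model = ct.models.MLModel("PiperSR_2x.mlpackage", compute_units=ct.ComputeUnit.CPU_AND_NE)
# Force RGB so images with an alpha channel or palette still yield 3 channels.
img = Image.open("input.png").convert("RGB").resize((128, 128))
arr = np.array(img).astype(np.float32) / 255.0  # HWC, scaled to [0, 1]
arr = np.transpose(arr, (2, 0, 1))[np.newaxis]  # NCHW: (1, 3, 128, 128)
result = model.predict({"x": arr})  # dict mapping output names to arrays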
```
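
The prediction comes back as an NCHW array in [0, 1] (see Technical Details), so it needs converting before it can be saved as an image. A minimal sketch of that step follows; the helper name `nchw_to_image` is illustrative, and a synthetic array stands in for the model output so the snippet runs without the model.

```python
import numpy as np

def nchw_to_image(nchw):
    """Convert a (1, 3, H, W) float array in [0, 1] to an (H, W, 3) uint8 array."""
    chw = np.asarray(nchw, dtype=np.float32)[0]   # drop the batch dimension
    hwc = np.transpose(chw, (1, 2, 0))            # channels last
    return np.clip(np.rint(hwc * 255.0), 0, 255).astype(np.uint8)

# Stand-in for a real model output of the 128x128 tile variant.
fake_output = np.full((1, 3, 256, 256), 0.5, dtype=np.float32)
image = nchw_to_image(fake_output)
print(image.shape, image.dtype)
```

With a real prediction, the array would be taken from the `result` dictionary. The output key is assumed to match the `var_185` name used in the Swift example, which is worth checking against the model's output description. The converted array can then be saved with `Image.fromarray(image).save("upscaled.png")`.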
## Training
Trained on DIV2K (800 training images) with L1 loss and random augmentation (flips, rotations). Total training cost: ~$6 on RunPod A6000 instances. The full training journey, from 33.46 dB to 37.54 dB, is documented across 12 experiment findings.
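
The augmentation and loss described above could look roughly like the sketch below. The actual training code is not shown here, so the details are assumptions: rotations are taken to be multiples of 90 degrees, each flip is applied with probability 0.5, and images are NumPy arrays in height, width, channel order. The key point is that the low-resolution and high-resolution images must receive the same transform.

```python
import random
import numpy as np

def augment_pair(lr, hr, rng=random):
    """Apply the same random flips and 90-degree rotation to an LR/HR pair (H, W, C)."""
    if rng.random() < 0.5:
        lr, hr = lr[:, ::-1], hr[:, ::-1]   # horizontal flip
    if rng.random() < 0.5:
        lr, hr = lr[::-1], hr[::-1]         # vertical flip
    k = rng.randrange(4)                    # number of quarter turns
    return np.rot90(lr, k), np.rot90(hr, k)

def l1_loss(prediction, target):
    """Mean absolute error between two arrays of the same shape."""
    return float(np.mean(np.abs(prediction - target)))

lr = np.zeros((4, 6, 3), dtype=np.float32)
hr = np.zeros((8, 12, 3), dtype=np.float32)
lr_aug, hr_aug = augment_pair(lr, hr, random.Random(0))
print(lr_aug.shape, hr_aug.shape)
print(l1_loss(np.zeros((2, 2)), np.ones((2, 2))))  # 1.0
```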
## Technical Details
- **Compute units:** `.cpuAndNeuralEngine` (ANE primary, CPU for I/O only)
- **Precision:** Float16
- **Input format:** NCHW, normalized to [0, 1]
- **Output format:** NCHW, [0, 1]
- **Model size:** 928 KB (compiled .mlmodelc)
- **Parameters:** 453K
- **ANE ops used:** conv, batch_norm (fused at inference), silu, add, pixel_shuffle, const
- **CPU fallback ops:** None
## License
The model weights are licensed under **CC BY 4.0**, which is fully permissive. Use them for anything: personal, academic, or commercial. Ship them in your app, build a product, **make money with it.** The only ask: **link back to [ModelPiper.com](https://modelpiper.com)** as attribution.
```
Powered by PiperSR from ModelPiper - https://modelpiper.com
```
## Links
- **GitHub:** https://github.com/ModelPiper/PiperSR
- **ModelPiper:** https://modelpiper.com
## Citation
```bibtex
@software{pipersr2025,
title={PiperSR: ANE-Native Super Resolution for Apple Silicon},
author={ModelPiper},
year={2026},
url={https://huggingface.co/ModelPiper/PiperSR-2x}
}
```