mlboydaisuke committed on
Commit 76dbca3 · verified · 1 Parent(s): 8e6791d

NAFNet-GoPro-width32 LiteRT fp16 (fully-GPU deblur, Pixel 8a corr 1.0)

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +72 -0
  3. nafnet_fp16.tflite +3 -0
  4. samples/sample.png +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ samples/sample.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,72 @@
+ ---
+ license: mit
+ library_name: LiteRT
+ pipeline_tag: image-to-image
+ tags:
+ - litert
+ - tflite
+ - on-device
+ - android
+ - gpu
+ - image-restoration
+ - deblurring
+ - nafnet
+ base_model: megvii-research/NAFNet
+ ---
+
+ # NAFNet-GoPro-width32: LiteRT (on-device image deblur, fully-GPU)
+
+ [NAFNet](https://github.com/megvii-research/NAFNet) (Nonlinear Activation Free Network, ECCV 2022) image
+ restoration, converted to **LiteRT** and running **fully on the GPU** through the `CompiledModel` API (ML Drift) on Android.
+ NAFNet is a U-Net of **NAFBlocks** with **no activation functions at all** (SimpleGate = channel-split
+ multiply), so the whole network maps onto the GPU delegate as a plain CNN. This is the **GoPro-width32** variant,
+ for motion deblur.
+
+ ![NAFNet: blurry input | restored (on-device LiteRT GPU)](samples/sample.png)
+
+ ## On-device (Pixel 8a, Tensor G3, verified)
+
+ | Metric | Value |
+ |---|---|
+ | nodes on GPU | **2179 / 2179** LITERT_CL (full residency) |
+ | inference | **~42 ms** (256×256) |
+ | size | 38 MB (fp16) |
+ | accuracy | device output **matches PyTorch (corr 1.000000)**; the re-authoring is numerically exact |
+
+ ```
+ image[1,3,256,256] (RGB [0,1]) →[GPU: NAFNet U-Net]→ restored[1,3,256,256]
+ ```
+
+ ## Usage (Android, LiteRT CompiledModel)
+
+ ```kotlin
+ // Compile the model for the GPU accelerator.
+ val model = CompiledModel.create(modelPath, CompiledModel.Options(Accelerator.GPU), null)
+ val input = model.createInputBuffers()
+ val output = model.createOutputBuffers()
+ input[0].writeFloat(chw)              // [1,3,256,256] RGB in [0,1], NCHW
+ model.run(input, output)
+ val restored = output[0].readFloat()  // [1,3,256,256] in [0,1]
+ ```
+
+ A complete Android sample (image picker + before/after) is in the official
+ [google-ai-edge/litert-samples](https://github.com/google-ai-edge/litert-samples) repo under
+ `compiled_model_api/image_restoration`.
+
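The usage snippet assumes `chw` is already a flat float array in NCHW order with values in [0,1]. As a rough, language-neutral illustration of that layout (NumPy rather than Android code; the helper names `hwc_uint8_to_nchw_float` and `nchw_float_to_hwc_uint8` are hypothetical and not part of LiteRT), the conversion from an ordinary H×W×3 8-bit image and back might look like this:

```python
import numpy as np

def hwc_uint8_to_nchw_float(image_hwc: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 RGB image to a [1,3,H,W] float32 tensor in [0,1]."""
    if image_hwc.ndim != 3 or image_hwc.shape[2] != 3:
        raise ValueError("expected an HxWx3 RGB image")
    scaled = image_hwc.astype(np.float32) / 255.0    # 0..255 -> [0,1]
    chw = np.transpose(scaled, (2, 0, 1))            # HWC -> CHW
    return np.ascontiguousarray(chw[np.newaxis])     # add batch dim -> NCHW

def nchw_float_to_hwc_uint8(tensor_nchw: np.ndarray) -> np.ndarray:
    """Inverse: clamp to [0,1], rescale to 0..255, return HxWx3 uint8."""
    chw = np.clip(tensor_nchw[0], 0.0, 1.0)
    hwc = np.transpose(chw, (1, 2, 0))               # CHW -> HWC
    return np.round(hwc * 255.0).astype(np.uint8)

# Random stand-in for a decoded 256x256 photo.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)

tensor = hwc_uint8_to_nchw_float(image)
flat = tensor.ravel()  # flat NCHW float array, the shape of data `writeFloat` is given above
restored_image = nchw_float_to_hwc_uint8(tensor)
```

On Android the same reordering would be done on the pixels of a decoded bitmap; only the channel-first ordering and the [0,1] range are taken from the README.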
+ ## How it converts (litert-torch)
+
+ NAFNet is fully convolutional (any input size that is a multiple of 16 works; it is exported here at 256×256). The conversion uses three
+ numerically exact GPU re-authorings:
+
+ 1. **`LayerNorm2d` → fp16-safe channel LayerNorm.** NAFNet's residual stream grows large (|x|≈175 at the
+ bottleneck), so the LayerNorm channel reductions `Σ_c x` and `Σ_c (x−μ)²` (~15M) **overflow fp16 (max
+ 65504)** on the Mali delegate (which computes in fp16 regardless of the model dtype), producing a grid artifact.
+ Doing the reductions in a down-scaled `x/S` domain (S=128) and rescaling is numerically exact and fp16-safe.
+ 2. **Simplified Channel Attention `AdaptiveAvgPool2d(1)` → `mean(3).mean(2)`** (two single-axis means).
+ 3. **Upsample `Conv2d(1×1)+PixelShuffle(2)` → Conv2d + depth-to-space `ZeroStuffConvT2d`**.
+
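The fp16 overflow in item 1 and the scaled-domain fix can be reproduced off-device with NumPy. This is a minimal sketch under stated assumptions, not the converted model's actual layer: the function names are hypothetical, the affine weight and bias of `LayerNorm2d` are omitted, `eps=1e-6` is assumed, and the reduction is modelled as a float32 accumulation whose result is stored in fp16 (the delegate's real accumulation order is not known from the README). The input is synthetic: 512 channels with values around 150, chosen so that both channel sums exceed 65504.

```python
import numpy as np

FP16_MAX = float(np.finfo(np.float16).max)  # 65504.0

def fp16_channel_sum(x16):
    """Sum over the channel axis (axis 0) and store the result in fp16.

    Assumption: accumulate in float32, then round to fp16. Even with this
    generous accumulator, a result above FP16_MAX becomes inf.
    """
    with np.errstate(over="ignore"):
        return np.sum(x16, axis=0, dtype=np.float32).astype(np.float16)

def layernorm_naive_fp16(x16, eps=1e-6):
    """Channel LayerNorm with reductions taken directly on x (overflows)."""
    with np.errstate(all="ignore"):
        c = np.float16(x16.shape[0])
        mu = fp16_channel_sum(x16) / c
        d = x16 - mu
        var = fp16_channel_sum(d * d) / c
        return d / np.sqrt(var + np.float16(eps))

def layernorm_scaled_fp16(x16, scale=128.0, eps=1e-6):
    """Same normalisation, with reductions done on x/scale."""
    c = np.float16(x16.shape[0])
    xs = x16 / np.float16(scale)        # power-of-two divide, exact in fp16
    mu_s = fp16_channel_sum(xs) / c
    d = xs - mu_s
    var_s = fp16_channel_sum(d * d) / c
    # eps / scale**2 is below fp16's smallest subnormal and rounds to 0 here;
    # that is harmless only because var_s is far from zero in this example.
    eps_s = np.float16(eps / scale**2)
    return d / np.sqrt(var_s + eps_s)   # (x/S - mu/S) / (sigma/S) == (x - mu) / sigma

def layernorm_reference(x, eps=1e-6):
    """float64 reference without affine parameters."""
    x = x.astype(np.float64)
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

rng = np.random.default_rng(0)
x16 = (150.0 + 25.0 * rng.standard_normal((512, 8, 8))).astype(np.float16)

y_naive = layernorm_naive_fp16(x16)
y_scaled = layernorm_scaled_fp16(x16)
y_ref = layernorm_reference(x16)
max_err = float(np.max(np.abs(y_scaled.astype(np.float64) - y_ref)))
```

In this sketch `y_naive` contains non-finite values because `Σ_c x` alone exceeds `FP16_MAX`, while `y_scaled` stays finite and close to the float64 reference. The scaling cancels algebraically, which is why the re-authoring does not change the result beyond fp16 rounding.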
+ Result: no banned ops, all tensors ≤4D, tflite-vs-torch corr **1.0**, device-vs-torch corr **1.0**.
+
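The README does not define "corr". One plausible reading is the Pearson correlation between the flattened reference and converted outputs. A minimal sketch under that assumption follows; the function name `pearson_corr` is hypothetical and the two arrays are synthetic stand-ins rather than real model outputs:

```python
import numpy as np

def pearson_corr(a, b) -> float:
    """Pearson correlation between two arrays, compared element by element."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b)))

# Stand-ins: a float reference output and the same tensor after an fp16 round trip.
rng = np.random.default_rng(0)
reference = rng.random((1, 3, 256, 256))
device = reference.astype(np.float16).astype(np.float64)

corr = pearson_corr(reference, device)
max_abs_err = float(np.max(np.abs(reference - device)))
```

Correlation is unchanged by a constant gain or offset between the two outputs, so a check like this is usually paired with an absolute-error measure such as `max_abs_err`.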
+ ## License
+
+ [MIT](https://github.com/megvii-research/NAFNet/blob/main/LICENSE). Upstream:
+ [megvii-research/NAFNet](https://github.com/megvii-research/NAFNet). Original weights:
+ NAFNet-GoPro-width32 from the official release.
nafnet_fp16.tflite ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8e749782f7c8a6e462ba7b457724edd089216fa3a25f9d5f83d0caaae7158507
+ size 38314240
samples/sample.png ADDED

Git LFS Details

  • SHA256: 192d6749081a2c547bba4944f8ded9466955aeb007c4862f5ecf5e189d9c7ac8
  • Pointer size: 131 Bytes
  • Size of remote file: 152 kB