---
license: apache-2.0
library_name: litert
pipeline_tag: image-segmentation
base_model: xuebinqin/U-2-Net
tags:
- litert
- tflite
- on-device
- android
- background-removal
- salient-object-detection
- image-matting
- u2net
---
# U²-Net: LiteRT (TFLite) GPU, FP16

On-device [LiteRT](https://ai.google.dev/edge/litert) (`.tflite`) conversion of
**[U²-Net](https://github.com/xuebinqin/U-2-Net)** for salient-object segmentation and
**background removal**. U²-Net is a nested U-structure ("U-net of U-nets", a pure CNN)
that predicts a single-channel saliency mask. Using that mask as the alpha channel makes
the background transparent and cuts the subject out.

The model runs **fully on the LiteRT `CompiledModel` GPU accelerator** (ML Drift):
every op is GPU-native, with no CPU fallback and no Flex ops. It converts with
[`litert-torch`](https://github.com/google-ai-edge/ai-edge-torch) **with no custom
rewrites**, because the network is a pure CNN.
## Files

| File | Size | Description |
|------|------|-------------|
| `u2net_fp16.tflite` | 88 MB | float16 weights, GPU-compatible |
## I/O

- **Input**: `[1, 3, 320, 320]` float32, **NCHW**, RGB. Preprocessing: resize to 320×320,
  divide by the per-image max, then apply ImageNet normalization
  (`mean = [0.485, 0.456, 0.406]`, `std = [0.229, 0.224, 0.225]`).
- **Output**: `[1, 1, 320, 320]` saliency mask in `[0, 1]` (sigmoid). Upscale it to the
  original image size and use it as the foreground alpha.
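
The input preprocessing listed above can be sketched in plain Kotlin as follows. This is a minimal illustration, not code from this repository: the helper name `preprocessToNchw` is hypothetical, and it assumes the image has already been resized to 320×320 and is supplied as packed ARGB integers (the layout that Android's `Bitmap.getPixels` returns). It also assumes the per-image max is taken over all three channels together, as in the original U²-Net data loader.

```kotlin
// Hypothetical helper (not part of LiteRT or this repository).
// Converts packed ARGB pixels, already resized to size x size, into the
// NCHW float layout described above: [R plane, G plane, B plane].
fun preprocessToNchw(pixels: IntArray, size: Int = 320): FloatArray {
    require(pixels.size == size * size) { "expected ${size * size} pixels" }
    val mean = floatArrayOf(0.485f, 0.456f, 0.406f)
    val std = floatArrayOf(0.229f, 0.224f, 0.225f)

    // Per-image max, assumed here to be taken across all channels.
    var maxValue = 0
    for (p in pixels) {
        val r = (p shr 16) and 0xFF
        val g = (p shr 8) and 0xFF
        val b = p and 0xFF
        maxValue = maxOf(maxValue, maxOf(r, g, b))
    }
    // Guard against an all-black image to avoid dividing by zero.
    val scale = if (maxValue == 0) 1f else maxValue.toFloat()

    val plane = size * size
    val out = FloatArray(3 * plane)
    for (i in 0 until plane) {
        val p = pixels[i]
        val r = ((p shr 16) and 0xFF) / scale
        val g = ((p shr 8) and 0xFF) / scale
        val b = (p and 0xFF) / scale
        out[i] = (r - mean[0]) / std[0]              // channel 0: R
        out[plane + i] = (g - mean[1]) / std[1]      // channel 1: G
        out[2 * plane + i] = (b - mean[2]) / std[2]  // channel 2: B
    }
    return out
}
```

The returned array has `3 * 320 * 320` elements and can be written directly to the model's first input buffer.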
## Usage (Android, LiteRT CompiledModel)

```kotlin
import com.google.ai.edge.litert.Accelerator
import com.google.ai.edge.litert.CompiledModel

// Load the model from the app's assets and compile it for the GPU accelerator.
val model = CompiledModel.create(
    context.assets, "u2net_fp16.tflite",
    CompiledModel.Options(Accelerator.GPU), null
)
// Allocate buffers that match the model's input and output tensors.
val inputs = model.createInputBuffers()
val outputs = model.createOutputBuffers()
inputs[0].writeFloat(nchwFloatArray)  // preprocessed input, [1,3,320,320]
model.run(inputs, outputs)
val mask = outputs[0].readFloat()     // [1,1,320,320] in [0,1]
```
A complete Android sample (live camera + gallery background removal) is available in
[google-ai-edge/litert-samples](https://github.com/google-ai-edge/litert-samples).
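
For readers who want to see the last step in isolation, the sketch below turns the `mask` array into an alpha channel. It is an illustrative assumption, not the sample app's code: `applyMaskAsAlpha` is a hypothetical helper, pixels are assumed to be packed ARGB integers, and the mask is upscaled with nearest-neighbor sampling for brevity (bilinear interpolation would give smoother edges).

```kotlin
// Hypothetical helper (not part of LiteRT or the sample app).
// Replaces the alpha of each ARGB pixel with the upscaled saliency mask value.
fun applyMaskAsAlpha(
    pixels: IntArray,      // original image, packed ARGB, width * height entries
    width: Int,
    height: Int,
    mask: FloatArray,      // model output, maskSize * maskSize values in [0, 1]
    maskSize: Int = 320,
): IntArray {
    require(pixels.size == width * height) { "pixel count does not match size" }
    require(mask.size == maskSize * maskSize) { "mask count does not match size" }
    val out = IntArray(pixels.size)
    for (y in 0 until height) {
        // Nearest-neighbor lookup: map the image row to a mask row.
        val my = minOf(y * maskSize / height, maskSize - 1)
        for (x in 0 until width) {
            val mx = minOf(x * maskSize / width, maskSize - 1)
            val m = mask[my * maskSize + mx].coerceIn(0f, 1f)
            val alpha = (m * 255f + 0.5f).toInt()
            // Keep the RGB bits, overwrite the alpha bits.
            out[y * width + x] = (alpha shl 24) or (pixels[y * width + x] and 0x00FFFFFF)
        }
    }
    return out
}
```

On Android, the result could then be written back into an `ARGB_8888` bitmap with `Bitmap.setPixels`.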
## Performance

- ~147 ms per frame (about 6.8 frames per second) on the GPU of a Pixel 8a (Tensor G3, Mali).
## Conversion notes

Converted with `litert-torch` from the full U2NET model (44M parameters) and quantized to
float16 with `ai-edge-quantizer`; at 2 bytes per weight, this accounts for the 88 MB file
size. Verified: all ops are GPU-native. Output correlation against the PyTorch FP32
reference is 1.0 for the converted model before quantization and ~0.9999 for the FP16 build.
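
The notes do not say which correlation measure was used. Assuming it is the Pearson correlation coefficient between the flattened output masks, a check of this kind could be sketched as follows; `pearsonCorrelation` is a hypothetical helper, not part of the conversion tooling.

```kotlin
import kotlin.math.sqrt

// Hypothetical helper: Pearson correlation between two flattened outputs,
// e.g. the PyTorch reference mask and the LiteRT mask for the same input.
fun pearsonCorrelation(a: FloatArray, b: FloatArray): Double {
    require(a.size == b.size && a.isNotEmpty()) { "arrays must match and be non-empty" }
    var meanA = 0.0
    var meanB = 0.0
    for (i in a.indices) {
        meanA += a[i]
        meanB += b[i]
    }
    meanA /= a.size
    meanB /= b.size

    var cov = 0.0
    var varA = 0.0
    var varB = 0.0
    for (i in a.indices) {
        val da = a[i] - meanA
        val db = b[i] - meanB
        cov += da * db
        varA += da * da
        varB += db * db
    }
    // Returns NaN if either array is constant (zero variance).
    return cov / sqrt(varA * varB)
}
```

A value close to 1.0 means the two masks agree up to a positive linear rescaling, so it is worth also checking the maximum absolute difference when exact values matter.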
## License & attribution

- License: **Apache-2.0** (© the U²-Net authors,
  [xuebinqin/U-2-Net](https://github.com/xuebinqin/U-2-Net/blob/master/LICENSE)).
- This is a format conversion of the official U²-Net weights (no architectural changes);
  all credit to the original authors.