KilnImage 0.1.0: initial ERNIE-Image-Turbo iOS bundle
- .gitattributes +2 -0
- Ministral-3-3B-Instruct-2512-Q4_K_M.gguf +3 -0
- README.md +92 -0
- ae.safetensors +3 -0
- ernie-image-turbo-Q3_K_M.gguf +3 -0
.gitattributes
CHANGED
|
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
| 33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
+
Ministral-3-3B-Instruct-2512-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
| 37 |
+
ernie-image-turbo-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
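The two `.gguf` entries added above are exact-filename rules rather than a `*.gguf` glob. A minimal sketch of how a filename can be checked against the rules visible in this hunk follows. It assumes `fnmatchcase` is an acceptable stand-in for Git's own pattern matching (reasonable here because every pattern is a plain basename glob), and the helper name `matches_lfs_rule` is hypothetical. Lines 1-32 of `.gitattributes` are not shown in the diff, so the list is incomplete; `ae.safetensors` is stored as an LFS pointer in this commit, so presumably one of those earlier rules covers it.

```python
from fnmatch import fnmatchcase

# LFS rules visible in this hunk of .gitattributes (lines 33-37 only).
# Earlier lines are not shown in the diff, so this list is incomplete.
LFS_PATTERNS = [
    "*.zip",
    "*.zst",
    "*tfevents*",
    "Ministral-3-3B-Instruct-2512-Q4_K_M.gguf",
    "ernie-image-turbo-Q3_K_M.gguf",
]


def matches_lfs_rule(filename: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: does a bare filename match any listed pattern?

    fnmatchcase only approximates Git's matching; it is not Git's algorithm.
    """
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


if __name__ == "__main__":
    for name in ("ernie-image-turbo-Q3_K_M.gguf", "README.md", "ae.safetensors"):
        print(name, matches_lfs_rule(name))
```

With only the visible rules, `ae.safetensors` reports no match, which is why the note about the unseen earlier lines matters.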
Ministral-3-3B-Instruct-2512-Q4_K_M.gguf
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:fd46fc371ff0509bfa8657ac956b7de8534d7d9baaa4947975c0648c3aa397f4
|
| 3 |
+
size 2146497824
|
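The three added lines above are a Git LFS pointer: a spec version, a `sha256` object ID, and the byte size of the real file. A downloaded copy can be checked against those two values. The sketch below is one way to do that under stated assumptions: the function names `parse_lfs_pointer` and `verify_download` are hypothetical, and the local path in the usage comment is a placeholder, not something this commit defines.

```python
import hashlib
from pathlib import Path

# Pointer text copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:fd46fc371ff0509bfa8657ac956b7de8534d7d9baaa4947975c0648c3aa397f4
size 2146497824
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split 'key value' lines into a dict and break the oid into algo and digest."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_download(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash both match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated file
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER)
    print(pointer["algo"], pointer["size"])
    # Hypothetical usage once the file is on disk:
    # verify_download("Ministral-3-3B-Instruct-2512-Q4_K_M.gguf", pointer)
```

Checking the size before hashing keeps the failure path fast for an interrupted multi-gigabyte download.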
README.md
ADDED
|
@@ -0,0 +1,92 @@
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
language: en
|
| 4 |
+
pipeline_tag: text-to-image
|
| 5 |
+
tags:
|
| 6 |
+
- text-to-image
|
| 7 |
+
- diffusion
|
| 8 |
+
- ernie
|
| 9 |
+
- ernie-image
|
| 10 |
+
- dit
|
| 11 |
+
- turbo
|
| 12 |
+
- gguf
|
| 13 |
+
- quantized
|
| 14 |
+
- on-device
|
| 15 |
+
- ios
|
| 16 |
+
- mobile
|
| 17 |
+
- apple-silicon
|
| 18 |
+
base_model: baidu/ERNIE-Image-Turbo
|
| 19 |
+
---
|
| 20 |
+
|
| 21 |
+
# ERNIE-Image-Turbo — iOS bundle
|
| 22 |
+
|
| 23 |
+
<p align="center">
|
| 24 |
+
<a href="https://github.com/haplollc/KilnImage"><img alt="KilnImage" src="https://img.shields.io/badge/Runs%20on-KilnImage-orange" /></a>
|
| 25 |
+
<a href="https://huggingface.co/baidu/ERNIE-Image"><img alt="Upstream" src="https://img.shields.io/badge/Upstream-baidu%2FERNIE--Image-blue" /></a>
|
| 26 |
+
<img alt="License" src="https://img.shields.io/badge/license-Apache--2.0-lightgrey" />
|
| 27 |
+
<img alt="Params" src="https://img.shields.io/badge/params-8B-purple" />
|
| 28 |
+
<img alt="Distill" src="https://img.shields.io/badge/distill-Turbo-green" />
|
| 29 |
+
</p>
|
| 30 |
+
|
| 31 |
+
A mobile-friendly bundle of **ERNIE-Image-Turbo**, Baidu's 8B single-stream DiT distilled for fast inference, with state-of-the-art text rendering among open-weight models. It ships with the Ministral 3B text encoder and the FLUX.1 VAE for on-device inference via [**KilnImage**](https://github.com/haplollc/KilnImage).
|
| 32 |
+
|
| 33 |
+
ERNIE-Image-Turbo is particularly strong at:
|
| 34 |
+
- **Photorealism** at 1024×1024
|
| 35 |
+
- **Accurate text rendering inside images** — best-in-class among open models
|
| 36 |
+
- **Speed** — distilled to few-step sampling
|
| 37 |
+
|
| 38 |
+
## What's inside
|
| 39 |
+
|
| 40 |
+
| File | Role | Size |
|
| 41 |
+
|---|---|---|
|
| 42 |
+
| [`ernie-image-turbo-Q3_K_M.gguf`](./ernie-image-turbo-Q3_K_M.gguf) | Diffusion transformer — 8B params, Q3_K_M | 3.6 GB |
|
| 43 |
+
| [`Ministral-3-3B-Instruct-2512-Q4_K_M.gguf`](./Ministral-3-3B-Instruct-2512-Q4_K_M.gguf) | Text encoder (Mistral3 emits the 3072-dim conditioning tensor ERNIE expects) | 2.0 GB |
|
| 44 |
+
| [`ae.safetensors`](./ae.safetensors) | VAE (from FLUX.1) | 320 MB |
|
| 45 |
+
|
| 46 |
+
Total bundle: **~5.9 GB** (binary units, as in the table; about 6.4 GB in decimal units on disk). Total GPU residency: ~7 GB. **iPhone 16 Pro / 17 Pro / Mac** territory.
|
| 47 |
+
|
| 48 |
+
## Quick start (KilnImage)
|
| 49 |
+
|
| 50 |
+
```swift
|
| 51 |
+
import KilnImage
|
| 52 |
+
|
| 53 |
+
let docs = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
|
| 54 |
+
|
| 55 |
+
let engine = try Engine(models: ModelFiles(
|
| 56 |
+
    diffusionModel: docs.appendingPathComponent("ernie-image-turbo-Q3_K_M.gguf"),
|
| 57 |
+
    vae: docs.appendingPathComponent("ae.safetensors"),
|
| 58 |
+
    textEncoder: docs.appendingPathComponent("Ministral-3-3B-Instruct-2512-Q4_K_M.gguf")
|
| 59 |
+
))
|
| 60 |
+
|
| 61 |
+
let image = try await engine.generate(.init(
|
| 62 |
+
    prompt: "a vintage diner sign that reads \"OPEN 24/7\" in red neon, dusk lighting, photorealistic",
|
| 63 |
+
    width: 1024, height: 1024,
|
| 64 |
+
    steps: 8,       // few-step sampling enabled by the Turbo distillation
|
| 65 |
+
    cfgScale: 1.0   // guidance is distilled into the model, so 1.0 adds no extra CFG
|
| 66 |
+
))
|
| 67 |
+
```
|
| 68 |
+
|
| 69 |
+
## Why ERNIE-Image-Turbo
|
| 70 |
+
|
| 71 |
+
If you need **text inside images** that actually renders correctly (signs, labels, captions, UI mocks), this is currently the strongest open-weight option. Mid-2025 evaluations showed ERNIE-Image meeting or beating GPT-Image-1 on text rendering despite being 1/10th the size.
|
| 72 |
+
|
| 73 |
+
## Provenance
|
| 74 |
+
|
| 75 |
+
| Component | Upstream | License |
|
| 76 |
+
|---|---|---|
|
| 77 |
+
| Diffusion transformer | [baidu/ERNIE-Image](https://github.com/baidu/ERNIE-Image) | Apache 2.0 |
|
| 78 |
+
| GGUF conversion | [unsloth/ERNIE-Image-Turbo-GGUF](https://huggingface.co/unsloth/ERNIE-Image-Turbo-GGUF) | Apache 2.0 |
|
| 79 |
+
| Text encoder | [unsloth/Ministral-3-3B-Instruct-2512-GGUF](https://huggingface.co/unsloth/Ministral-3-3B-Instruct-2512-GGUF) | Apache 2.0 |
|
| 80 |
+
| VAE | [ffxvs/vae-flux](https://huggingface.co/ffxvs/vae-flux) (re-host of FLUX.1's `ae.safetensors`) | FLUX-1-dev-non-commercial |
|
| 81 |
+
|
| 82 |
+
## Performance (KilnImage, rough)
|
| 83 |
+
|
| 84 |
+
| Device | Time per 1024×1024 image (8 steps) |
|
| 85 |
+
|---|---|
|
| 86 |
+
| iPhone 17 Pro | ~3 min |
|
| 87 |
+
| iPhone 16 Pro | ~5 min |
|
| 88 |
+
| M2 / M3 Mac | ~6 min |
|
| 89 |
+
|
| 90 |
+
## Built by
|
| 91 |
+
|
| 92 |
+
[Haplo](https://haplo.app) · [KilnImage on GitHub](https://github.com/haplollc/KilnImage)
|
ae.safetensors
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:afc8e28272cd15db3919bacdb6918ce9c1ed22e96cb12c4d5ed0fba823529e38
|
| 3 |
+
size 335304388
|
ernie-image-turbo-Q3_K_M.gguf
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:3c1813fc1e0e904cc342e7b6791d0165e6dbb6aac30ad2924747b198bc435857
|
| 3 |
+
size 3909632704
|
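The `size` fields of the three LFS pointers in this commit allow a quick cross-check of the sizes quoted in the README. The sketch below sums the byte counts copied from the pointers and reports the total in both binary (GiB) and decimal (GB) units; the helper name `summarize` is hypothetical. It suggests the README's "3.6 GB / 2.0 GB / 320 MB / ~5.9 GB" figures are binary units, while the same total is about 6.39 GB in decimal units.

```python
# Byte sizes copied from the Git LFS pointers in this commit.
SIZES_BYTES = {
    "ernie-image-turbo-Q3_K_M.gguf": 3_909_632_704,
    "Ministral-3-3B-Instruct-2512-Q4_K_M.gguf": 2_146_497_824,
    "ae.safetensors": 335_304_388,
}

GIB = 2**30  # binary gigabyte
GB = 10**9   # decimal gigabyte


def summarize(sizes: dict) -> dict:
    """Return the total size in bytes, GiB, and GB."""
    total = sum(sizes.values())
    return {"bytes": total, "gib": total / GIB, "gb": total / GB}


if __name__ == "__main__":
    for name, size in SIZES_BYTES.items():
        print(f"{name}: {size / GIB:.2f} GiB ({size / GB:.2f} GB)")
    summary = summarize(SIZES_BYTES)
    print(f"total: {summary['gib']:.2f} GiB ({summary['gb']:.2f} GB)")
```

The gap of roughly 0.4 GB between the two conventions is worth keeping in mind when budgeting storage on a phone, where free space is usually reported in decimal units.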