---
license: apache-2.0
base_model: black-forest-labs/FLUX.2-klein-4B
tags:
- core-ai
- coreai
- text-to-image
- flux
- flux2
- on-device
- apple-silicon
pipeline_tag: text-to-image
library_name: coreai
---
# FLUX.2 klein 4B — Core AI
[Black Forest Labs' **FLUX.2 [klein] 4B**](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B)
converted to **Core AI** for on-device image generation on Apple Silicon (macOS 27+),
running on Apple's official diffusion runtime in
[apple/coreai-models](https://github.com/apple/coreai-models).
FLUX.2 [klein] is step-distilled: **4 denoising steps at guidance 1.0** produce a full
1024×1024 image. It pairs a 4B flow-matching diffusion transformer (DiT) with a 4B
Qwen3 text encoder.
> **macOS only.** The peak memory footprint is ~6.5 GB, because the text encoder stays
> resident while the transformer runs. That exceeds the ~6.1 GB per-process memory limit
> of a 12 GB iPhone, even with the transformer AOT-compiled. For on-device image
> generation on iOS, use a smaller diffusion model (e.g. Stable Diffusion 0.9B).
## Components
| Component | Description |
| --- | --- |
| `Transformer.aimodel` | Flow-matching DiT (25 blocks), 1024×1024 |
| `TextEncoder.aimodel` | Qwen3 text encoder (hidden states 9 / 18 / 27) |
| `VAEDecoder.aimodel` | Latent → 1024×1024 RGB image |
| `VAEEncoder.aimodel` | 1024×1024 RGB image → latent (image-to-image) |
| `tokenizer/`, `pipeline.json`, `vae_bn_*.npy` | Sidecar assets (auto-loaded) |

Weights are 4-bit quantized (int4, per-block, block size 32), and compute precision is
float16. The full bundle is **4.0 GB**: Transformer 2.0 GB · TextEncoder 1.8 GB ·
VAE 0.16 GB.
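To make "int4, per-block, block size 32" concrete, here is a minimal sketch of per-block 4-bit quantization. It assumes symmetric quantization with one floating-point scale per block and signed codes in -8...7; the actual export may use a different scheme (for example zero points or packed storage). The names `QuantizedBlock`, `quantize`, and `dequantize` are hypothetical and not part of any Core AI API.

```swift
/// One quantized block: a shared scale plus 4-bit signed codes.
/// Codes are stored one per Int8 for clarity; real formats pack two per byte.
struct QuantizedBlock {
    let scale: Float
    let codes: [Int8]   // each value lies in -8...7
}

/// Symmetric per-block quantization (an assumption, see the note above).
func quantize(_ weights: [Float], blockSize: Int = 32) -> [QuantizedBlock] {
    var blocks: [QuantizedBlock] = []
    var start = 0
    while start < weights.count {
        let end = min(start + blockSize, weights.count)
        let block = weights[start..<end]
        let maxAbs = block.map { abs($0) }.max() ?? 0
        // Map the largest magnitude in the block to code 7.
        let scale: Float = maxAbs > 0 ? maxAbs / 7 : 1
        let codes = block.map { w -> Int8 in
            let q = (w / scale).rounded()
            return Int8(max(-8, min(7, q)))
        }
        blocks.append(QuantizedBlock(scale: scale, codes: codes))
        start = end
    }
    return blocks
}

/// Reconstructs approximate weights by multiplying each code by its block scale.
func dequantize(_ blocks: [QuantizedBlock]) -> [Float] {
    var weights: [Float] = []
    for block in blocks {
        for code in block.codes {
            weights.append(Float(code) * block.scale)
        }
    }
    return weights
}

// Usage: 64 evenly spaced weights in -0.5...0.5 form two blocks of 32.
let weights: [Float] = (0..<64).map { Float($0) / 63 - 0.5 }
let blocks = quantize(weights)
let restored = dequantize(blocks)
let maxError = zip(weights, restored).map { abs($0 - $1) }.max() ?? 0
print("blocks: \(blocks.count), max round-trip error: \(maxError)")
```

Because each block has its own scale, a single outlier only costs precision within its 32-weight block rather than across the whole tensor. Under this sketch's assumptions the round-trip error per weight is at most half a block's scale.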
## Usage
### Sample app (easiest)
[**CoreAIImageGen** (macOS)](https://github.com/john-rocky/coreai-model-zoo/tree/main/apps/CoreAIImageGen):
run the `CoreAIImageGenMac` scheme, click **Download & Load**, type a prompt, then click **Generate**.
### Swift
```swift
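import CoreAIDiffusionPipeline
import Foundation

// Directory holding the .aimodel bundles and sidecar assets.
let modelURL = URL(fileURLWithPath: "path/to/FLUX.2-klein-4B")
let pipeline = try await Flux2Pipeline(from: modelURL)

// The distilled model runs with 4 steps at guidance 1.0.
let config = PipelineConfiguration(
    prompt: "a photo of a cat",
    stepCount: 4,
    guidanceScale: 1.0,
    schedulerType: .discreteFlow
)

// The progress closure returns true on every step.
let result = try await pipeline.generateImages(configuration: config) { _ in true }
guard let image = result.images.first else {
    fatalError("The pipeline returned no images")
}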
```
### Command line (zoo reference tool)
```bash
swift run -c release diffusion-runner \
  --model path/to/FLUX.2-klein-4B \
  --prompt "a photo of a cat" --steps 4 --guidance-scale 1.0
```
## How it was converted
```bash
uv run coreai.diffusion.export flux2-klein-4b --platform macOS
```
## Performance
On an M4 Max (128 GB), a 4-step 1024×1024 image takes **~17 s** end to end, including
cold model load, the 4 denoising steps, and VAE decode. Because the model is distilled to
run at guidance 1.0, no negative prompt or classifier-free guidance (CFG) is needed.
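The ~17 s figure covers several phases at once. To see how they split on other hardware, one option is a small timing helper like the sketch below. `timed` is a hypothetical helper, not part of the Core AI runtime, and the workloads here are stand-ins; in practice the closures would wrap the pipeline load and the `generateImages` call from the Swift example.

```swift
/// Runs `work` once and returns its result with the elapsed wall-clock seconds.
func timed<T>(_ work: () async throws -> T) async rethrows -> (value: T, seconds: Double) {
    let clock = ContinuousClock()
    let start = clock.now
    let value = try await work()
    let elapsed = start.duration(to: clock.now)
    // Duration stores whole seconds and attoseconds (1e-18 s) separately.
    let parts = elapsed.components
    let seconds = Double(parts.seconds) + Double(parts.attoseconds) * 1e-18
    return (value, seconds)
}

// Stand-in workloads (assumptions): replace with the model load and generation calls.
let load = await timed { (1...1_000).reduce(0, +) }
let generate = await timed { (1...1_000).map { $0 * 2 }.count }
print("load: \(load.seconds) s, generate: \(generate.seconds) s")
```

Timing the load separately from generation shows how much of the total is the one-time cold start, which repeated generations with an already loaded pipeline would not pay again.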
## License
Apache 2.0, inherited from the base model
[black-forest-labs/FLUX.2-klein-4B](https://huggingface.co/black-forest-labs/FLUX.2-klein-4B).
The converted weights are redistributed under the same terms, with attribution to
Black Forest Labs.