Add Lens-3.8B-8bit DiT weights + model card
- .gitattributes +1 -0
- README.md +32 -6
- config.json +38 -0
- model.safetensors +3 -0
- sample.png +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+sample.png filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -2,14 +2,40 @@
 license: mit
 library_name: mlx
 pipeline_tag: text-to-image
-tags: [mlx, text-to-image, diffusion, lens, apple-silicon]
+tags: [mlx, text-to-image, diffusion, lens, apple-silicon, quantized]
 base_model: microsoft/Lens
 ---
 
-# Lens-3.8B-8bit (MLX)
+# Lens-3.8B-8bit (MLX)
 
-
-text-to-image DiT
+Apple **MLX** conversion of the [`microsoft/Lens`](https://huggingface.co/microsoft/Lens)
+3.8B text-to-image DiT — **int8 (group_size 64), keeping the small in/out/time projections at bf16. Higher-fidelity quant, ~2x smaller than bf16.** ~4.39 GB.
 
-
-
+This repo contains the **DiT only** (MIT). The full pipeline also uses the GPT-OSS-20B text
+encoder (Apache-2.0) and the FLUX.2 semantic VAE, pulled from source by the loader (see
+**License**). Full-precision: [Lens-3.8B-bf16](https://huggingface.co/mlx-community/Lens-3.8B-bf16).
+
+Fidelity: single-pass DiT cosine **0.99998** vs the PyTorch reference. Quantization changes the
+denoise trajectory, so quantized samples differ in composition from bf16 but are equally sharp.
+
+
+
+## Usage
+
+```python
+import mlx.core as mx
+from lens_mlx.pipeline_mlx import LensPipeline  # github.com/xocialize-code/lens-mlx
+
+pipe = LensPipeline.from_pretrained("path/to/Lens", quantize_bits=8)
+img = pipe("A serene lake below snow-capped mountains, golden hour.",
+           height=1024, width=1024, num_inference_steps=20, seed=42)
+img.save("out.png")
+```
+
+## License
+- **DiT weights (this repo):** MIT, from `microsoft/Lens`.
+- **GPT-OSS-20B encoder:** Apache-2.0 (reuse `mlx-community/gpt-oss-20b-MXFP4-*`).
+- **FLUX.2 VAE:** its own (FLUX.2-dev) terms — **not re-hosted**; fetched from source.
+
+Upstream: [microsoft/Lens](https://huggingface.co/microsoft/Lens) ·
+MLX port: [xocialize-code/lens-mlx](https://github.com/xocialize-code/lens-mlx)
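Reviewer note: the model card describes the weights as int8 with `group_size` 64 and reports a cosine similarity against a reference. The sketch below illustrates both ideas in plain Python: a generic per-group affine quantizer (one scale and one offset per 64 values) and a cosine comparison of the round-tripped values. It is an assumption-laden illustration, not the converter used for this repo and not necessarily bit-identical to MLX's scheme. It also measures a weight round trip on synthetic Gaussian data, whereas the card's 0.99998 figure refers to the DiT output versus the PyTorch reference. All function names are hypothetical.

```python
import math
import random


def quantize_group(values, bits=8):
    """Affine-quantize one group: returns (integer codes, scale, bias)."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Fall back to 1.0 when the group is constant to avoid dividing by zero.
    scale = (hi - lo) / levels or 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Map integer codes back to approximate float values."""
    return [c * scale + bias for c in codes]


def quantize_roundtrip(weights, group_size=64, bits=8):
    """Quantize and immediately dequantize, one group at a time."""
    restored = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        codes, scale, bias = quantize_group(group, bits)
        restored.extend(dequantize_group(codes, scale, bias))
    return restored


def cosine(a, b):
    """Cosine similarity between two equal-length sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Synthetic stand-in for a weight matrix (assumption: roughly Gaussian values).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(64 * 32)]
restored = quantize_roundtrip(weights, group_size=64, bits=8)
similarity = cosine(weights, restored)
print(f"cosine after int8 round trip: {similarity:.6f}")
```

Smaller groups give each scale fewer values to cover, which lowers the rounding error at the cost of storing more scales and offsets per weight.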
config.json
ADDED
@@ -0,0 +1,38 @@
+{
+  "_class_name": "LensTransformer2DModel",
+  "_diffusers_version": "0.37.1",
+  "attention_head_dim": 64,
+  "axes_dims_rope": [
+    8,
+    28,
+    28
+  ],
+  "enc_hidden_dim": 2880,
+  "gate_mlp": true,
+  "in_channels": 128,
+  "inner_dim": 1536,
+  "multi_layer_encoder_feature": true,
+  "num_attention_heads": 24,
+  "num_layers": 48,
+  "out_channels": 32,
+  "patch_size": 2,
+  "rms_norm": true,
+  "selected_layer_index": [
+    5,
+    11,
+    17,
+    23
+  ],
+  "mlx_format": true,
+  "quantization": {
+    "group_size": 64,
+    "bits": 8,
+    "keep_hi_precision": [
+      "img_in",
+      "txt_in",
+      "proj_out",
+      "time_text_embed",
+      "norm_out"
+    ]
+  }
+}
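Reviewer note: a few fields in this config constrain one another, and the `keep_hi_precision` list implies a per-module decision about what gets quantized. The sketch below checks the arithmetic on a trimmed copy of the config and shows one way such a list could be applied. The matching rule (compare the first component of a dotted module path) and the example module paths are assumptions made for illustration; the actual loader may match names differently.

```python
import json

# Trimmed copy of the fields from config.json that the checks below use.
CONFIG_JSON = """
{
  "attention_head_dim": 64,
  "axes_dims_rope": [8, 28, 28],
  "inner_dim": 1536,
  "num_attention_heads": 24,
  "quantization": {
    "group_size": 64,
    "bits": 8,
    "keep_hi_precision": ["img_in", "txt_in", "proj_out", "time_text_embed", "norm_out"]
  }
}
"""

config = json.loads(CONFIG_JSON)
quant = config["quantization"]

# 24 heads x 64 dims per head should equal the hidden width.
heads_ok = config["num_attention_heads"] * config["attention_head_dim"] == config["inner_dim"]
# The per-axis rotary dimensions should add up to one head's width.
rope_ok = sum(config["axes_dims_rope"]) == config["attention_head_dim"]
# The hidden width should split into whole quantization groups.
groups_ok = config["inner_dim"] % quant["group_size"] == 0


def should_quantize(module_path, keep):
    """Hypothetical rule: skip modules whose top-level name is in `keep`."""
    return module_path.split(".")[0] not in keep


# Example module paths are made up for illustration only.
plan = {
    path: should_quantize(path, quant["keep_hi_precision"])
    for path in ("img_in", "time_text_embed.linear_1", "transformer_blocks.0.attn.to_q")
}
print(heads_ok, rope_ok, groups_ok, plan)
```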
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:95d9af088a22132a1ab555bffff3861d0613d0537a65897eddbc0f21e685444a
+size 4386667927
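Reviewer note: the three added lines are a Git LFS pointer, not the weights themselves. The `size` field is in bytes, and 4386667927 bytes is about 4.39 GB in decimal units, which matches the figure in the model card. The sketch below parses the pointer and shows how a downloaded file could be checked against it. The helper names are hypothetical, and the check is exercised on an in-memory payload rather than the real file.

```python
import hashlib
import io

POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:95d9af088a22132a1ab555bffff3861d0613d0537a65897eddbc0f21e685444a
size 4386667927
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def digest_stream(stream, chunk_size=1 << 20):
    """Read a binary stream in chunks; return (byte count, sha256 hex digest)."""
    hasher = hashlib.sha256()
    size = 0
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
        size += len(chunk)
    return size, hasher.hexdigest()


def matches_pointer(stream, pointer):
    """True when the stream's size and digest agree with the pointer."""
    size, digest = digest_stream(stream)
    return size == pointer["size"] and digest == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
size_gb = pointer["size"] / 1e9  # decimal gigabytes
print(f"{pointer['algo']} pointer, {size_gb:.2f} GB")

# Self-check on a small in-memory payload instead of the multi-gigabyte file.
payload = b"demo payload"
demo_pointer = {"size": len(payload), "digest": hashlib.sha256(payload).hexdigest()}
demo_ok = matches_pointer(io.BytesIO(payload), demo_pointer)
```

For the real file, the same check would apply to `open("model.safetensors", "rb")` in place of the in-memory stream.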
sample.png
ADDED
Binary file stored with Git LFS.