Image-to-Image
MLX
Safetensors
English
Chinese
qwen2_5_vl
apple-silicon
lance
bytedance
multimodal
text-to-image
image-editing
vqa
qwen2.5-vl
quantized
8-bit precision
Instructions to use mlx-community/Lance-3B-8bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use mlx-community/Lance-3B-8bit with MLX:
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir Lance-3B-8bit mlx-community/Lance-3B-8bit
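The CLI download above can also be done from Python. This is a minimal sketch, assuming the `huggingface_hub` package is installed; `local_dir_for` and `download_model` are hypothetical helper names used for illustration, not part of any library.

```python
def local_dir_for(repo_id: str) -> str:
    """Mirror the CLI example: use the repo name (after the slash) as the folder."""
    return repo_id.split("/")[-1]


def download_model(repo_id: str) -> str:
    """Download every file in the repo and return the local folder path."""
    # Imported lazily so local_dir_for() works even without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir_for(repo_id))


# Example call (performs a network download, so it is left commented out):
# path = download_model("mlx-community/Lance-3B-8bit")
```

The same helper works for any other repo id on the Hub, since only the `repo_id` argument changes.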
Update banner: mark superseded by Lance-3B-AWQ-INT4 for VQA; reflect 5c-3h research closure
README.md
CHANGED
@@ -20,18 +20,25 @@ tags:
 base_model: bytedance-research/Lance
 ---
 
-> ⚠️ **
->
->
->
->
->
->
-
-
->
->
->
+> ⚠️ **SUPERSEDED — DO NOT USE.** This 8-bit checkpoint produces visibly degraded t2i
+> output (ghost subject + rainbow striped artifacts vs bf16). Kept on HF for historical
+> reproducibility of the May 2026 quantization research record only.
+>
+> **What to use instead:**
+> - For full-quality `t2i` / `image_edit` / `x2t_image`: [`mlx-community/Lance-3B-bf16`](https://huggingface.co/mlx-community/Lance-3B-bf16) (~15 GB)
+> - For compressed `x2t_image` (VQA) on 8-16 GB Macs: [`mlx-community/Lance-3B-AWQ-INT4`](https://huggingface.co/mlx-community/Lance-3B-AWQ-INT4) (5.65 GB repo, 3.31 GB LLM, 6-9× faster decode)
+> - For image generation on small RAM: **no quantized variant is shippable** — use bf16 on a Mac that fits it. Phase 5c-3h showed the 80% HF detail loss is architectural (forward-pass error compounding through Lance's 2,160 evaluations per image), not a quant-scheme problem.
+
+> 🎓 **Quantization research closed (2026-05-26).** The May 2026 effort
+> investigated naive groupwise 4/8-bit, DWQ (4-bit UND-only), and AWQ
+> (4-bit + 8-bit, full + UND-only) across multiple configurations. AWQ math
+> is correct per-Linear (Phase 5c-3h empirical confirmation: -28% output MSE
+> average at 8-bit) but per-step quant improvements don't compound through
+> Lance's flow-matching architecture. No quant scheme tested would close
+> the t2i gap; k-quants from llama.cpp would face the same compounding
+> problem. **Lance-3B-AWQ-INT4 is the final shipping outcome — VQA only.**
+> Full research record: [`xocialize/lance-mlx`](https://github.com/xocialize/lance-mlx)
+> under `notes/phase5n_diagnostics/phase5c3_awq_port/`.
 
 ---
 
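The "What to use instead" guidance in the new banner amounts to a small decision rule. The sketch below encodes it under stated assumptions: the repo ids and task names come from the banner, while the function name `recommend_checkpoint` and the `compressed` flag are hypothetical and exist only for illustration.

```python
# Repo ids as listed in the banner.
BF16 = "mlx-community/Lance-3B-bf16"
AWQ_INT4 = "mlx-community/Lance-3B-AWQ-INT4"

# Task names as written in the banner.
TASKS = {"t2i", "image_edit", "x2t_image"}


def recommend_checkpoint(task: str, compressed: bool = False) -> str:
    """Return the repo the banner points to for a task.

    compressed=True means the caller wants a quantized model
    (for example on an 8-16 GB Mac).
    """
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    # Per the banner, only x2t_image (VQA) has a shippable quantized variant.
    if compressed and task == "x2t_image":
        return AWQ_INT4
    # Everything else, including image generation on small RAM, falls back to bf16.
    return BF16
```

Requesting a compressed model for `t2i` or `image_edit` returns the bf16 repo, which matches the banner's statement that no quantized variant is shippable for image generation.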