baa-ai/GLM-5-SWAN-5bit-MLX

@@ -4,20 +4,15 @@ tags:
   - mlx
   - quantized
   - mixed-precision
-  - swan
 license: other
 license_name: polyform-noncommercial
 base_model: THU-KEG/GLM-5-0817
 base_model_relation: quantized
 ---
-<p align="center">
-  <img src="https://huggingface.co/spaces/baa-ai/MINT/resolve/main/baa-logo.svg" width="300" alt="baa.ai">
-</p>
 # GLM-5-SWAN-5bit-MLX
-Mixed-precision quantized version of [THUDM/GLM-5](https://huggingface.co/THUDM/GLM-5) using [SWAN](https://github.com/baa-ai/MINT) | [MINT-UI](https://github.com/baa-ai/MINT-UI).
 > GLM-5 (355B parameters). Experimental.
@@ -31,25 +26,6 @@ Mixed-precision quantized version of [THUDM/GLM-5](https://huggingface.co/THUDM/
 | WikiText-2 PPL | — |
-## 🚀 Create Your Own Custom Quantization
-**Don't see the size you need?** Use [**MINT-UI**](https://github.com/baa-ai/MINT-UI) to create a custom-sized quantization targeting your exact memory budget:
-```bash
-pip install mint-ui
-mint-ui
-```
-MINT-UI analyzes any model in **under 60 seconds** using a cutting-edge allocation technique — no calibration data needed. Specify your exact memory target (e.g., "fit in 24 GB for RTX 4090") and MINT returns a near-optimal per-tensor bit-width allocation.
-- ⚡ **60 seconds** analysis (vs hours for GPTQ/AWQ calibration)
-- 🎯 **Any target size** — not limited to uniform 4-bit or 8-bit
-- 🧠 **Data-free** — no calibration dataset required
-- 💻 **Runs on any Mac** — even 32 GB machines can analyze 400B models
-👉 **[Get MINT-UI](https://github.com/baa-ai/MINT-UI)** | 📄 **[MINT Paper](https://github.com/baa-ai/MINT) | [MINT-UI](https://github.com/baa-ai/MINT-UI)** | 🤗 **[All Models](https://huggingface.co/baa-ai)**
 ## Usage
 ```python
@@ -60,11 +36,6 @@ response = generate(model, tokenizer, prompt="Hello!", max_tokens=256)
 print(response)
 ```
-## About SWAN
-SWAN uses data-free per-tensor sensitivity analysis with composite scoring to allocate bit-widths across model layers.
-- [Paper](https://huggingface.co/spaces/baa-ai/MINT) | [Code](https://github.com/baa-ai/MINT) | [MINT-UI](https://github.com/baa-ai/MINT-UI) | [Models](https://huggingface.co/baa-ai)
 ---
 *Quantized by [baa.ai](https://baa.ai)*

   - mlx
   - quantized
   - mixed-precision
 license: other
 license_name: polyform-noncommercial
 base_model: THU-KEG/GLM-5-0817
 base_model_relation: quantized
 ---
 # GLM-5-SWAN-5bit-MLX
+Mixed-precision quantized version of [THUDM/GLM-5](https://huggingface.co/THUDM/GLM-5) optimised by [baa.ai](https://baa.ai).
 > GLM-5 (355B parameters). Experimental.
 | WikiText-2 PPL | — |
 ## Usage
 ```python
 print(response)
 ```
 ---
 *Quantized by [baa.ai](https://baa.ai)*