---
license: apache-2.0
library_name: mlx
pipeline_tag: text-generation
base_model: Qwen/Qwen3-Next-80B-A3B-Thinking
tags:
  - mlx
  - qwen3_next
  - 6-bit
  - affine
  - text-generation
quantization_config:
  bits: 6
  mode: affine
  group_size: 64
model-index:
  - name: Qwen3-Next-80B-A3B-Thinking 6-bit (MLX)
    results: []
---

# Qwen3-Next-80B-A3B-Thinking — MLX 6-bit (affine)

Apple MLX-optimized 6-bit affine-quantized checkpoint of the base model
`Qwen/Qwen3-Next-80B-A3B-Thinking` for local inference on Apple Silicon.

## Key details
- Format: MLX runtime, safetensors sharded weights
- Quantization: affine int6, group_size=64
- Task: text generation / chat
- Tokenizer: provided via `tokenizer.json` (BPE) with `chat_template.jinja`
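The `mode: affine` / `group_size: 64` settings above mean each contiguous group of 64 weights shares one scale and one offset, with each weight stored as a 6-bit integer. A minimal NumPy sketch of that scheme (illustrative only — the function names are made up here, and the real MLX kernels pack bits differently):

```python
import numpy as np

def affine_quantize(w, bits=6, group_size=64):
    """Group-wise affine quantization: each group of `group_size` values
    gets its own scale and minimum (zero offset). Illustrative sketch,
    not the MLX implementation."""
    qmax = 2**bits - 1  # 63 for 6-bit
    groups = w.reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard constant groups
    q = np.clip(np.round((groups - w_min) / scale), 0, qmax).astype(np.uint8)
    return q, scale, w_min

def affine_dequantize(q, scale, w_min):
    """Reconstruct approximate weights from integers + per-group params."""
    return q * scale + w_min

rng = np.random.default_rng(0)
w = rng.standard_normal(256).astype(np.float32)
q, scale, zero = affine_quantize(w)
w_hat = affine_dequantize(q, scale, zero).reshape(-1)
# Reconstruction error is bounded by half a quantization step per group.
print(float(np.abs(w - w_hat).max()))
```

At 6 bits the worst-case error per weight is half a step (`scale / 2`), which is why group-wise scales matter: smaller groups track local weight ranges and keep `scale` tight.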

## Usage (MLX)
```bash
pip install mlx-lm

# Optional: generate directly from the bundled CLI
mlx_lm.generate --model abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-6bit-mlx \
  --prompt "List 5 creative dinner ideas." --max-tokens 200
```

```python
from mlx_lm import load, generate

repo_id = "abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-6bit-mlx"
model, tokenizer = load(repo_id)  # downloads and caches the sharded weights

# Wrap the user turn in the bundled chat template (chat_template.jinja) so
# the Thinking model sees the prompt format it was trained on.
messages = [{"role": "user", "content": "List 5 creative dinner ideas."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

out = generate(model, tokenizer, prompt=prompt, max_tokens=200)
print(out)
```

## Benchmarks
- To be added once the upload completes; see `scripts/bench/qwen_mxfp4_vs_int4.py` and `scripts/bench/model_queue_eval.py`.

## License
- Apache-2.0 for this packaging. See `LICENSE`.
- Base model license and terms apply (Qwen/Qwen3-Next-80B-A3B-Thinking).