# Qwen 3.5 122B-A10B — JANG_2S (Mixed-Precision, 2-bit)

**JANG — Jang Adaptive N-bit Grading** | Mixed-precision quantization for Apple Silicon

Osaurus natively supports JANG models. Download at [osaurus.ai](https://osaurus.ai).
## Model Details
| Property | Value |
|---|---|
| Base Model | Qwen 3.5 VL 122B-A10B |
| Architecture | MoE Transformer + Vision |
| Total Parameters | 122B (10B active per token) |
| Profile | JANG_2S |
| Avg Bits/Weight | 2.11 |
| Bit Widths Used | 2, 4, 6 |
| Model Size | 30.7 GB |
| Vision | Yes |
| Format | JANG v2 (MLX-native safetensors) |
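As a quick sanity check, the average bits/weight figure roughly predicts the model size. A back-of-envelope sketch (the gap to the listed size is presumably per-group quantization scales/biases and metadata, assuming the listed size is in GiB):

```python
# Back-of-envelope check: avg bits/weight vs. listed model size.
total_params = 122e9  # total (not active) parameters
avg_bits = 2.11       # from the table above

size_bytes = total_params * avg_bits / 8
print(f"~{size_bytes / 1e9:.1f} GB (SI), ~{size_bytes / 2**30:.1f} GiB")
# ~32.2 GB (SI), ~30.0 GiB -- in the same ballpark as the listed
# 30.7 GB once scales, biases, and unquantized tensors are counted.
```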
## Benchmarks
200-question MMLU subset (20 questions per subject × 10 subjects). Thinking OFF (`enable_thinking=False`), greedy decoding (`temp=0.0`).
| Model | MMLU | Size |
|---|---|---|
| JANG_2S (this) | 79% | 30.7 GB |
| MLX 2-bit | 56.5% | 36 GB |
| JANG_4K | 86% | 57.4 GB |
| MLX 4-bit | 85% | 64 GB |
JANG_2S scores 79% versus 56.5% for MLX 2-bit, a 22.5-point improvement, while also being smaller (30.7 GB vs 36 GB). It fits on 48 GB Macs, where the 64 GB MLX 4-bit model cannot.
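For reference, a minimal sketch of how a question from such a run could be reproduced with `mlx-lm`. The exact evaluation harness is not published in this card; the `enable_thinking` kwarg follows Qwen's chat-template convention, and the prompt here is a placeholder:

```python
# Hedged sketch: greedy, thinking-off generation with mlx-lm.
# Assumes the repo is MLX-loadable and the chat template accepts
# Qwen's `enable_thinking` kwarg (both assumptions, not confirmed here).
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler

model, tokenizer = load("OsaurusAI/Qwen3.5-122B-A10B-JANG_2S")

question = "Answer with A, B, C, or D only. ..."  # placeholder MMLU-style prompt
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": question}],
    add_generation_prompt=True,
    enable_thinking=False,  # thinking OFF, as in the benchmark
)

# temp=0.0 makes the sampler an argmax, i.e. greedy decoding.
answer = generate(model, tokenizer, prompt=prompt,
                  sampler=make_sampler(temp=0.0), max_tokens=8)
print(answer)
```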
### Per-Subject Breakdown
| Subject | JANG_2S |
|---|---|
| Abstract Algebra | 9/20 |
| Anatomy | 18/20 |
| Astronomy | 20/20 |
| College CS | 14/20 |
| College Physics | 15/20 |
| HS Biology | 19/20 |
| HS Chemistry | 18/20 |
| HS Mathematics | 11/20 |
| Logical Fallacies | 16/20 |
| World Religions | 18/20 |
| Total | 158/200 (79%) |
## JANG_2S Profile
JANG_2S is an aggressive 2-bit mixed-precision profile that makes the 122B-parameter model runnable on a single Mac. Critical layers (attention, MoE routing, embeddings) are kept at higher precision (4- and 6-bit), while the expert MLP weights are compressed to 2-bit; the general mechanism is sketched below.
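JANG's actual grading rules are not published in this card, but the idea maps onto MLX's per-layer quantization predicate. An illustrative sketch only; the path patterns and bit assignments below are hypothetical stand-ins, not JANG's real policy:

```python
# Illustrative only: mixed-precision quantization via mlx.nn.quantize.
# Layer-name patterns and bit choices are hypothetical placeholders.
import mlx.nn as nn

def mixed_precision_predicate(path: str, module: nn.Module):
    if not hasattr(module, "to_quantized"):
        return False  # this layer type cannot be quantized
    if "embed" in path or "lm_head" in path:
        return {"group_size": 64, "bits": 6}   # protect embeddings
    if "self_attn" in path or "gate" in path:  # attention + MoE router
        return {"group_size": 64, "bits": 4}
    return {"group_size": 64, "bits": 2}       # expert MLPs -> 2-bit

# model = ...  # an MLX Qwen MoE model instance
# nn.quantize(model, class_predicate=mixed_precision_predicate)
```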
## Usage
```bash
# Requires Osaurus (https://osaurus.ai)
osaurus serve OsaurusAI/Qwen3.5-122B-A10B-JANG_2S
```
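Once the server is running, it can be queried like any OpenAI-compatible endpoint. A hedged example: the host, port, and route are assumptions about Osaurus's defaults, so check the Osaurus docs for the actual values:

```python
# Hypothetical client call; assumes Osaurus exposes an OpenAI-compatible
# API on localhost (the port below is a placeholder -- see osaurus.ai).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1337/v1", api_key="osaurus")
resp = client.chat.completions.create(
    model="OsaurusAI/Qwen3.5-122B-A10B-JANG_2S",
    messages=[{"role": "user", "content": "In one sentence, what is MoE?"}],
    temperature=0.0,
)
print(resp.choices[0].message.content)
```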
## Requirements
- Apple Silicon Mac with 48+ GB unified memory
- MLX framework with Qwen 3.5 MoE support
Quantized by Osaurus AI using JANG