Outlier-10B-V3.2

Status: research preview. V3.3 alpha-fix overlay coming in a future release.

A ternary-quantized Mixture-of-Experts model built as a delta overlay on Qwen/Qwen2.5-7B-Instruct. 14 MoE layers × 8 experts each, with shared base FFN and ternary delta experts.
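The exact packing scheme for the delta experts is not documented here, but the summary table describes ternary weights stored as int8 with a per-row fp16 scale. A minimal sketch of that idea, assuming a standard threshold-based ternarization (the 0.7 threshold and the scale formula are illustrative choices, not this repo's documented values):

```python
import numpy as np

def ternary_quantize(w: np.ndarray, threshold: float = 0.7):
    """Quantize each row of `w` to {-1, 0, +1} int8 plus a per-row fp16 scale.

    `threshold` multiplies the per-row mean |w| to decide which weights snap
    to zero. Both the threshold and the scale formula are illustrative.
    """
    delta = threshold * np.abs(w).mean(axis=1, keepdims=True)
    q = np.zeros_like(w, dtype=np.int8)
    q[w > delta] = 1
    q[w < -delta] = -1
    # Per-row scale: mean magnitude of the weights that survived ternarization.
    nz = q != 0
    counts = np.maximum(nz.sum(axis=1, keepdims=True), 1)
    scale = ((np.abs(w) * nz).sum(axis=1, keepdims=True) / counts).astype(np.float16)
    return q, scale

def ternary_dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Reconstruct an fp32 approximation from ternary codes and row scales."""
    return q.astype(np.float32) * scale.astype(np.float32)
```

Storing only a sign per weight plus one fp16 scale per row is what makes the delta experts cheap relative to the shared base FFN.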

Model summary

| Field | Value |
|---|---|
| Base model | Qwen/Qwen2.5-7B-Instruct |
| Architecture | Outlier MoE (overlay on Qwen2.5) |
| Parameters | ~23B effective (delta + base) |
| Context length | 32,768 tokens |
| MoE layers | 14 (indices 7–20) |
| Experts per layer | 8 |
| Top-k routing | 2 |
| Expert quantization | Ternary (int8 + per-row fp16 scale) |
| MMLU | Not currently verified. Historical smoke test ~76%; pending full-sample re-evaluation. |
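The top-k routing in the table is standard MoE gating: each token's router logits pick the 2 best of 8 experts, and the selected logits are softmax-renormalized into mixing weights. A sketch of that selection step (the function name and shapes here are illustrative, not the repo's actual module):

```python
import numpy as np

def top2_route(logits: np.ndarray, k: int = 2):
    """Select top-k experts per token and renormalize their softmax weights.

    `logits`: (tokens, n_experts) router outputs. Returns (indices, weights),
    where weights sum to 1 over the k selected experts for each token.
    """
    # Indices of the k largest logits, best expert first.
    topk = np.argsort(logits, axis=-1)[:, -k:][:, ::-1]
    topk_logits = np.take_along_axis(logits, topk, axis=-1)
    # Softmax restricted to the selected experts.
    exp = np.exp(topk_logits - topk_logits.max(axis=-1, keepdims=True))
    weights = exp / exp.sum(axis=-1, keepdims=True)
    return topk, weights
```

With top-2 routing over 8 experts, only 2/8 of the delta-expert weights are touched per token, which is what keeps effective compute well below the ~23B parameter count.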

⚠️ Known issue: broken auto_map

The config.json auto_map references modeling_outlier_moe.Qwen2MoEForCausalLM, but the shipped modeling_outlier_moe.py defines OutlierMoEForCausalLM. Because of this class-name mismatch in the HF repo, loading via the standard from_pretrained path may fail.

Workaround: load with trust_remote_code=True. If you need a production-grade Outlier model, use Outlier-Ai/Outlier-70B-V3.3 instead; the 70B repo does not have this issue.

A fix for the 10B repo auto_map is pending. Until then, treat 10B as research preview, not for production.

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Outlier-Ai/Outlier-10B-V3.2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Outlier-Ai/Outlier-10B-V3.2",
    trust_remote_code=True,
    torch_dtype="bfloat16",
)
```

If the load fails with auto_map errors, see the workaround note above.

Provenance

  • MMLU: [UNVERIFIED]. The Day 12 smoke-test reading of 76.19% came from an lm_eval run with --limit 570, not a full-sample evaluation, and cannot be claimed as a benchmark result. See OUTLIER_GROUND_TRUTH_v10.md §2.1.
  • Secondary benchmarks: [UNVERIFIED].

Status

Research preview. Use Outlier-Ai/Outlier-70B-V3.3 for production deployment. 10B is kept available for architecture exploration and small-scale inference testing.

License

Apache 2.0
