Outlier-10B-V3.2

Status: research preview. V3.3 alpha-fix overlay coming in a future release.

A ternary-quantized Mixture-of-Experts model built as a delta overlay on Qwen/Qwen2.5-7B-Instruct. 14 MoE layers × 8 experts each, with shared base FFN and ternary delta experts.
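The exact packing scheme for the delta experts is not documented here, but the summary table describes ternary weights stored as int8 with a per-row fp16 scale. A minimal sketch of that idea, assuming a standard threshold-based ternarization (the 0.7 threshold and the scale formula are illustrative choices, not this repo's documented values):

```python
import numpy as np

def ternary_quantize(w: np.ndarray, threshold: float = 0.7):
    """Quantize each row of `w` to {-1, 0, +1} int8 plus a per-row fp16 scale.

    `threshold` multiplies the per-row mean |w| to decide which weights snap
    to zero. Both the threshold and the scale formula are illustrative.
    """
    delta = threshold * np.abs(w).mean(axis=1, keepdims=True)
    q = np.zeros_like(w, dtype=np.int8)
    q[w > delta] = 1
    q[w < -delta] = -1
    # Per-row scale: mean magnitude of the weights that survived ternarization.
    nz = q != 0
    counts = np.maximum(nz.sum(axis=1, keepdims=True), 1)
    scale = ((np.abs(w) * nz).sum(axis=1, keepdims=True) / counts).astype(np.float16)
    return q, scale

def ternary_dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Reconstruct an fp32 approximation from ternary codes and row scales."""
    return q.astype(np.float32) * scale.astype(np.float32)
```

Storing only a sign per weight plus one fp16 scale per row is what makes the delta experts cheap relative to the shared base FFN.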

Model summary

| Field | Value |
|---|---|
| Base model | Qwen/Qwen2.5-7B-Instruct |
| Architecture | Outlier MoE (overlay on Qwen2.5) |
| Parameters | ~23B effective (delta + base) |
| Context length | 32,768 tokens |
| MoE layers | 14 (indices 7–20) |
| Experts per layer | 8 |
| Top-k routing | 2 |
| Expert quantization | Ternary (int8 + per-row fp16 scale) |
| MMLU | Not currently verified. Historical smoke test ~76%; pending full-sample re-evaluation. |
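The top-k routing in the table is standard MoE gating: each token's router logits pick the 2 best of 8 experts, and the selected logits are softmax-renormalized into mixing weights. A sketch of that selection step (the function name and shapes here are illustrative, not the repo's actual module):

```python
import numpy as np

def top2_route(logits: np.ndarray, k: int = 2):
    """Select top-k experts per token and renormalize their softmax weights.

    `logits`: (tokens, n_experts) router outputs. Returns (indices, weights),
    where weights sum to 1 over the k selected experts for each token.
    """
    # Indices of the k largest logits, best expert first.
    topk = np.argsort(logits, axis=-1)[:, -k:][:, ::-1]
    topk_logits = np.take_along_axis(logits, topk, axis=-1)
    # Softmax restricted to the selected experts.
    exp = np.exp(topk_logits - topk_logits.max(axis=-1, keepdims=True))
    weights = exp / exp.sum(axis=-1, keepdims=True)
    return topk, weights
```

With top-2 routing over 8 experts, only 2/8 of the delta-expert weights are touched per token, which is what keeps effective compute well below the ~23B parameter count.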

⚠️ Known issue: broken auto_map

The config.json auto_map references modeling_outlier_moe.Qwen2MoEForCausalLM, but the shipped modeling_outlier_moe.py defines OutlierMoEForCausalLM. Because of this class-name mismatch in the HF repo, loading via the standard from_pretrained path may fail.

Workaround: load with trust_remote_code=True. If you need a production-grade Outlier model, use Outlier-Ai/Outlier-70B-V3.3 instead; the 70B repo does not have this issue.

A fix for the 10B repo auto_map is pending. Until then, treat 10B as research preview, not for production.

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Outlier-Ai/Outlier-10B-V3.2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Outlier-Ai/Outlier-10B-V3.2",
    trust_remote_code=True,
    torch_dtype="bfloat16",
)
```

If the load fails with auto_map errors, see the workaround note above.

Provenance

  • MMLU: [UNVERIFIED]. The Day 12 smoke-test reading of 76.19% came from an lm_eval run with --limit 570, not a full-sample evaluation, and cannot be claimed as a benchmark result. See OUTLIER_GROUND_TRUTH_v10.md §2.1.
  • Secondary benchmarks: [UNVERIFIED].

Status

Research preview. Use Outlier-Ai/Outlier-70B-V3.3 for production deployment. 10B is kept available for architecture exploration and small-scale inference testing.

License

Apache 2.0
