INTELLECT-3.1 AWQ - INT4

Model Details

Quantization Details

Memory Usage

Type INTELLECT-3.1 INTELLECT-3.1-AWQ-4bit
Memory Size 199.0 GB 59.0 GB

Inference

Prerequisite

pip install -U vllm

Basic Usage

vllm serve cyankiwi/INTELLECT-3.1-AWQ-4bit \
    --tensor-parallel-size 2 \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_coder \
    --reasoning-parser deepseek_r1

Additional Information

Known Issues

  • tensor-parallel-size > 2 requires --enable-expert-parallel
  • No MTP implementation

Changelog

  • v0.9.0 - Initial quantized release without MTP implementation

Authors

INTELLECT-3.1

Prime Intellect Logo

INTELLECT-3.1: A 100B+ MoE trained with large-scale RL

Trained with prime-rl and verifiers
Environments released on Environments Hub
Read the Blog & Technical Report
X | Discord | Prime Intellect Platform

Introduction

INTELLECT-3.1 is a 106B (A12B) parameter Mixture-of-Experts reasoning model built as a continued training of INTELLECT-3 with additional reinforcement learning on math, coding, software engineering, and agentic tasks.

Training was performed with prime-rl using environments built with the verifiers library. All training and evaluation environments are available on the Environments Hub.

The model, training frameworks, and environments are open-sourced under fully-permissive licenses (MIT and Apache 2.0).

For more details, see the technical report.

Serving with vLLM

The model can be served on 2x H200s:

vllm serve PrimeIntellect/INTELLECT-3.1 \
    --tensor-parallel-size 2 \
    --enable-auto-tool-choice \
    --tool-call-parser qwen3_coder \
    --reasoning-parser deepseek_r1

Citation

@misc{intellect3.1,
  title={INTELLECT-3.1: Technical Report},
  author={Prime Intellect Team},
  year={2025},
  url={https://huggingface.co/PrimeIntellect/INTELLECT-3.1}
}
Downloads last month
75
Safetensors
Model size
19B params
Tensor type
I64
F32
I32
BF16
Inference Providers NEW
This model isn't deployed by any Inference Provider. 馃檵 Ask for provider support

Model tree for cyankiwi/INTELLECT-3.1-AWQ-4bit

Quantized
(6)
this model