lovedheart/Qwen3-Coder-Next-REAP-40B-A3B

Qwen3-Coder-Next-REAP-40B-A3B has the following specifications:

Type: Causal Language Models
Number of Parameters: 40B in total and 3B activated
Hidden Dimension: 2048
Number of Layers: 48
Hybrid Layout: 12 * (3 * (Gated DeltaNet -> MoE) -> 1 * (Gated Attention -> MoE))
Gated Attention:
  Number of Attention Heads: 16 for Q and 2 for KV
  Head Dimension: 256
  Rotary Position Embedding Dimension: 64
Gated DeltaNet:
  Number of Linear Attention Heads: 32 for V and 16 for QK
  Head Dimension: 128
Mixture of Experts:
  Number of Experts: 256 (uniformly pruned from 512)
  Number of Activated Experts: 10
  Number of Shared Experts: 1
Context Length: 262,144 natively
Compression Method: REAP (Router-weighted Expert Activation Pruning)
Compression Ratio: 50% expert pruning
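
The figures in the list above can be cross-checked with a little arithmetic. The following is a minimal sketch, not code from the model repository: it expands the hybrid layout into a per-layer list, confirms the 50% pruning ratio, and estimates the KV cache size at full context. It assumes the cache is stored in BF16 (2 bytes per value) and that only the Gated Attention layers keep a per-token KV cache, with the Gated DeltaNet layers holding a fixed-size state instead. Neither assumption is stated in the card, and all variable names are illustrative.

```python
# Values copied from the specification list above.
NUM_BLOCKS = 12            # outer repeat count in the hybrid layout
KV_HEADS = 2               # "2 for KV" under Gated Attention
ATTN_HEAD_DIM = 256        # Gated Attention head dimension
CONTEXT_LENGTH = 262_144   # native context length
EXPERTS_BEFORE, EXPERTS_AFTER = 512, 256

# Assumption (not stated in the card): cache values are stored in BF16.
BYTES_PER_VALUE = 2


def expand_layout(num_blocks: int) -> list[str]:
    """Expand num_blocks * (3 * Gated DeltaNet -> 1 * Gated Attention)."""
    block = ["gated_deltanet"] * 3 + ["gated_attention"]
    return block * num_blocks


layers = expand_layout(NUM_BLOCKS)
attention_layers = layers.count("gated_attention")

# Fraction of routed experts removed by pruning.
pruned_fraction = 1 - EXPERTS_AFTER / EXPERTS_BEFORE

# Assumption: only gated-attention layers keep a per-token cache.
# Per token: one K and one V vector per KV head per attention layer.
kv_bytes_per_token = attention_layers * 2 * KV_HEADS * ATTN_HEAD_DIM * BYTES_PER_VALUE
kv_bytes_full_context = kv_bytes_per_token * CONTEXT_LENGTH

print(f"layers: {len(layers)}, gated-attention layers: {attention_layers}")
print(f"experts pruned: {pruned_fraction:.0%}")
print(f"KV cache: {kv_bytes_per_token} bytes/token, "
      f"{kv_bytes_full_context / 2**30:.1f} GiB at full context")
```

Under these assumptions the layout expands to 48 layers, 12 of which are Gated Attention, and the cache comes to 24,576 bytes per token, or 6 GiB per sequence at the full 262,144-token context.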

Safetensors
Model size: 41B params
Tensor type: BF16
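
As a rough guide to what the reported model size and tensor type imply for storage, the weight footprint can be estimated as parameter count times bytes per parameter. This is a back-of-the-envelope sketch only: the 41B figure is the rounded value shown on the page, and the estimate ignores activations, KV cache, and any runtime overhead.

```python
# "41B params" as reported on the page (a rounded figure).
PARAMS = 41_000_000_000
# BF16 stores each parameter in 16 bits, i.e. 2 bytes.
BYTES_PER_PARAM = 2

weight_bytes = PARAMS * BYTES_PER_PARAM

print(f"approx. weight size: {weight_bytes / 1e9:.0f} GB "
      f"({weight_bytes / 2**30:.1f} GiB)")
```

That works out to roughly 82 GB of weights in BF16 before any quantization.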

Model tree for lovedheart/Qwen3-Coder-Next-REAP-40B-A3B

Base model: Qwen/Qwen3-Coder-Next
