Qwen3-Coder-Next-REAP-40B-A3B has the following specifications:
- Type: Causal Language Models
- Number of Parameters: 40B in total and 3B activated
- Hidden Dimension: 2048
- Number of Layers: 48
- Hybrid Layout: 12 * (3 * (Gated DeltaNet -> MoE) -> 1 * (Gated Attention -> MoE))
- Gated Attention:
  - Number of Attention Heads: 16 for Q and 2 for KV
  - Head Dimension: 256
  - Rotary Position Embedding Dimension: 64
- Gated DeltaNet:
  - Number of Linear Attention Heads: 32 for V and 16 for QK
  - Head Dimension: 128
- Mixture of Experts:
  - Number of Experts: 256 (uniformly pruned from 512)
  - Number of Activated Experts: 10
  - Number of Shared Experts: 1
- Context Length: 262,144 natively
- Compression Method: REAP (Router-weighted Expert Activation Pruning)
- Compression Ratio: 50% expert pruning
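
The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not code from the model's repository: it expands the hybrid layout into a per-layer list, confirms that it yields 48 layers, and estimates the key/value cache of the Gated Attention layers. It rests on two assumptions that the list does not state: that only the Gated Attention layers keep a cache that grows with sequence length (the Gated DeltaNet state and any other buffers are ignored), and that cached values are stored in a 2-byte dtype such as bfloat16. All function and constant names are illustrative.

```python
# Values copied from the specification list.
NUM_LAYERS = 48
NUM_BLOCKS = 12              # the "12 *" in the hybrid layout
DELTANET_PER_BLOCK = 3       # 3 * (Gated DeltaNet -> MoE)
ATTENTION_PER_BLOCK = 1      # 1 * (Gated Attention -> MoE)
NUM_KV_HEADS = 2
HEAD_DIM = 256
CONTEXT_LENGTH = 262_144
EXPERTS_BEFORE = 512
EXPERTS_AFTER = 256

# Assumption (not in the model card): 2 bytes per cached value, e.g. bfloat16.
BYTES_PER_VALUE = 2


def build_layout():
    """Expand the hybrid layout into one token-mixer name per layer."""
    block = (["gated_deltanet"] * DELTANET_PER_BLOCK
             + ["gated_attention"] * ATTENTION_PER_BLOCK)
    return block * NUM_BLOCKS


def kv_cache_bytes(num_tokens, bytes_per_value=BYTES_PER_VALUE):
    """Estimate KV-cache size, counting only the Gated Attention layers."""
    attention_layers = build_layout().count("gated_attention")
    # Factor 2 covers keys and values; each has NUM_KV_HEADS * HEAD_DIM entries.
    per_token_per_layer = 2 * NUM_KV_HEADS * HEAD_DIM * bytes_per_value
    return attention_layers * per_token_per_layer * num_tokens


layout = build_layout()
pruned_fraction = 1 - EXPERTS_AFTER / EXPERTS_BEFORE

if __name__ == "__main__":
    print("total layers:", len(layout))
    print("gated attention layers:", layout.count("gated_attention"))
    print("fraction of experts pruned:", pruned_fraction)
    print("KV cache at full context (GiB):",
          kv_cache_bytes(CONTEXT_LENGTH) / 2**30)
```

Under these assumptions the estimate comes to 24,576 bytes per token, or 6 GiB at the full 262,144-token context. Real memory use depends on the inference framework, cache dtype, and batch size, so treat the number as a rough lower bound for the attention cache alone.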
