---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
---

# Qwen3-32B

## Model Overview

**Qwen3-32B** has the following features:

- **Type:** Causal Language Models
- **Training Stage:** Pretraining & Post-training
- **Number of Parameters:** 32.8B
- **Number of Parameters (Non-Embedding):** 31.2B
- **Number of Layers:** 64
- **Number of Attention Heads (GQA):** 64 for Q and 8 for KV
- **Context Length:** 32,768 natively and 131,072 tokens with YaRN.
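
A few useful quantities follow directly from the figures listed above. The minimal sketch below derives them with plain arithmetic: the approximate embedding parameter count, the number of query heads that share each KV head under GQA, and the scaling factor implied by extending the context with YaRN. The constant names, the `derived_specs` helper, and the `rope_scaling_sketch` dictionary are illustrative assumptions; in particular, the dictionary keys follow the `rope_scaling` convention commonly seen in Hugging Face Transformers configs and are not stated in this section.

```python
# Figures copied from the model overview above.
TOTAL_PARAMS_B = 32.8          # total parameters, in billions
NON_EMBEDDING_PARAMS_B = 31.2  # non-embedding parameters, in billions
NUM_Q_HEADS = 64               # query heads (GQA)
NUM_KV_HEADS = 8               # key/value heads (GQA)
NATIVE_CONTEXT = 32_768        # native context length, in tokens
YARN_CONTEXT = 131_072         # context length with YaRN, in tokens


def derived_specs() -> dict:
    """Derive secondary quantities from the published figures (hypothetical helper)."""
    return {
        # Parameters held in the embedding tables, in billions (rounded).
        "embedding_params_b": round(TOTAL_PARAMS_B - NON_EMBEDDING_PARAMS_B, 1),
        # With GQA, each KV head is shared by this many query heads.
        "q_heads_per_kv_head": NUM_Q_HEADS // NUM_KV_HEADS,
        # Ratio between the extended and native context lengths.
        "yarn_factor": YARN_CONTEXT / NATIVE_CONTEXT,
    }


specs = derived_specs()

# Assumed shape of a YaRN RoPE-scaling entry; verify the key names against
# the configuration format of the framework you actually use.
rope_scaling_sketch = {
    "rope_type": "yarn",
    "factor": specs["yarn_factor"],
    "original_max_position_embeddings": NATIVE_CONTEXT,
}

print(specs)
print(rope_scaling_sketch)
```

Under these figures, roughly 1.6B parameters sit in the embeddings, each KV head serves 8 query heads, and reaching 131,072 tokens corresponds to a 4x extension of the native context.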