---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
---
# Qwen3-32B

## Model Overview
**Qwen3-32B** has the following features:

- **Type:** Causal Language Models
- **Training Stage:** Pretraining & Post-training
- **Number of Parameters:** 32.8B
- **Number of Parameters (Non-Embedding):** 31.2B
- **Number of Layers:** 64
- **Number of Attention Heads (GQA):** 64 for Q and 8 for KV
- **Context Length:** 32,768 tokens natively, extensible to 131,072 tokens with YaRN.
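
The YaRN-extended context mentioned above is not active by default. As a minimal sketch, in the transformers ecosystem it is typically enabled by adding a `rope_scaling` entry to the model's `config.json`; the field names below follow the transformers YaRN convention and the factor shown is a plausible assumption (131,072 / 32,768 = 4), so verify them against the official Qwen3 documentation before use:

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```

Note that static YaRN scaling applies to all inputs, so it may slightly degrade quality on short sequences; it is generally recommended only when long-context processing is actually needed.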