---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
---

# Qwen3-32B

## Model Overview

Qwen3-32B has the following features:

- **Type:** Causal language model
- **Training Stage:** Pretraining & post-training
- **Number of Parameters:** 32.8B
- **Number of Parameters (Non-Embedding):** 31.2B
- **Number of Layers:** 64
- **Number of Attention Heads (GQA):** 64 for Q and 8 for KV
- **Context Length:** 32,768 tokens natively, extendable to 131,072 tokens with YaRN
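The YaRN context extension mentioned above is typically enabled through a `rope_scaling` entry in the model config. The sketch below illustrates how the extended context length follows from the scaling factor; the field names follow the Hugging Face `rope_scaling` convention, and the exact values shown are an illustration rather than a guarantee of this checkpoint's shipped configuration.

```python
# Illustrative rope_scaling block for YaRN-based context extension.
# A factor of 4.0 over the native 32,768-token window yields 131,072 tokens.
rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                           # scale RoPE positions 4x
    "original_max_position_embeddings": 32768,  # native context length
}

# Extended context = native context x scaling factor.
extended_context = int(
    rope_scaling["factor"] * rope_scaling["original_max_position_embeddings"]
)
print(extended_context)  # 131072
```

Because YaRN rescales rotary position embeddings, shorter prompts are unaffected; the scaling only matters when inputs approach or exceed the native 32,768-token window.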