---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
---
# Qwen3-32B
## Model Overview
Qwen3-32B has the following features:
- Type: Causal Language Model
- Training Stage: Pretraining & Post-training
- Number of Parameters: 32.8B
- Number of Parameters (Non-Embedding): 31.2B
- Number of Layers: 64
- Number of Attention Heads (GQA): 64 for Q and 8 for KV
- Context Length: 32,768 tokens natively, extendable to 131,072 tokens with YaRN.
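The extended 131,072-token context is reached by enabling YaRN RoPE scaling rather than by a different checkpoint. A minimal sketch of the relevant `config.json` addition, assuming the `rope_scaling` convention supported by recent `transformers` releases (the `factor` of 4.0 here is simply 131,072 / 32,768):

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```

Note that this static form of YaRN applies the same scaling factor regardless of input length, so it is best enabled only when prompts actually approach or exceed the native 32,768-token window.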