---
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation
---
# Qwen3-32B

## Model Overview
**Qwen3-32B** has the following features:

- **Type:** Causal Language Models
- **Training Stage:** Pretraining & Post-training
- **Number of Parameters:** 32.8B
- **Number of Parameters (Non-Embedding):** 31.2B
- **Number of Layers:** 64
- **Number of Attention Heads (GQA):** 64 for Q and 8 for KV
- **Context Length:** 32,768 tokens natively, extensible to 131,072 tokens with YaRN.
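
The YaRN-extended context mentioned above is not active by default. As a minimal sketch, in the transformers ecosystem it is typically enabled by adding a `rope_scaling` entry to the model's `config.json`; the field names below follow the transformers YaRN convention and the factor shown is a plausible assumption (131,072 / 32,768 = 4), so verify them against the official Qwen3 documentation before use:

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```

Note that static YaRN scaling applies to all inputs, so it may slightly degrade quality on short sequences; it is generally recommended only when long-context processing is actually needed.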