---
license: apache-2.0
base_model: Qwen/Qwen3.5-0.8B
tags:
- qwen3.5
- text-only
- vllm
---

# Qwen3.5-0.8B Text-Only

Text-only weights extracted from the vision-language model (VLM) [Qwen/Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) for use with vLLM's `Qwen3_5ForCausalLM` architecture.

## What this is

Qwen3.5 models are natively multimodal (VLM). Their Hugging Face checkpoints use `Qwen3_5ForConditionalGeneration`, with the language model weights stored under the `model.language_model.*` prefix. This repo provides the **language model backbone only**, with:

- `architectures: ["Qwen3_5ForCausalLM"]`
- `model_type: "qwen3_5_text"`
- Weight keys at `model.layers.*` (standard causal LM format, no `language_model.` prefix)
- Vision encoder and MTP weights removed

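The key renaming described above can be sketched in a few lines of plain Python. This is an illustration, not the script used to produce this repo: only the `model.language_model.` prefix is stated in this card, while the vision and MTP prefixes in `DROPPED_PREFIXES` and the `remap_keys` helper are hypothetical names chosen for the example.

```python
# Only "model.language_model." is stated in this card.
# The dropped prefixes below are assumptions, not verified checkpoint names.
LANGUAGE_PREFIX = "model.language_model."
DROPPED_PREFIXES = ("model.visual.", "mtp.")  # hypothetical vision / MTP prefixes


def remap_keys(keys):
    """Return {old_key: new_key} for every weight that should be kept."""
    remapped = {}
    for key in keys:
        if key.startswith(DROPPED_PREFIXES):
            continue  # vision encoder and MTP weights are removed
        if key.startswith(LANGUAGE_PREFIX):
            # model.language_model.layers.0... -> model.layers.0...
            remapped[key] = "model." + key[len(LANGUAGE_PREFIX):]
        else:
            remapped[key] = key  # any other key is kept unchanged
    return remapped
```

Applied to the key names of a real checkpoint, the returned mapping could then drive the tensor copy into a new weights file.
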
## Model structure

- **Architecture**: Hybrid GatedDeltaNet (24 layers) + Full Attention (8 layers)
- **Parameters**: ~0.8B (language model only, no vision encoder)
- **Dtype**: bfloat16

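As a rough guide to the memory footprint, the parameter count and dtype above give a back-of-the-envelope estimate for the weights alone. The sketch below assumes exactly 0.8e9 parameters (the card only says "~0.8B") and ignores KV cache, activations, and runtime overhead; `weight_memory_gib` is a helper name made up for this example.

```python
# Rough weight-memory estimate; excludes KV cache, activations, and overhead.
PARAM_COUNT = 0.8e9   # approximate, taken from "~0.8B" above
BYTES_PER_PARAM = 2   # bfloat16 stores each value in 16 bits


def weight_memory_gib(param_count, bytes_per_param):
    """Convert a parameter count and per-parameter size to GiB."""
    return param_count * bytes_per_param / 2**30


print(f"{weight_memory_gib(PARAM_COUNT, BYTES_PER_PARAM):.2f} GiB")
```

Under these assumptions the weights occupy roughly 1.5 GiB; actual usage in vLLM will be higher once the cache and runtime buffers are allocated.
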
## How to use with vLLM

```python
from vllm import LLM

# Load the text-only checkpoint from the Hugging Face Hub.
llm = LLM(model="codecho/Qwen3.5-0.8B-text-only", trust_remote_code=True)
```