---
license: apache-2.0
base_model: Qwen/Qwen3.5-0.8B
tags:
- qwen3.5
- text-only
- vllm
---

# Qwen3.5-0.8B Text-Only

Text-only weights extracted from [Qwen/Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) (VLM) for use with vLLM's `Qwen3_5ForCausalLM` architecture.

## What this is

Qwen3.5 models are natively multimodal vision-language models (VLMs). Their HuggingFace checkpoints use the `Qwen3_5ForConditionalGeneration` architecture, with the language model weights stored under the `model.language_model.*` prefix. This repo provides the **language model backbone only**, with:

- `architectures: ["Qwen3_5ForCausalLM"]`
- `model_type: "qwen3_5_text"`
- Weight keys at `model.layers.*` (standard causal LM format, no `language_model.` prefix)
- Vision encoder and multi-token prediction (MTP) weights removed
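
The key changes listed above can be sketched as a small renaming pass over a checkpoint's state dict. This is a minimal illustration, not the script used to build this repo: the `model.visual.` and `mtp.` prefixes for the dropped weights are assumptions, and `remap_key` and `extract_text_weights` are hypothetical helper names.

```python
from typing import Any, Dict, Optional

# Assumed prefixes of weights that are not part of the text backbone.
DROP_PREFIXES = ("model.visual.", "mtp.")
# Prefix used by the multimodal checkpoint for language model weights.
LM_PREFIX = "model.language_model."


def remap_key(key: str) -> Optional[str]:
    """Return the text-only key name, or None if the tensor is dropped."""
    if key.startswith(DROP_PREFIXES):
        return None
    if key.startswith(LM_PREFIX):
        # model.language_model.layers.0... -> model.layers.0...
        return "model." + key[len(LM_PREFIX):]
    # Keys outside the language_model namespace are kept unchanged.
    return key


def extract_text_weights(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Build a new state dict containing only remapped text weights."""
    out = {}
    for key, tensor in state_dict.items():
        new_key = remap_key(key)
        if new_key is not None:
            out[new_key] = tensor
    return out
```

The helpers only touch key names, so they work on any mapping from names to tensors, for example one loaded from a safetensors shard.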

## Model structure

- **Architecture**: Hybrid GatedDeltaNet (24 layers) + Full Attention (8 layers)
- **Parameters**: ~0.8B (language model only, no vision encoder)
- **Dtype**: bfloat16

## How to use with vLLM

```python
from vllm import LLM

# Load the text-only checkpoint; vLLM reads the architecture from config.json.
llm = LLM(model="codecho/Qwen3.5-0.8B-text-only", trust_remote_code=True)
```
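
To go from loading to generating text, a minimal sketch follows. It assumes an installed vLLM version that supports `Qwen3_5ForCausalLM`; the prompt, the sampling values, and the `generate` helper name are illustrative and not part of this repo.

```python
MODEL_ID = "codecho/Qwen3.5-0.8B-text-only"
PROMPTS = ["Briefly explain what a causal language model is."]  # illustrative


def generate(prompts=PROMPTS, max_tokens=64):
    """Load the model with vLLM and return one completion per prompt."""
    # Imported lazily so this module can be loaded without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=MODEL_ID, trust_remote_code=True)
    # Sampling values are arbitrary examples, not tuned recommendations.
    params = SamplingParams(temperature=0.7, max_tokens=max_tokens)
    outputs = llm.generate(prompts, params)
    # Each RequestOutput holds a list of completions; take the first.
    return [out.outputs[0].text for out in outputs]
```

Calling `generate()` returns a list of strings, one per prompt. The model is loaded on each call, so reuse the `LLM` instance if you generate repeatedly.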