---
license: apache-2.0
base_model: Qwen/Qwen3.5-0.8B
tags:
- qwen3.5
- text-only
- vllm
---

# Qwen3.5-0.8B Text-Only

Text-only weights extracted from [Qwen/Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) (VLM) for use with vLLM's `Qwen3_5ForCausalLM` architecture.

## What this is

Qwen3.5 models are natively multimodal vision-language models (VLMs). Their Hugging Face checkpoints use the `Qwen3_5ForConditionalGeneration` architecture and store the language model weights under the `model.language_model.*` prefix. This repo provides the **language model backbone only**, with:

- `architectures: ["Qwen3_5ForCausalLM"]`
- `model_type: "qwen3_5_text"`
- Weight keys at `model.layers.*` (standard causal LM format, no `language_model.` prefix)
- Vision encoder and MTP weights removed
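
The key renaming described above can be sketched roughly as follows. This is a minimal illustration, not the script used to build this repo: the function name `to_text_only` is hypothetical, and the `model.visual.` and `mtp.` prefixes for the dropped weights are assumptions that should be checked against the actual checkpoint keys. Only the `model.language_model.` prefix and the `model.layers.*` target format come from the description above.

```python
from typing import Any, Dict

# Prefix used by the multimodal checkpoint for language model weights.
LM_PREFIX = "model.language_model."

# ASSUMPTION: prefixes of the vision encoder and MTP weights.
# Verify these against the real checkpoint before relying on them.
DROP_PREFIXES = ("model.visual.", "mtp.")


def to_text_only(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of state_dict with causal-LM style keys.

    Keys under the assumed vision/MTP prefixes are dropped, and the
    `language_model.` segment is removed from the remaining keys.
    """
    out: Dict[str, Any] = {}
    for key, tensor in state_dict.items():
        if key.startswith(DROP_PREFIXES):
            continue  # skip vision encoder and MTP weights
        if key.startswith(LM_PREFIX):
            # model.language_model.layers.0.* -> model.layers.0.*
            key = "model." + key[len(LM_PREFIX):]
        out[key] = tensor
    return out
```

Keys that match neither prefix (for example an output head stored outside `model.language_model.*`) pass through unchanged in this sketch.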

## Model structure

- **Architecture**: Hybrid GatedDeltaNet (24 layers) + Full Attention (8 layers)
- **Parameters**: ~0.8B (language model only, no vision encoder)
- **Dtype**: bfloat16

## How to use with vLLM

```python
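from vllm import LLM

# Load the text-only checkpoint from this repo
llm = LLM(model="codecho/Qwen3.5-0.8B-text-only", trust_remote_code=True)
# Generate a completion for a sample prompt and print the text
outputs = llm.generate(["Hello, my name is"])
print(outputs[0].outputs[0].text)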
```