---
license: apache-2.0
base_model: Qwen/Qwen3.5-0.8B
tags:
- qwen3.5
- text-only
- vllm
---

# Qwen3.5-0.8B Text-Only

Text-only weights extracted from [Qwen/Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) (VLM) for use with vLLM's `Qwen3_5ForCausalLM` architecture.

## What this is

Qwen3.5 models are natively multimodal (VLM). Their HuggingFace checkpoints use `Qwen3_5ForConditionalGeneration` with weights prefixed as `model.language_model.*`.

This repo provides the **language model backbone only**, with:

- `architectures: ["Qwen3_5ForCausalLM"]`
- `model_type: "qwen3_5_text"`
- Weight keys at `model.layers.*` (standard causal LM format, no `language_model.` prefix)
- Vision encoder and MTP weights removed

## Model structure

- **Architecture**: Hybrid GatedDeltaNet (24 layers) + Full Attention (8 layers)
- **Parameters**: ~0.8B (language model only, no vision encoder)
- **Dtype**: bfloat16

## How to use with vLLM

```python
from vllm import LLM

llm = LLM(model="codecho/Qwen3.5-0.8B-text-only", trust_remote_code=True)
```
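
## Key remapping sketch

The key layout described under "What this is" can be illustrated with a small remapping function. This is a minimal sketch and not necessarily the script used to build this repo: the function name `remap_keys` is hypothetical, and the prefixes `model.visual.` (vision encoder) and `mtp.` (MTP head) are assumptions about how the original checkpoint names those weights. Only the `model.language_model.*` to `model.*` rename comes from the description above.

```python
def remap_keys(
    state_dict,
    lm_prefix="model.language_model.",          # prefix stated in this README
    drop_prefixes=("model.visual.", "mtp."),    # ASSUMED names for vision / MTP weights
):
    """Return a new dict with text-only keys in causal LM layout.

    Keys under `drop_prefixes` are discarded, keys under `lm_prefix`
    are renamed to start with `model.`, and all other keys pass through.
    """
    remapped = {}
    for key, tensor in state_dict.items():
        # Skip weights that do not belong to the language model backbone.
        if key.startswith(drop_prefixes):
            continue
        # model.language_model.layers.0.* -> model.layers.0.*
        if key.startswith(lm_prefix):
            key = "model." + key[len(lm_prefix):]
        remapped[key] = tensor
    return remapped


# Toy example with placeholder values instead of real tensors.
toy = {
    "model.language_model.layers.0.mlp.up_proj.weight": "w0",
    "model.visual.blocks.0.attn.qkv.weight": "v0",  # assumed vision key
    "mtp.fc.weight": "m0",                          # assumed MTP key
    "lm_head.weight": "h0",
}
print(sorted(remap_keys(toy)))
```

With a real checkpoint, the same function could be applied to the tensors loaded from each shard before saving them back out, provided the assumed prefixes are first checked against the actual key names.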