Update README.md
README.md
CHANGED
@@ -20,10 +20,6 @@ Qwen3.5 models are natively multimodal (VLM). Their HuggingFace checkpoints use
 - Weight keys at `model.layers.*` (standard causal LM format, no `language_model.` prefix)
 - Vision encoder and MTP weights removed
 
-## Why this exists
-
-When training frameworks like TRL save Qwen3.5 text-only checkpoints during GRPO/RL training, they produce this format. vLLM needs a public checkpoint in this format for CI testing of the `Qwen3_5ForCausalLM` code path. See [vllm-project/vllm#36275](https://github.com/vllm-project/vllm/issues/36275).
-
 ## Model structure
 
 - **Architecture**: Hybrid GatedDeltaNet (24 layers) + Full Attention (8 layers)