---
license: other
tags:
- glm
- debug
- synthetic
---

# GLM-5.2 Debug

A tiny, randomly initialized, GLM-5.2-shaped `glm_moe_dsa` model for fast trainer and kernel iteration.

It preserves the fused DSA dimensions that FlashMLA requires (`kv_lora_rank=512`, `qk_rope_head_dim=64`, `v_head_dim=512`), along with 64 attention heads, DSA IndexShare, MoE and shared experts, and `model_type=glm_moe_dsa`.
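As a rough sanity check before running kernels, the preserved values can be compared against a loaded config dictionary. This is a minimal sketch, not part of the model's tooling: `config_mismatches` is a hypothetical helper, and the key `num_attention_heads` is an assumed name (this card only states "64 attention heads").

```python
# Values stated in this model card. "num_attention_heads" is an assumed key
# name; the card only says the model has 64 attention heads.
EXPECTED = {
    "model_type": "glm_moe_dsa",
    "kv_lora_rank": 512,
    "qk_rope_head_dim": 64,
    "v_head_dim": 512,
    "num_attention_heads": 64,
}


def config_mismatches(config: dict) -> list[str]:
    """Return one message per key whose value differs from EXPECTED."""
    problems = []
    for key, want in EXPECTED.items():
        got = config.get(key)  # None if the key is missing
        if got != want:
            problems.append(f"{key}: expected {want!r}, got {got!r}")
    return problems


# Hypothetical config with one deliberately wrong value.
example = dict(EXPECTED, v_head_dim=128)
print(config_mismatches(example))
```

In practice the dictionary would come from the checkpoint's config file; an empty result means the fused dimensions listed above are intact.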

It intentionally uses 8 layers, tiny hidden/MLP/MoE sizes, `vocab_size=2048`, and random weights. Feed it synthetic token IDs in `[0, 2047]`; because the weights are random, it is not intended for natural-language quality evaluation or sampling.
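One way to produce valid inputs is to draw token IDs uniformly from the vocabulary range. The sketch below uses only the Python standard library; `synthetic_batch` and its parameters are hypothetical names chosen for illustration, and the nested-list output would still need converting to whatever tensor type your trainer expects.

```python
import random

VOCAB_SIZE = 2048  # from this model card; valid IDs are 0..2047


def synthetic_batch(batch_size: int, seq_len: int, seed: int = 0) -> list[list[int]]:
    """Return batch_size sequences of seq_len random token IDs in [0, VOCAB_SIZE)."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    return [
        [rng.randrange(VOCAB_SIZE) for _ in range(seq_len)]
        for _ in range(batch_size)
    ]


batch = synthetic_batch(batch_size=2, seq_len=16)
print(len(batch), len(batch[0]))
```

A fixed seed keeps kernel comparisons repeatable across runs; change it to vary the inputs.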