Commit 5d2df64 (verified) by marksverdhei · 1 parent: 60a4777

Add vLLM fork link for MLA detection support

Files changed (1)
  1. README.md +15 -7
README.md CHANGED
@@ -51,15 +51,23 @@ outputs = llm.generate(["Hello, world!"], SamplingParams(max_tokens=100))
 print(outputs[0].outputs[0].text)
 ```
 
-### vLLM Patches Required
+### vLLM Fork Required
 
-Until upstream support is added, you may need to patch vLLM:
+Until upstream vLLM adds MLA detection for `glm4_moe_lite`, use our fork:
 
-1. Add `glm4_moe_lite` to MLA detection in `vllm/config/model.py`
-2. Add registry mapping in `vllm/model_executor/models/registry.py`:
-```python
-"Glm4MoeLiteForCausalLM": ("deepseek_v2", "DeepseekV2ForCausalLM"),
-```
+```bash
+pip install git+https://github.com/marksverdhei/vllm.git@fix/glm4-moe-mla-detection
+```
+
+Or install from source:
+```bash
+git clone https://github.com/marksverdhei/vllm.git
+cd vllm
+git checkout fix/glm4-moe-mla-detection
+pip install -e .
+```
+
+**Fork**: [marksverdhei/vllm](https://github.com/marksverdhei/vllm/tree/fix/glm4-moe-mla-detection)
 
 ## License
 
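
Review note: after following either install path from this change, it can be useful to confirm that the environment really picked up the fork rather than an upstream wheel. The sketch below is one possible check, not part of the commit. It assumes pip recorded PEP 610 metadata (`direct_url.json`) for the `vllm` distribution, which it does for VCS and editable installs. The helper names `describe_install` and `vllm_install_source` are hypothetical.

```python
import json
from importlib import metadata


def describe_install(direct_url_json):
    """Summarise a PEP 610 direct_url.json payload (None if the file is absent)."""
    if direct_url_json is None:
        # pip writes no direct_url.json for ordinary index (PyPI) installs.
        return "index install (no direct_url.json recorded)"
    info = json.loads(direct_url_json)
    vcs = info.get("vcs_info")
    if vcs is not None:
        # VCS installs record the requested branch or tag and the resolved commit.
        revision = vcs.get("requested_revision", "default branch")
        return f"{info['url']} @ {revision} ({vcs['commit_id'][:7]})"
    if info.get("dir_info", {}).get("editable"):
        # `pip install -e .` records the local checkout path instead.
        return f"editable install from {info['url']}"
    return f"direct install from {info['url']}"


def vllm_install_source():
    """Look up how the local vLLM distribution was installed."""
    try:
        dist = metadata.distribution("vllm")
    except metadata.PackageNotFoundError:
        return "vllm is not installed"
    return describe_install(dist.read_text("direct_url.json"))


if __name__ == "__main__":
    print(vllm_install_source())
```

For the `pip install git+https://...` path, the summary should name the fork URL and the `fix/glm4-moe-mla-detection` branch. For the clone-and-`pip install -e .` path, it reports only the local directory, so the checked-out branch still has to be confirmed with `git status` inside the clone.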