Add vLLM fork link for MLA detection support
README.md CHANGED
@@ -51,15 +51,23 @@ outputs = llm.generate(["Hello, world!"], SamplingParams(max_tokens=100))
 print(outputs[0].outputs[0].text)
 ```
 
-### vLLM
+### vLLM Fork Required
 
-Until upstream
+Until upstream vLLM adds MLA detection for `glm4_moe_lite`, use our fork:
 
-
-
-
-
-
+```bash
+pip install git+https://github.com/marksverdhei/vllm.git@fix/glm4-moe-mla-detection
+```
+
+Or install from source:
+```bash
+git clone https://github.com/marksverdhei/vllm.git
+cd vllm
+git checkout fix/glm4-moe-mla-detection
+pip install -e .
+```
+
+**Fork**: [marksverdhei/vllm](https://github.com/marksverdhei/vllm/tree/fix/glm4-moe-mla-detection)
 
 ## License
 