Update README.md
README.md
CHANGED
@@ -183,6 +183,8 @@ outputs = model.generate(inputs, max_new_tokens=256)
 print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
 ```

+**Note**: vLLM is the recommended engine for deployment, as SGLang currently lacks support for MoE models with tied embeddings (see [PR #20127](https://github.com/sgl-project/sglang/pull/20127)). If SGLang is required for your workflow, please use the specific build at commit e5f48b32abff027d859a43b7d5ba3aece04471c7.
+
 ## Citation

 ```bibtex