Update README.md
README.md

Note: A low `threshold` may cause stuttering as a trade-off for quicker inference.

We recommend using an output length of 16384 tokens for most scenarios.
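
As a concrete illustration, the sketch below caps generation at that 16384-token budget. It assumes a Hugging Face-style interface with `trust_remote_code`, and the model ID mirrors the ModelScope repository linked below; neither detail is prescribed by this section, so treat it as a sketch rather than the official usage.

```python
# Minimal sketch: capping generation at the recommended 16384 output tokens.
# The model ID and the Hugging Face-style generate() call are assumptions;
# a diffusion LLM may expose a custom generation entry point via remote code.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/LLaDA2.1-flash"  # assumed ID, mirroring the ModelScope repo
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Summarize diffusion language models.", return_tensors="pt")
# max_new_tokens enforces the recommended 16384-token output length.
outputs = model.generate(**inputs, max_new_tokens=16384)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```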

---

## 🤖ModelScope

If you're in mainland China, we strongly recommend using our model from 🤖[ModelScope](https://modelscope.cn/models/inclusionAI/LLaDA2.1-flash).
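
If you go that route, a minimal sketch of pulling the weights with the `modelscope` Python package is shown below; it uses the library's standard `snapshot_download` helper and is an illustration, not an official install path from this README.

```python
# Minimal sketch: downloading LLaDA2.1-flash from ModelScope.
# Requires `pip install modelscope`; snapshot_download caches the weights
# locally and returns the directory they were written to.
from modelscope import snapshot_download

model_dir = snapshot_download("inclusionAI/LLaDA2.1-flash")
print(f"Model weights downloaded to: {model_dir}")
```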

---

## Deployment

### SGLang

SGLang enables dLLM inference either through offline batching or by launching an HTTP server for online requests. You can start SGLang in either mode using the following commands:
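
The commands themselves fall outside this excerpt, so the snippet below is only a hedged sketch of the offline-batching path using SGLang's generic `Engine` API; the model ID and the absence of any dLLM-specific flags are assumptions, and the full README should be consulted for the exact commands.

```python
# Minimal sketch of SGLang offline batch inference. The model ID and the
# lack of dLLM-specific engine options are assumptions, not taken from
# this README excerpt.
import sglang as sgl

if __name__ == "__main__":
    # Offline mode: load the model in-process, with no HTTP server.
    llm = sgl.Engine(model_path="inclusionAI/LLaDA2.1-flash")

    prompts = ["Explain diffusion language models in one paragraph."]
    # Stay within the recommended 16384-token output length.
    sampling_params = {"temperature": 0.0, "max_new_tokens": 16384}

    outputs = llm.generate(prompts, sampling_params)
    for prompt, output in zip(prompts, outputs):
        print(prompt, output["text"], sep="\n")

    llm.shutdown()
```

For the online path, SGLang's standard entry point is `python -m sglang.launch_server --model-path <model> --port 30000`, which exposes an HTTP endpoint; again, any dLLM-specific options would come from the full README rather than this sketch.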