Update README.md
README.md

Note: A low `threshold` may cause stuttering as a trade-off for quicker inference.

We recommend using an output length of 16384 tokens for most scenarios.
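
As a concrete illustration, the sketch below caps generation at that 16384-token budget. It assumes a Hugging Face-style interface with `trust_remote_code`, and the model ID mirrors the ModelScope repository linked below; neither detail is prescribed by this section, so treat it as a sketch rather than the official usage.

```python
# Minimal sketch: capping generation at the recommended 16384 output tokens.
# The model ID and the Hugging Face-style generate() call are assumptions;
# a diffusion LLM may expose a custom generation entry point via remote code.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/LLaDA2.1-flash"  # assumed ID, mirroring the ModelScope repo
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Summarize diffusion language models.", return_tensors="pt")
# max_new_tokens enforces the recommended 16384-token output length.
outputs = model.generate(**inputs, max_new_tokens=16384)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```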

---

## 🤖ModelScope

If you're in mainland China, we strongly recommend using our model from 🤖[ModelScope](https://modelscope.cn/models/inclusionAI/LLaDA2.1-flash).
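
If you go that route, a minimal sketch of pulling the weights with the `modelscope` Python package is shown below; it uses the library's standard `snapshot_download` helper and is an illustration, not an official install path from this README.

```python
# Minimal sketch: downloading LLaDA2.1-flash from ModelScope.
# Requires `pip install modelscope`; snapshot_download caches the weights
# locally and returns the directory they were written to.
from modelscope import snapshot_download

model_dir = snapshot_download("inclusionAI/LLaDA2.1-flash")
print(f"Model weights downloaded to: {model_dir}")
```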

---

## Deployment

### SGLang

SGLang enables dLLM inference either through offline batching or by launching an HTTP server for online requests. You can start SGLang in either mode using the following commands:
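
The commands themselves fall outside this excerpt, so the snippet below is only a hedged sketch of the offline-batching path using SGLang's generic `Engine` API; the model ID and the absence of any dLLM-specific flags are assumptions, and the full README should be consulted for the exact commands.

```python
# Minimal sketch of SGLang offline batch inference. The model ID and the
# lack of dLLM-specific engine options are assumptions, not taken from
# this README excerpt.
import sglang as sgl

if __name__ == "__main__":
    # Offline mode: load the model in-process, with no HTTP server.
    llm = sgl.Engine(model_path="inclusionAI/LLaDA2.1-flash")

    prompts = ["Explain diffusion language models in one paragraph."]
    # Stay within the recommended 16384-token output length.
    sampling_params = {"temperature": 0.0, "max_new_tokens": 16384}

    outputs = llm.generate(prompts, sampling_params)
    for prompt, output in zip(prompts, outputs):
        print(prompt, output["text"], sep="\n")

    llm.shutdown()
```

For the online path, SGLang's standard entry point is `python -m sglang.launch_server --model-path <model> --port 30000`, which exposes an HTTP endpoint; again, any dLLM-specific options would come from the full README rather than this sketch.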