Update README.md
README.md CHANGED

@@ -173,6 +173,8 @@ More usage can be found [here](https://docs.sglang.ai/basic_usage/send_request.h
 
 ### vLLM
 
+For latest guidance, please refer to the vLLM [`instructions`](https://docs.vllm.ai/projects/recipes/en/latest/inclusionAI/Ring-1T-FP8.html).
+
 #### Environment Preparation
 
 ```bash
@@ -207,7 +209,6 @@ To handle long context in vLLM using YaRN, we need to follow these two steps:
 ```
 2. Use an additional parameter `--max-model-len` to specify the desired maximum context length when starting the vLLM service.
 
-For detailed guidance, please refer to the vLLM [`instructions`](https://docs.vllm.ai/en/latest/).
 
 ## Finetuning
 
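The `--max-model-len` step referenced in the second hunk can be sketched as a launch command. This is a minimal illustration, not part of the commit: the model tag and context length below are assumptions drawn from the linked recipe URL, and should be adapted to your deployment.

```shell
# Sketch (assumption, not from the commit): start a vLLM OpenAI-compatible
# server with an enlarged maximum context length via --max-model-len.
# The model tag and 131072-token length are illustrative.
vllm serve inclusionAI/Ring-1T-FP8 \
    --max-model-len 131072
```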