Update README.md
README.md CHANGED

@@ -173,6 +173,8 @@ More usage can be found [here](https://docs.sglang.ai/basic_usage/send_request.h
 
 ### vLLM
 
+For latest guidance, please refer to the vLLM [`instructions`](https://docs.vllm.ai/projects/recipes/en/latest/inclusionAI/Ring-1T-FP8.html).
+
 #### Environment Preparation
 
 ```bash
@@ -207,7 +209,6 @@ To handle long context in vLLM using YaRN, we need to follow these two steps:
 ```
 2. Use an additional parameter `--max-model-len` to specify the desired maximum context length when starting the vLLM service.
 
-For detailed guidance, please refer to the vLLM [`instructions`](https://docs.vllm.ai/en/latest/).
 
 ## Finetuning
 
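The `--max-model-len` step referenced in the second hunk can be sketched as a launch command. This is a minimal illustration, not part of the commit: the model tag and context length below are assumptions drawn from the linked recipe URL, and should be adapted to your deployment.

```shell
# Sketch (assumption, not from the commit): start a vLLM OpenAI-compatible
# server with an enlarged maximum context length via --max-model-len.
# The model tag and 131072-token length are illustrative.
vllm serve inclusionAI/Ring-1T-FP8 \
    --max-model-len 131072
```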