Update README.md
README.md:

````diff
@@ -24,11 +24,6 @@ Need to install vllm nightly to get some recent changes:
 ```
 pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
 ```
-## Command Line
-Then we can serve with the following command:
-```
-vllm serve pytorch/Phi-4-mini-instruct-int4wo-hqq --tokenizer microsoft/Phi-4-mini-instruct -O3
-```
 
 ## Code Example
 ```
@@ -52,6 +47,13 @@ output = llm.chat(messages=messages, sampling_params=sampling_params)
 print(output[0].outputs[0].text)
 ```
 
+## Serving
+Then we can serve with the following command:
+```
+vllm serve pytorch/Phi-4-mini-instruct-int4wo-hqq --tokenizer microsoft/Phi-4-mini-instruct -O3
+```
+
+
 # Inference with Transformers
 
 Install the required packages:
````