Update README.md
K2-V2-Instruct uses `reasoning_effort="low"|"medium"|"high"` in its chat template to set the reasoning effort. If you cannot use `tokenizer.apply_chat_template`, you can also pass these arguments through `extra_body` and `chat_template_kwargs`:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="key",
)

completion = client.chat.completions.create(
    model="LLM360/K2-V2-Instruct",
    messages=[
        {"role": "system", "content": "You are K2, a helpful assistant created by Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) Institute of Foundation Models (IFM)."},
        {"role": "user", "content": "Explain why the derivative of sin(x) is cos(x)."},
    ],
    extra_body={
        "chat_template_kwargs": {"reasoning_effort": "high"},
    },
)
```
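For reference, the OpenAI Python client merges `extra_body` fields into the top level of the JSON request, which is where vLLM's OpenAI-compatible server looks for `chat_template_kwargs`. A minimal sketch of the resulting request body, trimmed to the relevant fields (the exact wire format is an assumption based on that merge behavior):

```python
import json

# Sketch of the JSON body the call above sends (relevant fields only).
# `chat_template_kwargs` is a vLLM extra parameter, not part of the
# standard OpenAI schema; `extra_body` merges it into the payload.
payload = {
    "model": "LLM360/K2-V2-Instruct",
    "messages": [
        {"role": "user", "content": "Explain why the derivative of sin(x) is cos(x)."},
    ],
    "chat_template_kwargs": {"reasoning_effort": "high"},
}
print(json.dumps(payload, indent=2))
```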

---