LLM360/K2-V2-Instruct

richardmfan committed on Dec 5, 2025

Commit 754eb62 · verified · 1 Parent(s): 4cb603d

Update README.md

Files changed (1)

README.md +17 -17

@@ -59,23 +59,23 @@ vllm serve LLM360/K2-V2-Instruct --tensor-parallel-size 8 --port 8000 --revision
 K2-V2-Instruct uses `reasoning_effort="low"|"medium"|"high"` in the chat template to determine reasoning effort. If you cannot use `tokenizer.apply_chat_template`, you may also pass in these arguments using `extra_body` and `chat_template_kwargs`:
 ```
-curl -X POST "http://localhost:8000/v1/chat/completions" \
-  -H "Content-Type: application/json" \
-  -H "Authorization: Bearer key" \
-  -d $'{
-  "model": "LLM360/K2-V2-Instruct",
-  "messages": [
-    {
-      "role": "user",
-      "content": "Explain why the derivative of sin(x) is cos(x)."
-    }
-  ],
-  "extra_body": {
-    "chat_template_kwargs": {
-      "reasoning_effort": "high"
-    }
-  }
-}'
 ```
 ---

 K2-V2-Instruct uses `reasoning_effort="low"|"medium"|"high"` in the chat template to determine reasoning effort. If you cannot use `tokenizer.apply_chat_template`, you may also pass in these arguments using `extra_body` and `chat_template_kwargs`:
 ```
+from openai import OpenAI
+client = OpenAI(
+    base_url="http://localhost:8000/v1",
+    api_key="key"
+)
+completion = client.chat.completions.create(
+    model="LLM360/K2-V2-Instruct",
+    messages = [
+        {"role": "system", "content": "You are K2, a helpful assistant created by Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) Institute of Foundation Models (IFM)."},
+        {"role": "user", "content": "Explain why the derivative of sin(x) is cos(x)."}
+    ],
+    extra_body={
+        "chat_template_kwargs": {"reasoning_effort": "high"},
+    },
+)
 ```
 ---
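
For readers who still want to send a raw HTTP request instead of using the OpenAI client, here is a minimal sketch of the JSON body. It assumes that the client's `extra_body` argument merges its keys into the top level of the request, so that `chat_template_kwargs` sits next to `model` and `messages` rather than under an `extra_body` key. The helper name `build_chat_payload` and the `ALLOWED_EFFORTS` tuple are hypothetical and not part of the README; the three effort values are the ones the README lists.

```python
import json

# Effort levels listed in the README; anything else is rejected up front.
ALLOWED_EFFORTS = ("low", "medium", "high")


def build_chat_payload(prompt, reasoning_effort="high",
                       model="LLM360/K2-V2-Instruct"):
    """Build a chat-completions request body (hypothetical helper).

    Assumption: the server reads `chat_template_kwargs` from the top
    level of the JSON body, which is where the OpenAI client places
    anything passed via `extra_body`.
    """
    if reasoning_effort not in ALLOWED_EFFORTS:
        raise ValueError(
            f"reasoning_effort must be one of {ALLOWED_EFFORTS}, "
            f"got {reasoning_effort!r}"
        )
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Top-level key, not nested under "extra_body".
        "chat_template_kwargs": {"reasoning_effort": reasoning_effort},
    }


payload = build_chat_payload("Explain why the derivative of sin(x) is cos(x).")
print(json.dumps(payload, indent=2))
```

The printed JSON could then be posted to the `/v1/chat/completions` endpoint from the `vllm serve` command with any HTTP tool. Whether the server honours the key in this position is worth confirming against the vLLM version in use.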