richardmfan committed
Commit 754eb62 · verified · 1 Parent(s): 4cb603d

Update README.md

Files changed (1)
  1. README.md +17 -17
README.md CHANGED
@@ -59,23 +59,23 @@ vllm serve LLM360/K2-V2-Instruct --tensor-parallel-size 8 --port 8000 --revision
  K2-V2-Instruct uses `reasoning_effort="low"|"medium"|"high"` in the chat template to determine reasoning effort. If you cannot use `tokenizer.apply_chat_template`, you may also pass in these arguments using `extra_body` and `chat_template_kwargs`:
 
  ```
- curl -X POST "http://localhost:8000/v1/chat/completions" \
-   -H "Content-Type: application/json" \
-   -H "Authorization: Bearer key" \
-   -d $'{
-     "model": "LLM360/K2-V2-Instruct",
-     "messages": [
-       {
-         "role": "user",
-         "content": "Explain why the derivative of sin(x) is cos(x)."
-       }
-     ],
-     "extra_body": {
-       "chat_template_kwargs": {
-         "reasoning_effort": "high"
-       }
-     }
-   }'
+ from openai import OpenAI
+
+ client = OpenAI(
+     base_url="http://localhost:8000/v1",
+     api_key="key"
+ )
+
+ completion = client.chat.completions.create(
+     model="LLM360/K2-V2-Instruct",
+     messages = [
+         {"role": "system", "content": "You are K2, a helpful assistant created by Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) Institute of Foundation Models (IFM)."},
+         {"role": "user", "content": "Explain why the derivative of sin(x) is cos(x)."}
+     ],
+     extra_body={
+         "chat_template_kwargs": {"reasoning_effort": "high"},
+     },
+ )
  ```
 
  ---
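
For readers who still want to call the endpoint without the OpenAI client, the following is a minimal sketch of an equivalent raw HTTP request body. It rests on an assumption not stated in this commit: that `extra_body` is a convenience of the OpenAI Python client which merges its keys into the top level of the JSON body, so a hand-built request would carry `chat_template_kwargs` as a top-level field rather than nested under `"extra_body"`. The helper names `build_chat_payload` and `post_chat` are hypothetical and exist only for illustration; the URL, model name, and API key are the ones shown in the README.

```python
import json
import urllib.request

# Effort levels named in the README: reasoning_effort="low"|"medium"|"high".
VALID_EFFORTS = {"low", "medium", "high"}


def build_chat_payload(prompt, reasoning_effort="high",
                       model="LLM360/K2-V2-Instruct"):
    """Build a chat-completions JSON body (hypothetical helper).

    Assumption: the server reads `chat_template_kwargs` from the top level
    of the request body, which is where the OpenAI client's `extra_body`
    keys are expected to end up after merging.
    """
    if reasoning_effort not in VALID_EFFORTS:
        raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "chat_template_kwargs": {"reasoning_effort": reasoning_effort},
    }


def post_chat(payload, base_url="http://localhost:8000/v1", api_key="key"):
    """Send the payload to a running server and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server started as in the `vllm serve` command above, `post_chat(build_chat_payload("Explain why the derivative of sin(x) is cos(x)."))` would send the request; building the payload alone needs no running server, which makes the body easy to inspect before sending.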