add python codes to code blocks

#2
.eval_results/gpqa.yaml DELETED
@@ -1,7 +0,0 @@
1
- - dataset:
2
- id: Idavidrein/gpqa
3
- task_id: diamond
4
- value: 76.3
5
- source:
6
- url: https://huggingface.co/arcee-ai/Trinity-Large-Thinking
7
- name: Model Card
 
.eval_results/mmlu-pro.yaml DELETED
@@ -1,7 +0,0 @@
1
- - dataset:
2
- id: TIGER-Lab/MMLU-Pro
3
- task_id: mmlu_pro
4
- value: 83.4
5
- source:
6
- url: https://huggingface.co/arcee-ai/Trinity-Large-Thinking
7
- name: Model Card
 
.eval_results/swe-bench_verified.yaml DELETED
@@ -1,7 +0,0 @@
1
- - dataset:
2
- id: SWE-bench/SWE-bench_Verified
3
- task_id: swe_bench_%_resolved
4
- value: 63.2
5
- source:
6
- url: https://huggingface.co/arcee-ai/Trinity-Large-Thinking
7
- name: Model Card
 
.gitattributes CHANGED
@@ -34,5 +34,3 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
  tokenizer.json filter=lfs diff=lfs merge=lfs -text
37
- All[[:space:]]charts.jpg filter=lfs diff=lfs merge=lfs -text
38
- All_charts.jpg filter=lfs diff=lfs merge=lfs -text
 
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
  tokenizer.json filter=lfs diff=lfs merge=lfs -text
 
 
All_charts.jpg DELETED

Git LFS Details

  • SHA256: 7780a4bc991ece46293e7ba4f5209f992efcc7052c7cd949e1676cb970d2007a
  • Pointer size: 131 Bytes
  • Size of remote file: 140 kB
LICENSE DELETED
@@ -1,49 +0,0 @@
1
- OpenMDW License Agreement, version 1.1 (OpenMDW-1.1)
2
-
3
- By exercising rights granted to you under this agreement, you accept and agree
4
- to its terms.
5
-
6
- As used in this agreement, "Model Materials" means the materials provided to
7
- you under this agreement, consisting of: (1) one or more machine learning
8
- models (including architecture and parameters); and (2) all related artifacts
9
- (including associated data, documentation and software) that are provided to
10
- you hereunder.
11
-
12
- Subject to your compliance with this agreement, permission is hereby granted,
13
- free of charge, to deal in the Model Materials without restriction, including
14
- under all copyright, patent, database, and trade secret rights included or
15
- embodied therein.
16
-
17
- If you distribute any portion of the Model Materials, you shall retain in your
18
- distribution (1) a copy of this agreement, and (2) all copyright notices and
19
- other notices of origin included in the Model Materials that are applicable to
20
- your distribution.
21
-
22
- If you file, maintain, or voluntarily participate in a lawsuit against any
23
- person or entity asserting that the Model Materials directly or indirectly
24
- infringe any patent or copyright, then all rights and grants made to you
25
- hereunder are terminated, unless that lawsuit was in response to a
26
- corresponding lawsuit first brought against you.
27
-
28
- This agreement does not impose any restrictions or obligations with respect to
29
- any use, modification, or sharing of any outputs generated by using the Model
30
- Materials.
31
-
32
- THE MODEL MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
33
- OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
34
- FITNESS FOR A PARTICULAR PURPOSE, TITLE, NONINFRINGEMENT, ACCURACY, OR THE
35
- ABSENCE OF LATENT OR OTHER DEFECTS OR ERRORS, WHETHER OR NOT DISCOVERABLE, ALL
36
- TO THE GREATEST EXTENT PERMISSIBLE UNDER APPLICABLE LAW.
37
-
38
- YOU ARE SOLELY RESPONSIBLE FOR (1) CLEARING RIGHTS OF OTHER PERSONS THAT MAY
39
- APPLY TO THE MODEL MATERIALS OR ANY USE THEREOF, INCLUDING WITHOUT LIMITATION
40
- ANY PERSON'S COPYRIGHTS OR OTHER RIGHTS INCLUDED OR EMBODIED IN THE MODEL
41
- MATERIALS; (2) OBTAINING ANY NECESSARY CONSENTS, PERMISSIONS OR OTHER RIGHTS
42
- REQUIRED FOR ANY USE OF THE MODEL MATERIALS; OR (3) PERFORMING ANY DUE
43
- DILIGENCE OR UNDERTAKING ANY OTHER INVESTIGATIONS INTO THE MODEL MATERIALS OR
44
- ANYTHING INCORPORATED OR EMBODIED THEREIN.
45
-
46
- IN NO EVENT SHALL THE PROVIDERS OF THE MODEL MATERIALS BE LIABLE FOR ANY CLAIM,
47
- DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
48
- OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE MODEL MATERIALS, THE
49
- USE THEREOF OR OTHER DEALINGS THEREIN.
 
README.md CHANGED
@@ -1,5 +1,5 @@
1
  ---
2
- license: other
3
  language:
4
  - en
5
  - es
@@ -22,8 +22,6 @@ tags:
22
  - agentic
23
  - tool-calling
24
  - thinking
25
- license_link: LICENSE
26
- license_name: openmdw-1.1
27
  ---
28
  <!-- markdownlint-disable first-line-h1 -->
29
  <!-- markdownlint-disable html -->
@@ -86,7 +84,6 @@ Trinity-Large-Thinking shares the same sparse MoE architecture as Trinity-Large-
86
  | Architecture | Sparse MoE (AfmoeForCausalLM) |
87
 
88
  ## Benchmarks
89
- ![Benchmark charts](https://huggingface.co/arcee-ai/Trinity-Large-Thinking/resolve/main/All_charts.jpg)
90
 
91
  | Benchmark | Trinity-Large-Thinking | Opus-4.6 | GLM-5 | MiniMax-M2.7 | Kimi-K2.5 |
92
  |---|---:|---:|---:|---:|---:|
@@ -109,37 +106,29 @@ Trinity-Large-Thinking produces reasoning traces inside `<think>...</think>` blo
109
  This means:
110
 
111
  1. **Multi-turn conversations**: When building chat applications, include the full assistant response (thinking + answer) in the conversation history for subsequent turns.
112
- 2. **Agentic loops**: When using Trinity-Large-Thinking as the backbone of an agent (OpenClaw, Hermes Agent, or custom), ensure your tool-calling loop preserves reasoning in the message history between steps.
113
  3. **Context window management**: The 512k extended context window accommodates long reasoning chains across many agentic steps. If you must truncate history, prefer removing older turns entirely rather than stripping thinking tokens from recent turns.
114
 
115
  ### How thinking works
116
 
117
- The model reasons internally before producing its response. When served via vLLM, the reasoning is separated into a dedicated field in the API response:
118
-
119
- ```json
120
- // API response structure
121
- {
122
- "message": {
123
- "role": "assistant",
124
- "reasoning_content": "The user wants flight information. I need to determine the date for next Tuesday, search for flights SFO → JFK, and filter by price < $300.",
125
- "content": "\n",
126
- "tool_calls": [{
127
- "function": {
128
- "name": "search_flights",
129
- "arguments": "{\"origin\": \"SFO\", \"destination\": \"JFK\", \"date\": \"2026-04-07\", \"max_price\": 300}"
 
130
  }
131
- }]
132
- }
133
- }
134
- ```
135
-
136
- ### Preserving reasoning in multi-turn conversations
137
-
138
- When building multi-turn agentic loops, you **must** pass `reasoning_content` back on assistant messages in subsequent requests. The chat template reads this field and re-wraps it in `<think>...</think>` tags during tokenization, maintaining the model's chain-of-thought across turns.
139
 
140
- **What happens if reasoning is omitted entirely?** The model can lose prior chain-of-thought context. On simple tasks this may work fine, but on complex multi-step agentic tasks, the model can produce malformed tool calls (e.g., tool call XML appearing inside the reasoning field instead of as structured `tool_calls`). For best results, always preserve `reasoning_content` and use `""` instead of `null` for content on tool-call turns.
141
-
142
- For implementation details, pitfalls (`reasoning` vs `reasoning_content`), and Python/TypeScript examples, see [Reasoning Traces](https://docs.arcee.ai/capabilities/reasoning-traces).
143
 
144
  ## Training Configuration
145
 
@@ -171,21 +160,18 @@ For implementation details, pitfalls (`reasoning` vs `reasoning_content`), and P
171
 
172
  Supported in vLLM 0.11.1+. For agentic use with both reasoning and tool calling:
173
 
174
- ```bash
175
- vllm serve arcee-ai/Trinity-Large-Thinking \
176
- --dtype bfloat16 \
177
- --reasoning-parser deepseek_r1 \
178
- --enable-auto-tool-choice \
179
- --tool-call-parser qwen3_coder
180
- ```
181
- **Recommended inference settings**: `temperature=0.45–0.6`, `top_p=0.95`, `top_k=50`
182
 
183
  This configuration:
184
  - `--reasoning-parser deepseek_r1` — Parses `<think>...</think>` reasoning blocks and exposes them via the `reasoning_content` field in the API response
185
  - `--tool-call-parser qwen3_coder` — Parses structured tool calls from the model output into the OpenAI-compatible `tool_calls` array
186
 
187
-
188
- #### Single-turn example
189
 
190
  ```python
191
  from openai import OpenAI
@@ -197,18 +183,22 @@ response = client.chat.completions.create(
197
  messages=[
198
  {"role": "user", "content": "What's the weather like in Paris?"}
199
  ],
200
- tools=[{
201
- "type": "function",
202
- "function": {
203
- "name": "get_weather",
204
- "description": "Get current weather for a location",
205
- "parameters": {
206
- "type": "object",
207
- "properties": {"location": {"type": "string"}},
208
- "required": ["location"]
 
 
 
 
209
  }
210
  }
211
- }],
212
  )
213
 
214
  # Access reasoning (thinking) content
@@ -219,87 +209,7 @@ content = response.choices[0].message.content
219
  tool_calls = response.choices[0].message.tool_calls
220
  ```
221
 
222
- #### Multi-turn agentic loop example
223
-
224
- The key pattern: after each turn, append the **full** assistant response (including reasoning) back to the message history, then append tool results, and send the updated history for the next turn.
225
-
226
- ```python
227
- import json
228
- from openai import OpenAI
229
-
230
- client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")
231
- MODEL = "arcee-ai/Trinity-Large-Thinking"
232
-
233
- tools = [
234
- {"type": "function", "function": {
235
- "name": "get_customer_by_email",
236
- "description": "Look up a customer by email.",
237
- "parameters": {"type": "object", "properties": {"email": {"type": "string"}}, "required": ["email"]}
238
- }},
239
- {"type": "function", "function": {
240
- "name": "cancel_subscription",
241
- "description": "Cancel a subscription. Requires customer_id.",
242
- "parameters": {"type": "object", "properties": {"customer_id": {"type": "string"}, "reason": {"type": "string"}}, "required": ["customer_id"]}
243
- }}
244
- ]
245
-
246
- def execute_tool(name, arguments):
247
- """Simulate tool execution — replace with real implementations."""
248
- args = json.loads(arguments)
249
- if name == "get_customer_by_email":
250
- return json.dumps({"customer_id": "C2001", "name": "Jane Doe", "plan": "Premium", "status": "active"})
251
- elif name == "cancel_subscription":
252
- return json.dumps({"success": True, "message": f"Subscription cancelled for {args['customer_id']}"})
253
-
254
- messages = [
255
- {"role": "system", "content": "You are a helpful customer service agent."},
256
- {"role": "user", "content": "I want to cancel my subscription. My email is jane@example.com"}
257
- ]
258
-
259
- # Agent loop
260
- while True:
261
- response = client.chat.completions.create(
262
- model=MODEL, messages=messages, tools=tools,
263
- tool_choice="auto", temperature=0, max_tokens=1000
264
- )
265
- msg = response.choices[0].message
266
-
267
- # Build assistant message — PRESERVE reasoning_content
268
- assistant_msg = {"role": "assistant", "content": msg.content}
269
- if msg.reasoning_content:
270
- assistant_msg["reasoning_content"] = msg.reasoning_content # ← critical for multi-turn
271
- if msg.tool_calls:
272
- assistant_msg["tool_calls"] = [
273
- {"id": tc.id, "type": "function", "function": {"name": tc.function.name, "arguments": tc.function.arguments}}
274
- for tc in msg.tool_calls
275
- ]
276
- messages.append(assistant_msg)
277
-
278
- # If no tool calls, model gave its final response — done
279
- if not msg.tool_calls:
280
- print(f"Final response: {msg.content}")
281
- break
282
-
283
- # Execute tool calls and append results
284
- for tc in msg.tool_calls:
285
- result = execute_tool(tc.function.name, tc.function.arguments)
286
- print(f" Tool: {tc.function.name}({tc.function.arguments}) → {result}")
287
- messages.append({"role": "tool", "tool_call_id": tc.id, "content": result})
288
- ```
289
-
290
- Expected output:
291
- ```
292
- Tool: get_customer_by_email({"email": "jane@example.com"}) → {"customer_id": "C2001", ...}
293
- Tool: cancel_subscription({"customer_id": "C2001", ...}) → {"success": true, ...}
294
- Final response: Your subscription has been cancelled successfully.
295
- ```
296
-
297
- The critical line is:
298
- ```python
299
- assistant_msg["reasoning_content"] = msg.reasoning_content # ← pass reasoning_content back
300
- ```
301
-
302
- The chat template re-wraps it in `<think>...</think>` tags automatically. See [Reasoning Traces](https://docs.arcee.ai/capabilities/reasoning-traces) for full details.
303
 
304
  ### Transformers
305
 
@@ -343,32 +253,20 @@ print(response)
343
 
344
  ### API
345
 
346
- #### OpenRouter
347
-
348
- Available on [OpenRouter](https://openrouter.ai/) with full reasoning and tool calling support:
349
-
350
- ```bash
351
- curl -X POST "https://openrouter.ai/v1/chat/completions" \
352
- -H "Authorization: Bearer $OPENROUTER_API_KEY" \
353
- -H "Content-Type: application/json" \
354
- -d '{
355
- "model": "arcee-ai/trinity-large-thinking",
356
- "messages": [
357
- {
358
- "role": "user",
359
- "content": "What are some fun things to do in New York?"
360
- }
361
- ]
362
- }'
363
- ```
364
-
365
- **Multi-turn with OpenRouter**: OpenRouter returns reasoning in a `reasoning_details` object (their unified reasoning shape). For multi-turn conversations, pass `reasoning_details` back as-is on assistant messages in subsequent requests — OpenRouter handles model-specific upstream translation (for Trinity, this is sent as `reasoning_content` on assistant turns upstream). For debugging, enable echo to inspect the upstream API call:
366
-
367
- ```json
368
- {"debug": {"echo_upstream_body": true}}
369
- ```
370
-
371
- See [OpenRouter debugging docs](https://openrouter.ai/docs/api/reference/errors-and-debugging#debugging) for details.
372
 
373
  ## Agentic Use Cases
374
 
@@ -378,8 +276,6 @@ Trinity-Large-Thinking is optimized for deployment as the reasoning backbone of
378
 
379
  Trinity-Large-Thinking works as a drop-in brain for OpenClaw agents. Its native tool-calling format is compatible with OpenClaw's execution loop, and the extended reasoning enables reliable multi-step task completion — from email triage to code generation to meeting scheduling. Our 91.9% PinchBench score reflects real-world OpenClaw task performance.
380
 
381
- **Deploying for OpenClaw users**: OpenClaw preserves full assistant turns across steps. Ensure `reasoning_content` is forwarded on assistant messages in subsequent turns, and keep `content` non-null (empty string `""` is fine on tool-call turns). See [Reasoning Traces](https://docs.arcee.ai/capabilities/reasoning-traces) for full integration details.
382
-
383
  ### Hermes Agent
384
 
385
  Compatible with the Hermes Agent framework from Nous Research. Trinity-Large-Thinking's reasoning traces pair naturally with Hermes's skill-learning loop — the model's explicit chain-of-thought makes skill extraction more reliable, and its strong tool-calling capabilities integrate directly via the Hermes tool-use protocol.
@@ -389,31 +285,27 @@ Compatible with the Hermes Agent framework from Nous Research. Trinity-Large-Thi
389
  For custom implementations, the key integration pattern is:
390
 
391
  1. Send the user message with tool definitions
392
- 2. Receive the response with `reasoning_content` + `content` + `tool_calls`
393
  3. Execute the tool calls
394
- 4. Append the **full** assistant response (reasoning_content + content + tool calls) and tool results to the message history
395
  5. Send the updated history back for the next step
396
  6. Repeat until the model produces a final response without tool calls
397
 
398
- > **Important**: Step 4 must include `reasoning_content` on the assistant message. The chat template reads this field and re-wraps it in `<think>...</think>` tags during tokenization. Omitting it degrades multi-step performance — see [Reasoning Traces](https://docs.arcee.ai/capabilities/reasoning-traces) for full details.
399
-
400
  ## License
401
 
402
- Trinity-Large-Thinking is released under the OpenMDW License, version 1.1 (OpenMDW-1.1).
403
 
404
  ## Citation
405
 
406
  If you use this model, please cite:
407
 
408
- ```bibtex
409
- @misc{singh2026arceetrinity,
410
- title = {Arcee Trinity Large Technical Report},
411
- author = {Varun Singh and Lucas Krauss and Sami Jaghouar and Matej Sirovatka and Charles Goddard and Fares Obied and Jack Min Ong and Jannik Straube and Fern and Aria Harley and Conner Stewart and Colin Kealty and Maziyar Panahi and Simon Kirsten and Anushka Deshpande and Anneketh Vij and Arthur Bresnu and Pranav Veldurthi and Raghav Ravishankar and Hardik Bishnoi and DatologyAI Team and Arcee AI Team and Prime Intellect Team and Mark McQuade and Johannes Hagemann and Lucas Atkins},
412
- year = {2026},
413
- eprint = {2602.17004},
414
- archivePrefix= {arXiv},
415
- primaryClass = {cs.LG},
416
- doi = {10.48550/arXiv.2602.17004},
417
- url = {https://arxiv.org/abs/2602.17004}
418
- }
419
- ```
 
1
  ---
2
+ license: apache-2.0
3
  language:
4
  - en
5
  - es
 
22
  - agentic
23
  - tool-calling
24
  - thinking
 
 
25
  ---
26
  <!-- markdownlint-disable first-line-h1 -->
27
  <!-- markdownlint-disable html -->
 
84
  | Architecture | Sparse MoE (AfmoeForCausalLM) |
85
 
86
  ## Benchmarks
 
87
 
88
  | Benchmark | Trinity-Large-Thinking | Opus-4.6 | GLM-5 | MiniMax-M2.7 | Kimi-K2.5 |
89
  |---|---:|---:|---:|---:|---:|
 
106
  This means:
107
 
108
  1. **Multi-turn conversations**: When building chat applications, include the full assistant response (thinking + answer) in the conversation history for subsequent turns.
109
+ 2. **Agentic loops**: When using Trinity-Large-Thinking as the backbone of an agent (OpenClaw, Hermes Agent, or custom), ensure your tool-calling loop preserves `<think>` blocks in the message history between steps.
110
  3. **Context window management**: The 512k extended context window accommodates long reasoning chains across many agentic steps. If you must truncate history, prefer removing older turns entirely rather than stripping thinking tokens from recent turns.
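The truncation advice in item 3 could be sketched roughly as follows. This is a minimal illustration, not part of the model card: `truncate_history` and `max_messages` are hypothetical names, the budget counts messages where a real one would likely count tokens, and the history is assumed to be a list of OpenAI-style message dicts.

```python
def truncate_history(messages, max_messages):
    """Drop the oldest turns whole until the history fits the budget.

    Hypothetical helper: a turn starts at each user message and runs until
    the next one, so assistant reasoning and tool results stay together.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    # Group the remaining messages into turns that each begin with a user message.
    turns = []
    for m in rest:
        if m["role"] == "user" or not turns:
            turns.append([])
        turns[-1].append(m)

    # Remove whole turns from the front; messages inside kept turns are never edited.
    while len(turns) > 1 and len(system) + sum(len(t) for t in turns) > max_messages:
        turns.pop(0)

    return system + [m for t in turns for m in t]
```

Because whole turns are removed, the thinking attached to the most recent turns is left untouched, which is the property the list above asks for.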
111
 
112
  ### How thinking works
113
 
114
+ The model reasons internally before producing its response. When served via vLLM, the reasoning is separated into a dedicated `reasoning_content` field in the API response:
115
+
116
+ // API response structure
117
+ {
118
+ "message": {
119
+ "role": "assistant",
120
+ "reasoning_content": "The user wants flight information. I need to determine the date for next Tuesday, search for flights SFO → JFK, and filter by price < $300.",
121
+ "content": "\n",
122
+ "tool_calls": [{
123
+ "function": {
124
+ "name": "search_flights",
125
+ "arguments": "{\"origin\": \"SFO\", \"destination\": \"JFK\", \"date\": \"2026-04-07\", \"max_price\": 300}"
126
+ }
127
+ }]
128
  }
129
+ }
 
130
 
131
+ When building multi-turn agentic loops, include the `reasoning_content` back in the conversation history (re-wrapped in `<think>...</think>` tags within the assistant message) so the model retains its prior reasoning chain.
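One way to do the re-wrapping described here is sketched below. It is an illustration under assumptions, not a reference implementation: `rewrap_reasoning` is a hypothetical helper, and it assumes the serving stack expects the reasoning inline in the assistant `content` string, with no extra whitespace between the tags and the answer.

```python
def rewrap_reasoning(reasoning_content, content, tool_calls=None):
    """Build an assistant history message with reasoning wrapped in <think> tags."""
    text = content or ""
    if reasoning_content:
        # Put the prior reasoning back in front of the visible answer.
        text = f"<think>{reasoning_content}</think>{text}"
    message = {"role": "assistant", "content": text}
    if tool_calls:
        # Keep tool calls on the same assistant message they came from.
        message["tool_calls"] = tool_calls
    return message
```

The returned dict can be appended to the message list before the tool results for that turn.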
 
 
132
 
133
  ## Training Configuration
134
 
 
160
 
161
  Supported in vLLM 0.11.1+. For agentic use with both reasoning and tool calling:
162
 
163
+ vllm serve arcee-ai/Trinity-Large-Thinking \
164
+ --dtype bfloat16 \
165
+ --enable-reasoning \
166
+ --reasoning-parser deepseek_r1 \
167
+ --enable-auto-tool-choice \
168
+ --tool-call-parser qwen3_coder
 
 
169
 
170
  This configuration:
171
  - `--reasoning-parser deepseek_r1` — Parses `<think>...</think>` reasoning blocks and exposes them via the `reasoning_content` field in the API response
172
  - `--tool-call-parser qwen3_coder` — Parses structured tool calls from the model output into the OpenAI-compatible `tool_calls` array
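Conceptually, a reasoning parser splits the raw generation at the closing `</think>` tag. The sketch below only illustrates that idea; it is not vLLM's `deepseek_r1` parser, and `split_reasoning` is a hypothetical name. It assumes the opening `<think>` may be missing from the generated text, since the chat template already emits it as part of the generation prompt.

```python
def split_reasoning(raw_output):
    """Return (reasoning_content, content) from raw model text."""
    text = raw_output
    if text.startswith("<think>"):
        text = text[len("<think>"):]
    reasoning, sep, answer = text.partition("</think>")
    if not sep:
        # No closing tag found: treat everything as the answer (a simplifying assumption).
        return None, raw_output
    return reasoning.strip(), answer.strip()
```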
173
 
174
+ **Extracting reasoning content from the API response:**
 
175
 
176
  ```python
177
  from openai import OpenAI
 
183
  messages=[
184
  {"role": "user", "content": "What's the weather like in Paris?"}
185
  ],
186
+ tools=[ # your tool definitions here
187
+ {
188
+ "type": "function",
189
+ "function": {
190
+ "name": "get_weather",
191
+ "description": "Get current weather for a location",
192
+ "parameters": {
193
+ "type": "object",
194
+ "properties": {
195
+ "location": {"type": "string"}
196
+ },
197
+ "required": ["location"]
198
+ }
199
  }
200
  }
201
+ ],
202
  )
203
 
204
  # Access reasoning (thinking) content
 
209
  tool_calls = response.choices[0].message.tool_calls
210
  ```
211
 
212
+ **Note on thinking-in-context with vLLM**: When building multi-turn agentic loops, include both `reasoning_content` and `content` in the conversation history you send back to the model. The reasoning content should be re-wrapped in `<think>...</think>` tags within the assistant message.
 
213
 
214
  ### Transformers
215
 
 
253
 
254
  ### API
255
 
256
+ Available on OpenRouter:
257
+
258
+ curl -X POST "https://openrouter.ai/v1/chat/completions" \
259
+ -H "Authorization: Bearer $OPENROUTER_API_KEY" \
260
+ -H "Content-Type: application/json" \
261
+ -d '{
262
+ "model": "arcee-ai/trinity-large-thinking",
263
+ "messages": [
264
+ {
265
+ "role": "user",
266
+ "content": "What are some fun things to do in New York?"
267
+ }
268
+ ]
269
+ }'
 
270
 
271
  ## Agentic Use Cases
272
 
 
276
 
277
  Trinity-Large-Thinking works as a drop-in brain for OpenClaw agents. Its native tool-calling format is compatible with OpenClaw's execution loop, and the extended reasoning enables reliable multi-step task completion — from email triage to code generation to meeting scheduling. Our 91.9% PinchBench score reflects real-world OpenClaw task performance.
278
 
 
 
279
  ### Hermes Agent
280
 
281
  Compatible with the Hermes Agent framework from Nous Research. Trinity-Large-Thinking's reasoning traces pair naturally with Hermes's skill-learning loop — the model's explicit chain-of-thought makes skill extraction more reliable, and its strong tool-calling capabilities integrate directly via the Hermes tool-use protocol.
 
285
  For custom implementations, the key integration pattern is:
286
 
287
  1. Send the user message with tool definitions
288
+ 2. Receive the response with `<think>` reasoning + tool calls
289
  3. Execute the tool calls
290
+ 4. Append the **full** assistant response (thinking + content + tool calls) and tool results to the message history
291
  5. Send the updated history back for the next step
292
  6. Repeat until the model produces a final response without tool calls
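The six steps above can be sketched as a small loop. This is a minimal sketch, not a definitive implementation: `run_agent`, `call_model`, and `execute_tool` are hypothetical names, and `call_model` stands in for whatever client returns an assistant message dict with `content`, optional `reasoning_content`, and optional `tool_calls`. How the reasoning is carried (a separate field, as here, or inline `<think>` tags) depends on the serving stack.

```python
def run_agent(call_model, execute_tool, messages, max_steps=10):
    """Loop until the model answers without tool calls or max_steps is reached."""
    for _ in range(max_steps):
        reply = call_model(messages)  # steps 1-2: send history, receive response

        # Step 4: keep the full assistant turn, including its reasoning.
        assistant = {"role": "assistant", "content": reply.get("content") or ""}
        if reply.get("reasoning_content"):
            assistant["reasoning_content"] = reply["reasoning_content"]
        tool_calls = reply.get("tool_calls") or []
        if tool_calls:
            assistant["tool_calls"] = tool_calls
        messages.append(assistant)

        # Step 6: a reply without tool calls is the final answer.
        if not tool_calls:
            return assistant["content"]

        # Step 3: run each tool and append its result for the next request (step 5).
        for call in tool_calls:
            fn = call["function"]
            result = execute_tool(fn["name"], fn["arguments"])
            messages.append({"role": "tool", "tool_call_id": call["id"], "content": result})
    return None
```

The `max_steps` cap is an added safeguard against a loop that never terminates; the steps themselves do not mention one.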
293
 
 
 
294
  ## License
295
 
296
+ Trinity-Large-Thinking is released under the Apache License, Version 2.0.
297
 
298
  ## Citation
299
 
300
  If you use this model, please cite:
301
 
302
+ @misc{singh2026arceetrinity,
303
+ title = {Arcee Trinity Large Technical Report},
304
+ author = {Varun Singh and Lucas Krauss and Sami Jaghouar and Matej Sirovatka and Charles Goddard and Fares Obied and Jack Min Ong and Jannik Straube and Fern and Aria Harley and Conner Stewart and Colin Kealty and Maziyar Panahi and Simon Kirsten and Anushka Deshpande and Anneketh Vij and Arthur Bresnu and Pranav Veldurthi and Raghav Ravishankar and Hardik Bishnoi and DatologyAI Team and Arcee AI Team and Prime Intellect Team and Mark McQuade and Johannes Hagemann and Lucas Atkins},
305
+ year = {2026},
306
+ eprint = {2602.17004},
307
+ archivePrefix= {arXiv},
308
+ primaryClass = {cs.LG},
309
+ doi = {10.48550/arXiv.2602.17004},
310
+ url = {https://arxiv.org/abs/2602.17004}
311
+ }
 
 
chat_template.jinja CHANGED
@@ -18,8 +18,7 @@
18
  {%- endif %}
19
  {{- '<tool_call>\n<function=' + (tool_call.name | default('') | string) + '>\n' }}
20
  {%- if tool_call.arguments is defined and tool_call.arguments is mapping %}
21
- {%- for args_name in tool_call.arguments %}
22
- {%- set args_value = tool_call.arguments[args_name] %}
23
  {{- '<parameter=' + (args_name | string) + '>\n' }}
24
  {%- if args_value is mapping or (args_value is sequence and args_value is not string) %}
25
  {{- args_value | tojson | safe }}
@@ -63,9 +62,8 @@
63
  {{- '\n<description>' ~ (tool.description | string | trim) ~ '</description>' }}
64
  {%- endif %}
65
  {{- '\n<parameters>' }}
66
- {%- if tool.parameters is defined and tool.parameters is mapping and 'properties' in tool.parameters and tool.parameters['properties'] is mapping %}
67
- {%- for param_name in tool.parameters['properties'] %}
68
- {%- set param_fields = tool.parameters['properties'][param_name] %}
69
  {{- '\n<parameter>\n<name>' ~ (param_name | string) ~ '</name>' }}
70
  {%- if param_fields is mapping and param_fields.type is defined and param_fields.type is not none %}
71
  {{- '\n<type>' ~ (param_fields.type | string) ~ '</type>' }}
@@ -158,4 +156,4 @@
158
 
159
  {%- if add_generation_prompt %}
160
  {{- '<|im_start|>assistant\n<think>' }}
161
- {%- endif %}
 
18
  {%- endif %}
19
  {{- '<tool_call>\n<function=' + (tool_call.name | default('') | string) + '>\n' }}
20
  {%- if tool_call.arguments is defined and tool_call.arguments is mapping %}
21
+ {%- for args_name, args_value in tool_call.arguments.items() %}
 
22
  {{- '<parameter=' + (args_name | string) + '>\n' }}
23
  {%- if args_value is mapping or (args_value is sequence and args_value is not string) %}
24
  {{- args_value | tojson | safe }}
 
62
  {{- '\n<description>' ~ (tool.description | string | trim) ~ '</description>' }}
63
  {%- endif %}
64
  {{- '\n<parameters>' }}
65
+ {%- if tool.parameters is defined and tool.parameters is mapping and tool.parameters.properties is defined and tool.parameters.properties is mapping %}
66
+ {%- for param_name, param_fields in tool.parameters.properties.items() %}
 
67
  {{- '\n<parameter>\n<name>' ~ (param_name | string) ~ '</name>' }}
68
  {%- if param_fields is mapping and param_fields.type is defined and param_fields.type is not none %}
69
  {{- '\n<type>' ~ (param_fields.type | string) ~ '</type>' }}
 
156
 
157
  {%- if add_generation_prompt %}
158
  {{- '<|im_start|>assistant\n<think>' }}
159
+ {%- endif %}