Update README.md
README.md
CHANGED
|
@@ -119,7 +119,7 @@ The model reasons internally before producing its response. When served via vLLM
|
|
| 119 |
{
|
| 120 |
"message": {
|
| 121 |
"role": "assistant",
|
| 122 |
-
"
|
| 123 |
"content": "\n",
|
| 124 |
"tool_calls": [{
|
| 125 |
"function": {
|
|
@@ -133,11 +133,11 @@ The model reasons internally before producing its response. When served via vLLM
|
|
| 133 |
|
| 134 |
### Preserving reasoning in multi-turn conversations
|
| 135 |
|
| 136 |
-
When building multi-turn agentic loops, you **must** pass
|
| 137 |
|
| 138 |
-
**
|
| 139 |
|
| 140 |
-
|
| 141 |
|
| 142 |
## Training Configuration
|
| 143 |
|
|
@@ -176,9 +176,10 @@ vllm serve arcee-ai/Trinity-Large-Thinking \
|
|
| 176 |
--enable-auto-tool-choice \
|
| 177 |
--tool-call-parser qwen3_coder
|
| 178 |
```
|
|
|
|
| 179 |
|
| 180 |
This configuration:
|
| 181 |
-
- `--reasoning-parser deepseek_r1`: Parses `<think>...</think>` reasoning blocks and exposes them via the `
|
| 182 |
- `--tool-call-parser qwen3_coder`: Parses structured tool calls from the model output into the OpenAI-compatible `tool_calls` array
|
| 183 |
|
| 184 |
|
|
@@ -261,10 +262,10 @@ while True:
|
|
| 261 |
)
|
| 262 |
msg = response.choices[0].message
|
| 263 |
|
| 264 |
-
# Build assistant message: PRESERVE
|
| 265 |
assistant_msg = {"role": "assistant", "content": msg.content}
|
| 266 |
if msg.reasoning_content:
|
| 267 |
-
assistant_msg["
|
| 268 |
if msg.tool_calls:
|
| 269 |
assistant_msg["tool_calls"] = [
|
| 270 |
{"id": tc.id, "type": "function", "function": {"name": tc.function.name, "arguments": tc.function.arguments}}
|
|
@@ -293,10 +294,10 @@ Expected output:
|
|
| 293 |
|
| 294 |
The critical line is:
|
| 295 |
```python
|
| 296 |
-
assistant_msg["
|
| 297 |
```
|
| 298 |
|
| 299 |
-
The
|
| 300 |
|
| 301 |
### Transformers
|
| 302 |
|
|
@@ -375,7 +376,7 @@ Trinity-Large-Thinking is optimized for deployment as the reasoning backbone of
|
|
| 375 |
|
| 376 |
Trinity-Large-Thinking works as a drop-in brain for OpenClaw agents. Its native tool-calling format is compatible with OpenClaw's execution loop, and the extended reasoning enables reliable multi-step task completion, from email triage to code generation to meeting scheduling. Our 91.9% PinchBench score reflects real-world OpenClaw task performance.
|
| 377 |
|
| 378 |
-
**Deploying for OpenClaw users**: OpenClaw preserves full assistant turns across steps.
|
| 379 |
|
| 380 |
### Hermes Agent
|
| 381 |
|
|
@@ -386,13 +387,13 @@ Compatible with the Hermes Agent framework from Nous Research. Trinity-Large-Thi
|
|
| 386 |
For custom implementations, the key integration pattern is:
|
| 387 |
|
| 388 |
1. Send the user message with tool definitions
|
| 389 |
-
2. Receive the response with `
|
| 390 |
3. Execute the tool calls
|
| 391 |
-
4. Append the **full** assistant response (
|
| 392 |
5. Send the updated history back for the next step
|
| 393 |
6. Repeat until the model produces a final response without tool calls
|
| 394 |
|
| 395 |
-
> **Important**: Step 4 must include
|
| 396 |
|
| 397 |
## License
|
| 398 |
|
|
|
|
| 119 |
{
|
| 120 |
"message": {
|
| 121 |
"role": "assistant",
|
| 122 |
+
"reasoning_content": "The user wants flight information. I need to determine the date for next Tuesday, search for flights SFO → JFK, and filter by price < $300.",
|
| 123 |
"content": "\n",
|
| 124 |
"tool_calls": [{
|
| 125 |
"function": {
|
|
|
|
| 133 |
|
| 134 |
### Preserving reasoning in multi-turn conversations
|
| 135 |
|
| 136 |
+
When building multi-turn agentic loops, you **must** pass `reasoning_content` back on assistant messages in subsequent requests. The chat template reads this field and re-wraps it in `<think>...</think>` tags during tokenization, maintaining the model's chain-of-thought across turns.
|
| 137 |
|
| 138 |
+
**What happens if reasoning is omitted entirely?** The model can lose its prior chain-of-thought context. On simple tasks this may work fine, but on complex multi-step agentic tasks the model can produce malformed tool calls (e.g., tool call XML appearing inside the reasoning field instead of as structured `tool_calls`). For best results, always preserve `reasoning_content` and set `content` to `""` rather than `null` on tool-call turns.
|
| 139 |
|
| 140 |
+
For implementation details, pitfalls (`reasoning` vs `reasoning_content`), and Python/TypeScript examples, see [Reasoning Traces](https://docs.arcee.ai/capabilities/reasoning-traces).
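
As a minimal sketch of the two recommendations above (keep `reasoning_content`, and never send `null` content), the conversion from a response message to a history entry could look like the following. The helper name `to_history_message` and the stand-in message values are hypothetical and not part of the README; the `SimpleNamespace` object only imitates the shape of `response.choices[0].message`.

```python
from types import SimpleNamespace


def to_history_message(msg):
    """Convert a response message into a dict that is safe to send back."""
    # Use "" rather than None for content, as recommended for tool-call turns.
    out = {"role": "assistant", "content": msg.content or ""}
    reasoning = getattr(msg, "reasoning_content", None)
    if reasoning:
        # Keep the reasoning so the chat template can re-wrap it next turn.
        out["reasoning_content"] = reasoning
    if getattr(msg, "tool_calls", None):
        out["tool_calls"] = [
            {
                "id": tc.id,
                "type": "function",
                "function": {
                    "name": tc.function.name,
                    "arguments": tc.function.arguments,
                },
            }
            for tc in msg.tool_calls
        ]
    return out


# Stand-in for response.choices[0].message; all values here are made up.
example = SimpleNamespace(
    content=None,
    reasoning_content="Need the date for next Tuesday first.",
    tool_calls=[
        SimpleNamespace(
            id="call_0",
            function=SimpleNamespace(
                name="search_flights", arguments='{"origin": "SFO"}'
            ),
        )
    ],
)
history_msg = to_history_message(example)
```

The `msg.content or ""` expression is the only behavioural difference from copying the fields verbatim: it turns a `None` content into an empty string before the message re-enters the history.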
|
| 141 |
|
| 142 |
## Training Configuration
|
| 143 |
|
|
|
|
| 176 |
--enable-auto-tool-choice \
|
| 177 |
--tool-call-parser qwen3_coder
|
| 178 |
```
|
| 179 |
+
**Recommended inference settings**: `temperature=0.45-0.6`, `top_p=0.95`, `top_k=50`
|
| 180 |
|
| 181 |
This configuration:
|
| 182 |
+
- `--reasoning-parser deepseek_r1`: Parses `<think>...</think>` reasoning blocks and exposes them via the `reasoning_content` field in the API response
|
| 183 |
- `--tool-call-parser qwen3_coder`: Parses structured tool calls from the model output into the OpenAI-compatible `tool_calls` array
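
The recommended sampling settings can be supplied per request. The sketch below only assembles the keyword arguments and does not contact a server. `build_request` is a hypothetical helper, and it assumes the `openai` Python client is used against the vLLM endpoint: `top_k` is not a standard OpenAI parameter, so it is passed through the client's `extra_body` argument, which vLLM's OpenAI-compatible server accepts to the best of my understanding.

```python
def build_request(messages, temperature=0.6):
    """Assemble keyword arguments for client.chat.completions.create()."""
    return {
        "model": "arcee-ai/Trinity-Large-Thinking",
        "messages": messages,
        # Recommended range is 0.45-0.6; the default here is the upper end.
        "temperature": temperature,
        "top_p": 0.95,
        # Assumption: non-standard sampling fields go in extra_body so the
        # openai client forwards them unchanged to the vLLM server.
        "extra_body": {"top_k": 50},
    }


request = build_request([{"role": "user", "content": "Hello"}], temperature=0.5)
```

With a running server, the dictionary would be used as `client.chat.completions.create(**request)`.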
|
| 184 |
|
| 185 |
|
|
|
|
| 262 |
)
|
| 263 |
msg = response.choices[0].message
|
| 264 |
|
| 265 |
+
# Build assistant message: PRESERVE reasoning_content
|
| 266 |
assistant_msg = {"role": "assistant", "content": msg.content}
|
| 267 |
if msg.reasoning_content:
|
| 268 |
+
    assistant_msg["reasoning_content"] = msg.reasoning_content  # critical for multi-turn
|
| 269 |
if msg.tool_calls:
|
| 270 |
assistant_msg["tool_calls"] = [
|
| 271 |
{"id": tc.id, "type": "function", "function": {"name": tc.function.name, "arguments": tc.function.arguments}}
|
|
|
|
| 294 |
|
| 295 |
The critical line is:
|
| 296 |
```python
|
| 297 |
+
assistant_msg["reasoning_content"] = msg.reasoning_content  # pass reasoning_content back
|
| 298 |
```
|
| 299 |
|
| 300 |
+
The chat template re-wraps it in `<think>...</think>` tags automatically. See [Reasoning Traces](https://docs.arcee.ai/capabilities/reasoning-traces) for full details.
|
| 301 |
|
| 302 |
### Transformers
|
| 303 |
|
|
|
|
| 376 |
|
| 377 |
Trinity-Large-Thinking works as a drop-in brain for OpenClaw agents. Its native tool-calling format is compatible with OpenClaw's execution loop, and the extended reasoning enables reliable multi-step task completion, from email triage to code generation to meeting scheduling. Our 91.9% PinchBench score reflects real-world OpenClaw task performance.
|
| 378 |
|
| 379 |
+
**Deploying for OpenClaw users**: OpenClaw preserves full assistant turns across steps. Ensure `reasoning_content` is forwarded on assistant messages in subsequent turns, and keep `content` non-null (empty string `""` is fine on tool-call turns). See [Reasoning Traces](https://docs.arcee.ai/capabilities/reasoning-traces) for full integration details.
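
When an agent framework owns the message history, it can help to check the history before each request. The following is a sketch of such a check, not something OpenClaw provides: `check_history` is a hypothetical helper that flags assistant turns whose `content` is null and tool-call turns that carry no `reasoning_content`.

```python
def check_history(messages):
    """Return (index, problem) pairs for assistant turns that need attention."""
    problems = []
    for i, m in enumerate(messages):
        if m.get("role") != "assistant":
            continue
        if m.get("content") is None:
            problems.append((i, "content is null"))
        if m.get("tool_calls") and not m.get("reasoning_content"):
            problems.append((i, "tool-call turn without reasoning_content"))
    return problems


# Made-up history: the first assistant turn violates both recommendations.
history = [
    {"role": "user", "content": "Triage my inbox."},
    {"role": "assistant", "content": None, "tool_calls": [{"id": "call_0"}]},
    {
        "role": "assistant",
        "content": "",
        "reasoning_content": "Inbox fetched.",
        "tool_calls": [{"id": "call_1"}],
    },
]
problems = check_history(history)
```

Only tool-call turns are checked for reasoning here; that choice is an assumption of this sketch, made because a final answer may legitimately arrive without a reasoning field.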
|
| 380 |
|
| 381 |
### Hermes Agent
|
| 382 |
|
|
|
|
| 387 |
For custom implementations, the key integration pattern is:
|
| 388 |
|
| 389 |
1. Send the user message with tool definitions
|
| 390 |
+
2. Receive the response with `reasoning_content` + `content` + `tool_calls`
|
| 391 |
3. Execute the tool calls
|
| 392 |
+
4. Append the **full** assistant response (`reasoning_content` + `content` + `tool_calls`) and the tool results to the message history
|
| 393 |
5. Send the updated history back for the next step
|
| 394 |
6. Repeat until the model produces a final response without tool calls
|
| 395 |
|
| 396 |
+
> **Important**: Step 4 must include `reasoning_content` on the assistant message. The chat template reads this field and re-wraps it in `<think>...</think>` tags during tokenization. Omitting it degrades multi-step performance; see [Reasoning Traces](https://docs.arcee.ai/capabilities/reasoning-traces) for full details.
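
The six steps above can be sketched as a small loop. Everything here is illustrative: `run_agent`, `fake_chat`, and the `get_weather` tool are hypothetical names, and `chat` stands in for whatever function sends the history (with tool definitions attached) to the server and returns the assistant message as a dict. The stub lets the loop run without a model.

```python
import json


def run_agent(chat, tools, user_prompt, max_steps=8):
    """Drive the tool-calling loop; `chat(messages)` returns an assistant dict."""
    messages = [{"role": "user", "content": user_prompt}]  # step 1
    for _ in range(max_steps):
        reply = chat(messages)  # step 2
        assistant = {"role": "assistant", "content": reply.get("content") or ""}
        if reply.get("reasoning_content"):
            # Step 4 requirement: keep the reasoning on the assistant turn.
            assistant["reasoning_content"] = reply["reasoning_content"]
        calls = reply.get("tool_calls") or []
        if calls:
            assistant["tool_calls"] = calls
        messages.append(assistant)  # step 4 (assistant part)
        if not calls:
            return messages  # step 6: final response without tool calls
        for call in calls:  # step 3
            fn = tools[call["function"]["name"]]
            result = fn(**json.loads(call["function"]["arguments"]))
            messages.append(  # step 4 (tool results)
                {
                    "role": "tool",
                    "tool_call_id": call["id"],
                    "content": json.dumps(result),
                }
            )
        # step 5 happens on the next iteration, which resends `messages`
    raise RuntimeError("no final answer within max_steps")


def fake_chat(messages):
    """Stub model: one tool call, then a final answer. Values are made up."""
    if messages[-1]["role"] == "tool":
        return {"content": "It is 18 C in Paris.", "reasoning_content": "Tool returned 18."}
    return {
        "content": "",
        "reasoning_content": "I should look up the weather.",
        "tool_calls": [
            {
                "id": "call_0",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
            }
        ],
    }


history = run_agent(
    fake_chat,
    {"get_weather": lambda city: {"city": city, "temp_c": 18}},
    "Weather in Paris?",
)
```

The `max_steps` cap is a design choice of this sketch rather than something the README prescribes; it keeps a misbehaving model from looping indefinitely.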
|
| 397 |
|
| 398 |
## License
|
| 399 |
|