EtashGuha commited on
Commit
65d8a62
Β·
verified Β·
1 Parent(s): 8bf17f1

Standardize card (v1-consistent)

Browse files
Files changed (1) hide show
  1. README.md +39 -23
README.md CHANGED
@@ -5,6 +5,9 @@ datasets:
5
  - open-thoughts/OpenThoughts-Agent-SFT-100K
6
  library_name: transformers
7
  license: apache-2.0
 
 
 
8
  pipeline_tag: text-generation
9
  tags:
10
  - agents
@@ -12,35 +15,38 @@ tags:
12
  - code
13
  - software-engineering
14
  ---
 
 
 
15
 
16
  <p align="center">
17
  <a href="https://www.openthoughts.ai/blog/agent" style="margin-right: 24px;">Project</a> |
18
  <a href="https://github.com/open-thoughts/OpenThoughts-Agent" style="margin-right: 24px; margin-left: 24px;">Code</a> |
19
- <a href="https://huggingface.co/datasets/open-thoughts/OpenThoughts-Agent-SFT-100K" style="margin-left: 24px;">Training data</a> |
20
  <a href="https://huggingface.co/collections/open-thoughts/openthinker-agent" style="margin-left: 24px;">Collection</a>
21
  </p>
22
 
 
23
  # OpenThinkerAgent-32B
24
 
25
- **OpenThoughts-Agent** is an open effort to curate the best data for training agentic
26
- language models. **OpenThinkerAgent-32B** is the 32B model from the SFT scaling ladder, fine-tuned
27
- from [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B) on the **100,000-example**
28
- [OpenThoughts-Agent-SFT-100K](https://huggingface.co/datasets/open-thoughts/OpenThoughts-Agent-SFT-100K)
29
- dataset (Top-4 task sources, GLM-4.7-AWQ teacher in the terminus-2 harness, β‰₯5-turn filter).
30
 
31
- ## Performance
32
 
33
- Evaluated in the **terminus-2** harness (pass@1, 3 stochastic re-runs):
 
34
 
35
- | Benchmark | Accuracy |
36
- | --- | --- |
37
- | SWE-Bench-Verified-100 | 55.7 |
38
- | OpenThoughts-TBLite | 41.3 |
39
- | Terminal-Bench 2.0 | 26.2 |
40
 
41
- ### Full benchmark suite (OpenThinkerAgent-32B, best harness)
42
 
43
- | Benchmark | Acc |
 
 
 
 
 
 
 
44
  | --- | --- |
45
  | SWE-Bench-Verified | 54.0 |
46
  | Terminal-Bench 2.0 | 26.2 |
@@ -51,19 +57,29 @@ Evaluated in the **terminus-2** harness (pass@1, 3 stochastic re-runs):
51
  | FinanceAgent-Terminal | 44.0 |
52
  | **Average (7)** | **44.8** |
53
 
54
- This is the best open-data 32B model on the average of seven agentic benchmarks.
 
 
 
 
 
 
 
 
 
 
55
 
56
- ## Links
57
- - 🌐 [Project](https://www.openthoughts.ai/blog/agent)
58
- - πŸ’» [Code](https://github.com/open-thoughts/OpenThoughts-Agent)
59
- - 🧠 [Training dataset](https://huggingface.co/datasets/open-thoughts/OpenThoughts-Agent-SFT-100K)
60
- - πŸ“š [Collection](https://huggingface.co/collections/open-thoughts/openthinker-agent)
61
 
62
- ## Citation
63
  ```
64
  @misc{openthoughts-agent,
65
  author = {Team, OpenThoughts-Agent},
66
- title = {OpenThoughts-Agent: Data Recipes for Agentic Models},
67
  howpublished = {https://www.openthoughts.ai/blog/agent},
68
  year = {2026}
69
  }
 
5
  - open-thoughts/OpenThoughts-Agent-SFT-100K
6
  library_name: transformers
7
  license: apache-2.0
8
+ model-index:
9
+ - name: OpenThinkerAgent-32B
10
+ results: []
11
  pipeline_tag: text-generation
12
  tags:
13
  - agents
 
15
  - code
16
  - software-engineering
17
  ---
18
+ <p align="center">
19
+ <img src="https://huggingface.co/datasets/open-thoughts/OpenThoughts1-Agent-SFT/resolve/main/ota-logo.png" width="50%">
20
+ </p>
21
 
22
  <p align="center">
23
  <a href="https://www.openthoughts.ai/blog/agent" style="margin-right: 24px;">Project</a> |
24
  <a href="https://github.com/open-thoughts/OpenThoughts-Agent" style="margin-right: 24px; margin-left: 24px;">Code</a> |
 
25
  <a href="https://huggingface.co/collections/open-thoughts/openthinker-agent" style="margin-left: 24px;">Collection</a>
26
  </p>
27
 
28
+
29
  # OpenThinkerAgent-32B
30
 
31
+ **OpenThoughts-Agent** is an open-source effort to curate the best datasets for training agents. Our release includes [datasets](https://huggingface.co/collections/open-thoughts/openthinker-agent), [models](https://huggingface.co/collections/open-thoughts/openthinker-agent) and our [research codebase](https://github.com/open-thoughts/OpenThoughts-Agent).
 
 
 
 
32
 
33
+ [OpenThinkerAgent-32B](https://huggingface.co/open-thoughts/OpenThinkerAgent-32B) is post-trained from [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B) with full-parameter SFT on the **100,000-example** [OpenThoughts-Agent-SFT-100K](https://huggingface.co/datasets/open-thoughts/OpenThoughts-Agent-SFT-100K) dataset (Top-4 task sources, GLM-4.7-AWQ teacher in the terminus-2 harness, β‰₯5-turn trace filter). It is the flagship **OpenThinkerAgent-32B**, the strongest open-data 32B model on the average of seven agentic benchmarks.
34
 
35
+ - **Homepage:** https://www.openthoughts.ai/blog/agent
36
+ - **Repository:** https://github.com/open-thoughts/OpenThoughts-Agent
37
 
38
+ # Performance
 
 
 
 
39
 
40
+ Evaluated in the **terminus-2** harness (pass@1, mean over 3 stochastic re-runs):
41
 
42
+ | Model | Harness | SWE-Bench-Verified-100 | OpenThoughts-TBLite | Terminal-Bench 2.0 |
43
+ | --- | --- | --- | --- | --- |
44
+ | [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B) | Terminus-2 | 26.7 | 13.7 | 7.5 |
45
+ | **OpenThinkerAgent-32B** | Terminus-2 | 55.7 | 41.3 | 26.2 |
46
+
47
+ Across the full seven-benchmark suite (best harness per benchmark), **OpenThinkerAgent-32B** is the strongest open-data model at the 32B scale:
48
+
49
+ | Benchmark | Accuracy |
50
  | --- | --- |
51
  | SWE-Bench-Verified | 54.0 |
52
  | Terminal-Bench 2.0 | 26.2 |
 
57
  | FinanceAgent-Terminal | 44.0 |
58
  | **Average (7)** | **44.8** |
59
 
60
+ # Data
61
+
62
+ The model is trained on [OpenThoughts-Agent-SFT-100K](https://huggingface.co/datasets/open-thoughts/OpenThoughts-Agent-SFT-100K): (task, agent-trajectory) pairs from the Top-4 task sources (SWE-Smith, StackExchange-SuperUser, StackExchange-Tezos with synthetic augmentation, IssueTasks). Trajectories are generated by GLM-4.7-AWQ in the terminus-2 harness and filtered to traces with at least 5 model turns.
63
+
64
+ ### Training hyperparameters
65
+ - learning_rate: 4e-05
66
+ - lr_scheduler_type: cosine, warmup_ratio 0.1
67
+ - global_batch_size: 96
68
+ - num_epochs: 5
69
+ - cutoff_len: 32768
70
+ - precision: bf16, DeepSpeed ZeRO-3
71
 
72
+ # Links
73
+ - 🌐 [OpenThoughts-Agent project page](https://www.openthoughts.ai/blog/agent)
74
+ - πŸ’» [OpenThoughts-Agent GitHub repository](https://github.com/open-thoughts/OpenThoughts-Agent)
75
+ - πŸ“š [OpenThinker-Agent collection](https://huggingface.co/collections/open-thoughts/openthinker-agent)
76
+ - 🧠 [Training dataset: OpenThoughts-Agent-SFT-100K](https://huggingface.co/datasets/open-thoughts/OpenThoughts-Agent-SFT-100K)
77
 
78
+ # Citation
79
  ```
80
  @misc{openthoughts-agent,
81
  author = {Team, OpenThoughts-Agent},
82
+ title = {{OpenThoughts-Agent: Data Recipes for Agentic Models}},
83
  howpublished = {https://www.openthoughts.ai/blog/agent},
84
  year = {2026}
85
  }