sedrickkeh committed
Commit 885be8e · verified · 1 Parent(s): 4098f4d

Update README.md

Files changed (1): README.md (+54 -18)

README.md CHANGED
@@ -7,55 +7,91 @@ tags:
--- a/README.md (version before this commit)
  - full
  - generated_from_trainer
  model-index:
- - name: hero_run_2_fix_conversations
    results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # hero_run_2_fix_conversations

- This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) on the mlfoundations-dev/hero_run_2_fix_conversations dataset.

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 8e-05
- - train_batch_size: 1
- - eval_batch_size: 8
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 256
  - gradient_accumulation_steps: 2
  - total_train_batch_size: 512
- - total_eval_batch_size: 2048
  - optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 5.0

- ### Training results
-
-
  ### Framework versions

  - Transformers 4.46.1
  - Pytorch 2.3.0
  - Datasets 3.1.0
  - Tokenizers 0.20.3

+++ b/README.md (version after this commit)
  - full
  - generated_from_trainer
  model-index:
+ - name: OpenThinker2-7B
    results: []
+ datasets:
+ - open-thoughts/OpenThoughts2-1M
  ---

+ <p align="center">
+ <img src="https://huggingface.co/datasets/open-thoughts/open-thoughts-114k/resolve/main/open_thoughts.png" width="50%">
+ </p>

+ # OpenThinker2-7B

+ This model is a fine-tuned version of [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) on the
+ [OpenThoughts2-1M](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M) dataset.

+ The [OpenThinker2-7B](https://huggingface.co/open-thoughts/OpenThinker2-7B) model delivers performance comparable to state-of-the-art 7B models such as DeepSeek-R1-Distill-7B, outperforming it on GPQA-D and LCBv2 while posting comparable scores on AIME25, AMC23, and MATH500.
+ This model improves upon our previous [OpenThinker-7B](https://huggingface.co/open-thoughts/OpenThinker-7B) model, which was trained on 114k examples from [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/open-thoughts-114k).
+ The numbers reported in the table below were computed with our open-source evaluation tool [Evalchemy](https://github.com/mlfoundations/Evalchemy).

+ | Model | Open Data? | Avg | AIME24 | AIME25 | AMC23 | MATH500 | GPQA-D | LCBv2 |
+ | ---------------- | ----- | ---- | ------ | ------ | ----- | ------- | ------ | ----- |
+ | OpenThinker-7B | ✅ | 48.9 | 31.3 | 23.3 | 74.5 | 83.2 | 42.9 | 38.0 |
+ | OpenThinker2-7B | ✅ | 61.0 | 50.0 | 33.3 | 89.5 | 88.4 | 49.3 | 55.6 |
+ | R1-Distill-7B | ❌ | 61.3 | 57.3 | 33.3 | 92.0 | 89.6 | 47.3 | 48.4 |
+ | OlympicCoder-7B | ✅ | 42.4 | 20.7 | 15.3 | 63.0 | 74.8 | 25.3 | 55.4 |
+ | OpenR1-7B | ✅ | 48.4 | 48.7 | 34.7 | 88.5 | 87.8 | 21.2 | 9.5 |
+ | Nemotron-Nano-8B | ⚠️ | 69.0 | 61.3 | 45.3 | 94.0 | 89.0 | 55.9 | 68.4 |
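The Avg column appears to be the unweighted mean of the six benchmark scores; the quick check below verifies this for the OpenThinker2-7B row.

```python
# Sanity-check the Avg column: mean of the six benchmark scores
# for the OpenThinker2-7B row in the table above.
scores = [50.0, 33.3, 89.5, 88.4, 49.3, 55.6]  # AIME24, AIME25, AMC23, MATH500, GPQA-D, LCBv2
print(round(sum(scores) / len(scores), 1))  # 61.0, matching the reported Avg
```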
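The card itself does not show how to run the model, so here is a minimal inference sketch using the standard 🤗 Transformers chat API that Qwen2.5-based checkpoints support; the prompt, dtype handling, and sampling settings are illustrative assumptions rather than recommendations from the card.

```python
# Minimal inference sketch for OpenThinker2-7B (assumes a recent
# transformers release, accelerate installed, and enough GPU memory for a 7B model).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "open-thoughts/OpenThinker2-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many positive divisors does 360 have?"}]
# Build the Qwen2.5-style chat prompt and append the assistant header.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling settings are illustrative; reasoning models often need a large token budget.
output = model.generate(input_ids, max_new_tokens=4096, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```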

+ ## Data

+ This model was trained on the [OpenThoughts2-1M](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M) dataset.

+ The OpenThoughts2-1M dataset was constructed by augmenting [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/open-thoughts-114k) with existing datasets such as [OpenR1](https://huggingface.co/open-r1), as well as additional math and code reasoning data.
+ We generated the additional math and code data by ablating various question-generation methodologies and sampling from the highest-performing ones.

+ See the [OpenThoughts2-1M](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M) dataset page or our [blog post]() for additional information.
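To get a feel for the training data, the sketch below streams a few records with the 🤗 Datasets library; streaming avoids downloading the full ~1M-example dataset, and the column names are deliberately not assumed, since they can differ between dataset revisions.

```python
# Stream a few examples from OpenThoughts2-1M without a full download.
from datasets import load_dataset

ds = load_dataset("open-thoughts/OpenThoughts2-1M", split="train", streaming=True)
for i, example in enumerate(ds):
    print(sorted(example.keys()))  # inspect the schema rather than assuming it
    if i == 2:
        break
```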

+ ## Intended uses & limitations

+ This model is released under the Apache 2.0 License.

  ## Training procedure

+ We used 32 nodes of 8xA100 GPUs (256 GPUs in total, matching num_devices below) to train the model for 36 hours.

  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 8e-05
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 256
  - gradient_accumulation_steps: 2
  - total_train_batch_size: 512
  - optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 5.0
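The hyperparameter list above maps almost one-to-one onto 🤗 `TrainingArguments`; the sketch below is a hedged reconstruction, not the exact training command. The per-device batch size of 1 is inferred from 256 devices x 2 accumulation steps = 512 total (and matches the `train_batch_size: 1` removed from the old card), while the `bf16` flag and output path are assumptions the card does not state.

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="openthinker2-7b-sft",  # hypothetical path, not from the card
    learning_rate=8e-5,
    per_device_train_batch_size=1,     # inferred: 256 GPUs * 1 * 2 accum = 512 total
    gradient_accumulation_steps=2,
    num_train_epochs=5.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    optim="adamw_torch",               # OptimizerNames.ADAMW_TORCH
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    bf16=True,                         # assumption; precision is not stated in the card
)

# Sanity check: effective batch = devices x per-device batch x accumulation steps.
assert 256 * args.per_device_train_batch_size * args.gradient_accumulation_steps == 512
```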

  ### Framework versions

  - Transformers 4.46.1
  - Pytorch 2.3.0
  - Datasets 3.1.0
  - Tokenizers 0.20.3
+
+ More info can be found in our repository: [https://github.com/open-thoughts/open-thoughts](https://github.com/open-thoughts/open-thoughts).
+
+ # Citation
+ ```
+ @misc{openthoughts,
+   author = {Team, OpenThoughts},
+   month = apr,
+   title = {{Open Thoughts}},
+   howpublished = {https://open-thoughts.ai},
+   year = {2025}
+ }
+ ```
+
+ # Links
+ - 📊 [OpenThoughts2 and OpenThinker2 Blog Post]()
+ - 💻 [Open Thoughts GitHub Repository](https://github.com/open-thoughts/open-thoughts)
+ - 🧠 [OpenThoughts2-1M dataset](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M)
+ - 🤖 [OpenThinker2-7B model](https://huggingface.co/open-thoughts/OpenThinker2-7B) - this model.
+ - 🤖 [OpenThinker2-32B model](https://huggingface.co/open-thoughts/OpenThinker2-32B)