Improve language tag

#2
by lbourdois - opened
Files changed (1)
  1. README.md +237 -225
README.md CHANGED
@@ -1,225 +1,237 @@
- ---
- license: apache-2.0
- language:
- - en
- base_model:
- - Qwen/Qwen2.5-7B-Instruct
- pipeline_tag: text-generation
- library_name: transformers
- tags:
- - opus
- - code
- - cot
- - lcot
- - LlaMa
- model-index:
- - name: Taurus-Opus-7B
-   results:
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: IFEval (0-Shot)
-       type: wis-k/instruction-following-eval
-       split: train
-       args:
-         num_few_shot: 0
-     metrics:
-     - type: inst_level_strict_acc and prompt_level_strict_acc
-       value: 42.23
-       name: averaged accuracy
-     source:
-       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
-       name: Open LLM Leaderboard
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: BBH (3-Shot)
-       type: SaylorTwift/bbh
-       split: test
-       args:
-         num_few_shot: 3
-     metrics:
-     - type: acc_norm
-       value: 34.23
-       name: normalized accuracy
-     source:
-       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
-       name: Open LLM Leaderboard
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: MATH Lvl 5 (4-Shot)
-       type: lighteval/MATH-Hard
-       split: test
-       args:
-         num_few_shot: 4
-     metrics:
-     - type: exact_match
-       value: 22.73
-       name: exact match
-     source:
-       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
-       name: Open LLM Leaderboard
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: GPQA (0-shot)
-       type: Idavidrein/gpqa
-       split: train
-       args:
-         num_few_shot: 0
-     metrics:
-     - type: acc_norm
-       value: 10.18
-       name: acc_norm
-     source:
-       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
-       name: Open LLM Leaderboard
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: MuSR (0-shot)
-       type: TAUR-Lab/MuSR
-       args:
-         num_few_shot: 0
-     metrics:
-     - type: acc_norm
-       value: 14.22
-       name: acc_norm
-     source:
-       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
-       name: Open LLM Leaderboard
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       name: MMLU-PRO (5-shot)
-       type: TIGER-Lab/MMLU-Pro
-       config: main
-       split: test
-       args:
-         num_few_shot: 5
-     metrics:
-     - type: acc
-       value: 32.79
-       name: accuracy
-     source:
-       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
-       name: Open LLM Leaderboard
- ---
-
- # **Taurus-Opus-7B**
-
- Taurus-Opus-7B is built upon the Qwen2.5-7B-Instruct architecture, optimized to provide advanced reasoning capabilities while maintaining efficiency. With 7 billion parameters, it strikes a balance between performance and computational resource requirements. The model has been fine-tuned with a focus on chain-of-thought (CoT) reasoning, leveraging specialized datasets to enhance its problem-solving abilities. Taurus-Opus-7B is designed for tasks requiring logical reasoning, detailed explanations, and multi-step problem-solving, making it ideal for applications such as instruction-following, text generation, and coding assistance.
-
-
- # **Key Features and Improvements**
-
- 1. **Optimized Reasoning Capabilities**:
-    The model showcases significant improvements in context understanding, reasoning, and mathematical problem-solving through fine-tuning with long CoT datasets.
-
- 2. **Enhanced Instruction Following**:
-    Taurus-Opus-7B excels in generating long, coherent outputs (up to 4K tokens), understanding structured data, and producing structured outputs like JSON.
-
- 3. **Lightweight Efficiency**:
-    Its 7B parameter size makes it more resource-efficient than larger models while retaining high-quality performance for reasoning and content generation tasks.
-
- 4. **Long-Context Support**:
-    Offers support for long contexts of up to 64K tokens, enabling the handling of large datasets or extended conversations.
-
- 5. **Multilingual Proficiency**:
-    The model supports 20+ languages, including English, Spanish, French, German, Portuguese, Chinese, Japanese, and more, making it suitable for global applications.
-
- # **Quickstart with transformers**
-
- Here’s a code snippet to load **Taurus-Opus-7B** using the `transformers` library:
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model_name = "prithivMLmods/Taurus-Opus-7B"
-
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype="auto",
-     device_map="auto"
- )
- tokenizer = AutoTokenizer.from_pretrained(model_name)
-
- prompt = "Explain the importance of chain-of-thought reasoning in large language models."
- messages = [
-     {"role": "system", "content": "You are a helpful assistant with expertise in logical reasoning and problem-solving."},
-     {"role": "user", "content": prompt}
- ]
- text = tokenizer.apply_chat_template(
-     messages,
-     tokenize=False,
-     add_generation_prompt=True
- )
- model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-
- generated_ids = model.generate(
-     **model_inputs,
-     max_new_tokens=512
- )
- generated_ids = [
-     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
- ]
-
- response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
- ```
- # **Intended Use**
-
- 1. **Reasoning and Context Understanding**:
-    Taurus-Opus-7B is tailored for complex reasoning tasks, contextual understanding, and solving problems requiring logical deduction.
-
- 2. **Mathematical Problem-Solving**:
-    Designed for advanced mathematical reasoning and calculations, making it valuable for education, research, and engineering tasks.
-
- 3. **Code Assistance**:
-    Provides robust coding support, including writing, debugging, and optimizing code across multiple programming languages.
-
- 4. **Data Analysis**:
-    Excels in analyzing structured data and generating structured outputs, aiding automation workflows and data-driven insights.
-
- 5. **Multilingual Support**:
-    Facilitates applications such as multilingual chatbots, content generation, and translation in 20+ languages.
-
- 6. **Extended Content Generation**:
-    Suitable for generating detailed reports, articles, and instructional guides, handling outputs up to 4K tokens.
-
- # **Limitations**
-
- 1. **Hardware Requirements**:
-    While more efficient than larger models, Taurus-Opus-7B still requires high-memory GPUs or TPUs for optimal performance.
-
- 2. **Language Quality Variations**:
-    Output quality may vary across supported languages, especially for less commonly used languages.
-
- 3. **Creativity Limitations**:
-    The model may sometimes generate repetitive or inconsistent results in creative or highly subjective tasks.
-
- 4. **Real-Time Knowledge Constraints**:
-    The model lacks awareness of events or knowledge updates beyond its training data.
-
- 5. **Prompt Dependency**:
-    Results heavily depend on the specificity and clarity of input prompts, requiring well-structured queries for the best performance.
-
- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/prithivMLmods__Taurus-Opus-7B-details)!
- Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=prithivMLmods%2FTaurus-Opus-7B&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
-
- | Metric              | Value (%) |
- |---------------------|----------:|
- | **Average**         |     26.06 |
- | IFEval (0-Shot)     |     42.23 |
- | BBH (3-Shot)        |     34.23 |
- | MATH Lvl 5 (4-Shot) |     22.73 |
- | GPQA (0-shot)       |     10.18 |
- | MuSR (0-shot)       |     14.22 |
- | MMLU-PRO (5-shot)   |     32.79 |
-
+ ---
+ license: apache-2.0
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ base_model:
+ - Qwen/Qwen2.5-7B-Instruct
+ pipeline_tag: text-generation
+ library_name: transformers
+ tags:
+ - opus
+ - code
+ - cot
+ - lcot
+ - LlaMa
+ model-index:
+ - name: Taurus-Opus-7B
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: wis-k/instruction-following-eval
+       split: train
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 42.23
+       name: averaged accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: SaylorTwift/bbh
+       split: test
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 34.23
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: lighteval/MATH-Hard
+       split: test
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 22.73
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       split: train
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 10.18
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 14.22
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 32.79
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=prithivMLmods%2FTaurus-Opus-7B
+       name: Open LLM Leaderboard
+ ---
+
+ # **Taurus-Opus-7B**
+
+ Taurus-Opus-7B is built upon the Qwen2.5-7B-Instruct architecture, optimized to provide advanced reasoning capabilities while maintaining efficiency. With 7 billion parameters, it strikes a balance between performance and computational resource requirements. The model has been fine-tuned with a focus on chain-of-thought (CoT) reasoning, leveraging specialized datasets to enhance its problem-solving abilities. Taurus-Opus-7B is designed for tasks requiring logical reasoning, detailed explanations, and multi-step problem-solving, making it ideal for applications such as instruction-following, text generation, and coding assistance.
+
+
+ # **Key Features and Improvements**
+
+ 1. **Optimized Reasoning Capabilities**:
+    The model showcases significant improvements in context understanding, reasoning, and mathematical problem-solving through fine-tuning with long CoT datasets.
+
+ 2. **Enhanced Instruction Following**:
+    Taurus-Opus-7B excels in generating long, coherent outputs (up to 4K tokens), understanding structured data, and producing structured outputs like JSON.
+
+ 3. **Lightweight Efficiency**:
+    Its 7B parameter size makes it more resource-efficient than larger models while retaining high-quality performance for reasoning and content generation tasks.
+
+ 4. **Long-Context Support**:
+    Offers support for long contexts of up to 64K tokens, enabling the handling of large datasets or extended conversations.
+
+ 5. **Multilingual Proficiency**:
+    The model supports 20+ languages, including English, Spanish, French, German, Portuguese, Chinese, Japanese, and more, making it suitable for global applications.
+
+ # **Quickstart with transformers**
+
+ Here’s a code snippet to load **Taurus-Opus-7B** using the `transformers` library:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "prithivMLmods/Taurus-Opus-7B"
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ prompt = "Explain the importance of chain-of-thought reasoning in large language models."
+ messages = [
+     {"role": "system", "content": "You are a helpful assistant with expertise in logical reasoning and problem-solving."},
+     {"role": "user", "content": prompt}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=512
+ )
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ ```
+ # **Intended Use**
+
+ 1. **Reasoning and Context Understanding**:
+    Taurus-Opus-7B is tailored for complex reasoning tasks, contextual understanding, and solving problems requiring logical deduction.
+
+ 2. **Mathematical Problem-Solving**:
+    Designed for advanced mathematical reasoning and calculations, making it valuable for education, research, and engineering tasks.
+
+ 3. **Code Assistance**:
+    Provides robust coding support, including writing, debugging, and optimizing code across multiple programming languages.
+
+ 4. **Data Analysis**:
+    Excels in analyzing structured data and generating structured outputs, aiding automation workflows and data-driven insights.
+
+ 5. **Multilingual Support**:
+    Facilitates applications such as multilingual chatbots, content generation, and translation in 20+ languages.
+
+ 6. **Extended Content Generation**:
+    Suitable for generating detailed reports, articles, and instructional guides, handling outputs up to 4K tokens.
+
+ # **Limitations**
+
+ 1. **Hardware Requirements**:
+    While more efficient than larger models, Taurus-Opus-7B still requires high-memory GPUs or TPUs for optimal performance.
+
+ 2. **Language Quality Variations**:
+    Output quality may vary across supported languages, especially for less commonly used languages.
+
+ 3. **Creativity Limitations**:
+    The model may sometimes generate repetitive or inconsistent results in creative or highly subjective tasks.
+
+ 4. **Real-Time Knowledge Constraints**:
+    The model lacks awareness of events or knowledge updates beyond its training data.
+
+ 5. **Prompt Dependency**:
+    Results heavily depend on the specificity and clarity of input prompts, requiring well-structured queries for the best performance.
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/prithivMLmods__Taurus-Opus-7B-details)!
+ Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=prithivMLmods%2FTaurus-Opus-7B&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
+
+ | Metric              | Value (%) |
+ |---------------------|----------:|
+ | **Average**         |     26.06 |
+ | IFEval (0-Shot)     |     42.23 |
+ | BBH (3-Shot)        |     34.23 |
+ | MATH Lvl 5 (4-Shot) |     22.73 |
+ | GPQA (0-shot)       |     10.18 |
+ | MuSR (0-shot)       |     14.22 |
+ | MMLU-PRO (5-shot)   |     32.79 |
+
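The substance of this PR is the `language:` metadata change: the single two-letter `en` tag is replaced by a list of three-letter ISO 639-3 codes. A minimal stand-alone sanity check of the new values (the list below is copied from the diff above; the shape checks are a sketch, not an exhaustive ISO 639-3 validation):

```python
# Language codes introduced by this PR, copied verbatim from the diff above.
new_codes = ["zho", "eng", "fra", "spa", "por", "deu", "ita",
             "rus", "jpn", "kor", "vie", "tha", "ara"]

# ISO 639-3 identifiers are exactly three lowercase ASCII letters,
# and a model card's language list should not contain duplicates.
assert len(new_codes) == len(set(new_codes)), "duplicate language tag"
assert all(len(c) == 3 and c.isascii() and c.isalpha() and c.islower()
           for c in new_codes), "not a three-letter lowercase code"
print(f"{len(new_codes)} language tags OK")  # → 13 language tags OK
```

This only verifies the codes are well-formed; confirming each code maps to a language the base model actually supports would need a full ISO 639-3 table.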