Adding metrics for new benchmarks

#6
by tidealwari - opened
Files changed (1)
  1. README.md +53 -3
README.md CHANGED
@@ -99,36 +99,86 @@ for i in output:
99
  <th style="text-align:center; padding:12px 16px; background:linear-gradient(90deg,#f6f8fb,#eef3f9); color:#0b1220; font-weight:700; border-bottom:1px solid rgba(15,23,42,0.06);">
100
  HumanEval
101
  </th>
102
  </tr>
103
  </thead>
104
  <tbody>
105
  <tr style="background:#f7fafc;">
106
  <td style="padding:12px 16px; font-weight:700; color:#07102a;">Qwen2.5-Coder-14B-Qiskit</td>
107
- <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">25.16</td>
108
  <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">49.01</td>
109
  <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">91.46</td>
110
  </tr>
111
  <tr style="background:#ffffff;">
112
  <td style="padding:12px 16px; color:#0f172a;">mistral-small-3.2-24b-qiskit</td>
113
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">20.53</td>
114
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">40.39</td>
115
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">77.49</td>
116
  </tr>
117
  <tr style="background:#ffffff;">
118
  <td style="padding:12px 16px; color:#0f172a;">granite-3.3-8b-qiskit</td>
119
- <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">14.56</td>
120
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">27.15</td>
121
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">62.80</td>
122
  </tr>
123
  <tr style="background:#fbfdff;">
124
  <td style="padding:12px 16px; color:#0f172a;">granite-3.2-8b-qiskit</td>
125
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">9.93</td>
126
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">24.50</td>
127
- <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">57.31</td>
128
  </tr>
129
  </tbody>
130
  </table>
131
 
132
 
133
  ## Training Data
134
 
 
99
  <th style="text-align:center; padding:12px 16px; background:linear-gradient(90deg,#f6f8fb,#eef3f9); color:#0b1220; font-weight:700; border-bottom:1px solid rgba(15,23,42,0.06);">
100
  HumanEval
101
  </th>
102
+ <th style="text-align:center; padding:12px 16px; background:linear-gradient(90deg,#f6f8fb,#eef3f9); color:#0b1220; font-weight:700; border-bottom:1px solid rgba(15,23,42,0.06);">
103
+ ASDiv
104
+ </th>
105
+ <th style="text-align:center; padding:12px 16px; background:linear-gradient(90deg,#f6f8fb,#eef3f9); color:#0b1220; font-weight:700; border-bottom:1px solid rgba(15,23,42,0.06);">
106
+ MathQA
107
+ </th>
108
+ <th style="text-align:center; padding:12px 16px; background:linear-gradient(90deg,#f6f8fb,#eef3f9); color:#0b1220; font-weight:700; border-bottom:1px solid rgba(15,23,42,0.06);">
109
+ SciQ
110
+ </th>
111
+ <th style="text-align:center; padding:12px 16px; background:linear-gradient(90deg,#f6f8fb,#eef3f9); color:#0b1220; font-weight:700; border-bottom:1px solid rgba(15,23,42,0.06);">
112
+ MBPP
113
+ </th>
114
+ <th style="text-align:center; padding:12px 16px; background:linear-gradient(90deg,#f6f8fb,#eef3f9); color:#0b1220; font-weight:700; border-bottom:1px solid rgba(15,23,42,0.06);">
115
+ IFEval
116
+ </th>
117
+ <th style="text-align:center; padding:12px 16px; background:linear-gradient(90deg,#f6f8fb,#eef3f9); color:#0b1220; font-weight:700; border-bottom:1px solid rgba(15,23,42,0.06);">
118
+ CrowsPairs (English)
119
+ </th>
120
+ <th style="text-align:center; padding:12px 16px; background:linear-gradient(90deg,#f6f8fb,#eef3f9); color:#0b1220; font-weight:700; border-bottom:1px solid rgba(15,23,42,0.06);">
121
+ TruthfulQA (MC1 acc)
122
+ </th>
123
  </tr>
124
  </thead>
125
  <tbody>
126
  <tr style="background:#f7fafc;">
127
  <td style="padding:12px 16px; font-weight:700; color:#07102a;">Qwen2.5-Coder-14B-Qiskit</td>
128
+ <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">25.17</td>
129
  <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">49.01</td>
130
  <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">91.46</td>
131
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">4.21</td>
132
+ <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">53.90</td>
133
+ <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">97.00</td>
134
+ <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">77.60</td>
135
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">49.64</td>
136
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">65.18</td>
137
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">37.82</td>
138
  </tr>
139
  <tr style="background:#ffffff;">
140
  <td style="padding:12px 16px; color:#0f172a;">mistral-small-3.2-24b-qiskit</td>
141
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">20.53</td>
142
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">40.39</td>
143
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">77.49</td>
144
+ <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">20.69</td>
145
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">53.40</td>
146
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">96.40</td>
147
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">63.40</td>
148
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">31.66</td>
149
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">67.56</td>
150
+ <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">42.84</td>
151
  </tr>
152
  <tr style="background:#ffffff;">
153
  <td style="padding:12px 16px; color:#0f172a;">granite-3.3-8b-qiskit</td>
154
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">14.57</td>
155
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">27.15</td>
156
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">62.80</td>
157
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">0.48</td>
158
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">38.66</td>
159
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">93.30</td>
160
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">52.40</td>
161
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">59.71</td>
162
+ <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">59.75</td>
163
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">39.05</td>
164
  </tr>
165
  <tr style="background:#fbfdff;">
166
  <td style="padding:12px 16px; color:#0f172a;">granite-3.2-8b-qiskit</td>
167
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">9.93</td>
168
  <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">24.50</td>
169
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">57.32</td>
170
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">0.09</td>
171
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">41.41</td>
172
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">96.30</td>
173
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">51.80</td>
174
+ <td style="padding:12px 16px; text-align:center; font-weight:700; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">60.79</td>
175
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">66.79</td>
176
+ <td style="padding:12px 16px; text-align:center; font-family:ui-monospace, SFMono-Regular, Menlo, Monaco, monospace;">40.51</td>
177
  </tr>
178
  </tbody>
179
  </table>
180
 
181
+ *Note: All models listed in the benchmark table were evaluated using their respective system prompt, defined in their Hugging Face model.*
182
 
183
  ## Training Data
184