Commit 3cc10e8 by mkurman (verified) · 1 parent: 8688c15

Update README.md

  1. README.md +21 -0
README.md CHANGED
@@ -85,6 +85,27 @@ _Note:_ Arcee’s internal evals may use different harnesses; avoid cross-harnes
  - **MMLU-Pro** increases difficulty (10 options; more reasoning-heavy); small deltas are still meaningful.
  - **IFEVAL** checks **verifiable** constraints (length, keyword counts, format, etc.).
 
+
+ | mmlu                  | AFM-4.5B-OpenMed | AFM-4.5B |
+ | :-------------------- | :--------------- | :------- |
+ | **other**             |                  |          |
+ | clinical_knowledge    | 67.55            | 65.66    |
+ | college_medicine      | 64.74            | 54.34    |
+ | professional_medicine | 63.97            | 59.56    |
+ | virology              | 49.4             | 48.19    |
+ | **stem**              |                  |          |
+ | anatomy               | 62.96            | 56.3     |
+ | college_biology       | 78.47            | 65.97    |
+ | college_chemistry     | 44.00            | 37.00    |
+ | high_school_biology   | 79.03            | 71.29    |
+ | high_school_chemistry | 53.2             | 43.84    |
+ | **groups**            |                  |          |
+ | humanities            | 56.13            | 50.46    |
+ | other                 | 68.97            | 63.47    |
+ | social sciences       | 73.25            | 68.61    |
+ | stem                  | 48.91            | 42.53    |
+
+
  ### Reproduce (example commands)

  ```bash