Commit be19a15 by mzbac · verified · 1 parent: c210519

Update README.md

  1. README.md +25 -1
README.md CHANGED
@@ -2,7 +2,31 @@
2
  license: mit
3
  ---
4
 
5
- A Moe model was constructed using microsoft/phi-2, g-ronimo/phi-2-OpenHermes-2.5, and mlx-community/phi-2-dpo-7k as the foundation. Then qlorato all layers of q,v, and gate linear on WizardLM_evol_instruct_70k via mlx.
6
 
7
  ## Example
8
  ```
 
2
  license: mit
3
  ---
4
 
5
+ The MoE model was constructed using microsoft/phi-2 as the base, with experts from microsoft/phi-2, g-ronimo/phi-2-OpenHermes-2.5, and mlx-community/phi-2-dpo-7k. QLoRA fine-tuning was then applied to the q, v, and gate linear layers in all layers, using WizardLM_evol_instruct_70k via mlx.
6
+ The model was created using a script from https://github.com/mzbac/mlx-moe
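
The description above mentions a gate linear layer that sits in front of the three experts. As a rough illustration of what such a gate does in a typical mixture-of-experts layer, here is a minimal pure-Python sketch of top-k routing. It is not taken from the mlx-moe script: the function names, the choice of `k=2`, and the toy numbers are all assumptions made for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of gate logits.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_logits, expert_outputs, k=2):
    """Mix the k highest-scoring experts, weighted by renormalized gate probabilities.

    Hypothetical helper: the real model's routing may differ.
    """
    probs = softmax(gate_logits)
    # Indices of the k experts with the largest gate probability.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    dim = len(expert_outputs[0])
    mixed = [0.0] * dim
    for i in top:
        weight = probs[i] / norm
        for d in range(dim):
            mixed[d] += weight * expert_outputs[i][d]
    return top, mixed

# Toy example: three experts (matching the three source models), 2-dim outputs.
gate_logits = [2.0, 0.5, 1.0]  # output of the gate linear layer for one token (made up)
expert_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up expert outputs
chosen, mixed = route_top_k(gate_logits, expert_outputs, k=2)
print(chosen, mixed)
```

In this sketch only the selected experts contribute to the token's output, which is why the gate weights are a natural target for fine-tuning alongside the attention projections.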
7
+
8
+ ## Evaluation
9
+
10
+ ### hellaswag
11
+ | Tasks |Version|Filter|n-shot| Metric |Value | |Stderr|
12
+ |---------|------:|------|-----:|--------|-----:|---|-----:|
13
+ |hellaswag| 1|none | 0|acc |0.5482|± |0.0050|
14
+ | | |none | 0|acc_norm|0.7300|± |0.0044|
15
+ ### MMLU
16
+ | Groups |Version|Filter|n-shot|Metric|Value | |Stderr|
17
+ |------------------|-------|------|-----:|------|-----:|---|-----:|
18
+ | - humanities |N/A |none | 0|acc |0.5817|± |0.0247|
19
+ | - other |N/A |none | 0|acc |0.5795|± |0.0311|
20
+ | - social_sciences|N/A |none | 0|acc |0.6347|± |0.0292|
21
+ | - stem |N/A |none | 0|acc |0.4486|± |0.0376|
22
+ ### BBH
23
+ | Tasks |Version| Filter |n-shot| Metric |Value | |Stderr|
24
+ |----------------------|------:|----------|-----:|-----------|-----:|---|-----:|
25
+ |bbh_cot_fewshot_snarks| 2|get-answer| 3|exact_match|0.5281|± |0.0375|
26
+ ### GSM8k
27
+ |Tasks|Version| Filter |n-shot| Metric |Value | |Stderr|
28
+ |-----|------:|----------|-----:|-----------|-----:|---|-----:|
29
+ |gsm8k| 2|get-answer| 5|exact_match|0.5224|± |0.0138|
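
To make the `Value ± Stderr` columns easier to compare, the following sketch converts a few of the reported rows into approximate 95% intervals. It assumes the `Stderr` column is a standard error of the mean and that a normal approximation is reasonable; the helper name `confidence_interval` is hypothetical and not part of any evaluation tool.

```python
def confidence_interval(value, stderr, z=1.96):
    # Normal-approximation interval: value +/- z * stderr (z=1.96 for ~95%).
    half_width = z * stderr
    return value - half_width, value + half_width

# (value, stderr) pairs copied from the tables above.
results = {
    "hellaswag acc": (0.5482, 0.0050),
    "hellaswag acc_norm": (0.7300, 0.0044),
    "gsm8k exact_match": (0.5224, 0.0138),
}

for name, (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{name}: {value:.4f} (approx. 95% interval {low:.4f} to {high:.4f})")
```

Rows with larger standard errors, such as the MMLU groups and the single BBH task, give correspondingly wider intervals, so small differences between those scores should be read with caution.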
30
 
31
  ## Example
32
  ```