SnifferCaptain committed on
Commit 5c6838f · verified · 1 Parent(s): 19d9458

Update README.md

Files changed (1)
  1. README.md +67 -1
README.md CHANGED
@@ -50,7 +50,73 @@ YModel2 is the most powerful Large Language Model (LLM) trained by SnifferCaptai
  - For Supervised Fine-Tuning (SFT), the model was trained at the following sequence length and learning rate combinations: 512/1e-5, 1024/3e-6, 2048/1e-6, and 2048/5e-7 (Length/LR). This stage was also accelerated with bf16 AMP.
  
  ## Model Performance
- The model has not been run on any benchmark. Its final perplexity (ppl) on the training set is about 3.0.
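+ The model was benchmarked on several datasets; the scores are for entertainment and reference only:
+
+ The benchmark results below were obtained with the lm_eval framework: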
+ | Groups |Version|Filter|n-shot| Metric | |Value | |Stderr|
+ |-----------|------:|------|------|--------|---|-----:|---|-----:|
+ |ceval-valid| 2|none | 0|acc |↑ |0.2303|± |0.0115|
+ <details style="color:rgb(128,128,128)">
+ <summary>ceval bench result</summary>
+
+ | Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
+ |----------------------------------------------------|------:|------|-----:|--------|---|-----:|---|-----:|
+ |ceval-valid | 2|none | |acc |↑ |0.2303|± |0.0115|
+ |ceval-valid_accountant | 2|none | 0|acc |↑ |0.2245|± |0.0602|
+ |ceval-valid_advanced_mathematics | 2|none | 0|acc |↑ |0.3158|± |0.1096|
+ |ceval-valid_art_studies | 2|none | 0|acc |↑ |0.4545|± |0.0880|
+ |ceval-valid_basic_medicine | 2|none | 0|acc |↑ |0.0526|± |0.0526|
+ |ceval-valid_business_administration | 2|none | 0|acc |↑ |0.2424|± |0.0758|
+ |ceval-valid_chinese_language_and_literature | 2|none | 0|acc |↑ |0.2174|± |0.0879|
+ |ceval-valid_civil_servant | 2|none | 0|acc |↑ |0.2553|± |0.0643|
+ |ceval-valid_clinical_medicine | 2|none | 0|acc |↑ |0.2273|± |0.0914|
+ |ceval-valid_college_chemistry | 2|none | 0|acc |↑ |0.1667|± |0.0777|
+ |ceval-valid_college_economics | 2|none | 0|acc |↑ |0.2909|± |0.0618|
+ |ceval-valid_college_physics | 2|none | 0|acc |↑ |0.2105|± |0.0961|
+ |ceval-valid_college_programming | 2|none | 0|acc |↑ |0.2432|± |0.0715|
+ |ceval-valid_computer_architecture | 2|none | 0|acc |↑ |0.2857|± |0.1010|
+ |ceval-valid_computer_network | 2|none | 0|acc |↑ |0.1053|± |0.0723|
+ |ceval-valid_discrete_mathematics | 2|none | 0|acc |↑ |0.3750|± |0.1250|
+ |ceval-valid_education_science | 2|none | 0|acc |↑ |0.2414|± |0.0809|
+ |ceval-valid_electrical_engineer | 2|none | 0|acc |↑ |0.2162|± |0.0686|
+ |ceval-valid_environmental_impact_assessment_engineer| 2|none | 0|acc |↑ |0.1613|± |0.0672|
+ |ceval-valid_fire_engineer | 2|none | 0|acc |↑ |0.2581|± |0.0799|
+ |ceval-valid_high_school_biology | 2|none | 0|acc |↑ |0.3684|± |0.1137|
+ |ceval-valid_high_school_chemistry | 2|none | 0|acc |↑ |0.2105|± |0.0961|
+ |ceval-valid_high_school_chinese | 2|none | 0|acc |↑ |0.2105|± |0.0961|
+ |ceval-valid_high_school_geography | 2|none | 0|acc |↑ |0.2105|± |0.0961|
+ |ceval-valid_high_school_history | 2|none | 0|acc |↑ |0.3000|± |0.1051|
+ |ceval-valid_high_school_mathematics | 2|none | 0|acc |↑ |0.2222|± |0.1008|
+ |ceval-valid_high_school_physics | 2|none | 0|acc |↑ |0.2105|± |0.0961|
+ |ceval-valid_high_school_politics | 2|none | 0|acc |↑ |0.2105|± |0.0961|
+ |ceval-valid_ideological_and_moral_cultivation | 2|none | 0|acc |↑ |0.2632|± |0.1038|
+ |ceval-valid_law | 2|none | 0|acc |↑ |0.2500|± |0.0903|
+ |ceval-valid_legal_professional | 2|none | 0|acc |↑ |0.0435|± |0.0435|
+ |ceval-valid_logic | 2|none | 0|acc |↑ |0.1818|± |0.0842|
+ |ceval-valid_mao_zedong_thought | 2|none | 0|acc |↑ |0.3333|± |0.0983|
+ |ceval-valid_marxism | 2|none | 0|acc |↑ |0.2632|± |0.1038|
+ |ceval-valid_metrology_engineer | 2|none | 0|acc |↑ |0.1250|± |0.0690|
+ |ceval-valid_middle_school_biology | 2|none | 0|acc |↑ |0.1905|± |0.0878|
+ |ceval-valid_middle_school_chemistry | 2|none | 0|acc |↑ |0.1500|± |0.0819|
+ |ceval-valid_middle_school_geography | 2|none | 0|acc |↑ |0.0833|± |0.0833|
+ |ceval-valid_middle_school_history | 2|none | 0|acc |↑ |0.1818|± |0.0842|
+ |ceval-valid_middle_school_mathematics | 2|none | 0|acc |↑ |0.1579|± |0.0859|
+ |ceval-valid_middle_school_physics | 2|none | 0|acc |↑ |0.2105|± |0.0961|
+ |ceval-valid_middle_school_politics | 2|none | 0|acc |↑ |0.2857|± |0.1010|
+ |ceval-valid_modern_chinese_history | 2|none | 0|acc |↑ |0.1739|± |0.0808|
+ |ceval-valid_operating_system | 2|none | 0|acc |↑ |0.1579|± |0.0859|
+ |ceval-valid_physician | 2|none | 0|acc |↑ |0.2653|± |0.0637|
+ |ceval-valid_plant_protection | 2|none | 0|acc |↑ |0.3182|± |0.1016|
+ |ceval-valid_probability_and_statistics | 2|none | 0|acc |↑ |0.1111|± |0.0762|
+ |ceval-valid_professional_tour_guide | 2|none | 0|acc |↑ |0.3448|± |0.0898|
+ |ceval-valid_sports_science | 2|none | 0|acc |↑ |0.1579|± |0.0859|
+ |ceval-valid_tax_accountant | 2|none | 0|acc |↑ |0.1633|± |0.0533|
+ |ceval-valid_teacher_qualification | 2|none | 0|acc |↑ |0.2955|± |0.0696|
+ |ceval-valid_urban_and_rural_planner | 2|none | 0|acc |↑ |0.2174|± |0.0615|
+ |ceval-valid_veterinary_medicine | 2|none | 0|acc |↑ |0.2174|± |0.0879|
+
+ </details>
+
  Below are the model's question-answering outputs:
  
  ---
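
As a rough reading aid for the `ceval-valid` row added in this commit, the sketch below compares the reported accuracy with a random-guess baseline. It assumes every C-Eval question is four-option multiple choice, so guessing scores 0.25, and it treats the reported stderr as the standard deviation of a roughly normal estimate. The helper names `z_score` and `normal_two_sided_p` are illustrative and are not part of lm_eval.

```python
import math

# Values copied from the ceval-valid row of the table in this diff.
ACC = 0.2303
STDERR = 0.0115

# Assumption: four answer options per question, so random guessing scores 0.25.
CHANCE = 0.25


def z_score(acc: float, stderr: float, baseline: float) -> float:
    """Number of standard errors separating acc from baseline."""
    return (acc - baseline) / stderr


def normal_two_sided_p(z: float) -> float:
    """Two-sided p-value under a normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


z = z_score(ACC, STDERR, CHANCE)
p = normal_two_sided_p(z)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Under these assumptions the aggregate score sits a little under two standard errors below the chance baseline, which fits the README's own framing of the numbers as an informal reference.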