abarbosa committed · verified
Commit 03bb81c · Parent(s): f7a2827

Pushing fine-tuned model to Hugging Face Hub

README.md ADDED
@@ -0,0 +1,49 @@
+
+---
+language:
+- pt
+- en
+tags:
+- aes
+datasets:
+- kamel-usp/aes_enem_dataset
+base_model: microsoft/Phi-3.5-mini-instruct
+metrics:
+- accuracy
+- qwk
+library_name: peft
+model-index:
+- name: phi35-balanced-C2
+  results:
+  - task:
+      type: text-classification
+      name: Automated Essay Score
+    dataset:
+      name: Automated Essay Score ENEM Dataset
+      type: kamel-usp/aes_enem_dataset
+      config: JBCS2025
+      split: test
+    metrics:
+    - name: Macro F1 (ignoring nan)
+      type: f1
+      value: 0.3635310763177709
+    - name: QWK
+      type: qwk
+      value: 0.3441810010847668
+    - name: Weighted Macro F1
+      type: f1
+      value: 0.3284848932013696
+---
+# Model ID: phi35-balanced-C2
+## Results
+|                              |   test_data |
+|:-----------------------------|------------:|
+| eval_accuracy                |    0.362319 |
+| eval_RMSE                    |   65.3197   |
+| eval_QWK                     |    0.344181 |
+| eval_Macro_F1                |    0.242354 |
+| eval_Macro_F1_(ignoring_nan) |    0.363531 |
+| eval_Weighted_F1             |    0.328485 |
+| eval_Micro_F1                |    0.362319 |
+| eval_HDIV                    |    0.0724638 |
+
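The card's headline metric is QWK (quadratic weighted kappa), which rewards predictions that land close to the true grade on the ordinal 0–200 scale. As a minimal sketch (not the evaluation code used in this run), QWK over the six ENEM grade bins can be computed from the confusion matrix like this:

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes=6):
    """QWK for integer class indices 0..n_classes-1 (e.g. ENEM bins 0,40,...,200)."""
    # Observed confusion matrix.
    O = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        O[t][p] += 1
    n = len(y_true)
    hist_true = [sum(row) for row in O]
    hist_pred = [sum(O[i][j] for i in range(n_classes)) for j in range(n_classes)]
    num = den = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            # Quadratic disagreement weight: 0 on the diagonal, 1 at the corners.
            w = (i - j) ** 2 / (n_classes - 1) ** 2
            num += w * O[i][j]                        # observed disagreement
            den += w * hist_true[i] * hist_pred[j] / n  # chance disagreement
    return 1.0 - num / den

print(quadratic_weighted_kappa([0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5]))  # 1.0
```

`sklearn.metrics.cohen_kappa_score(y_true, y_pred, weights="quadratic")` computes the same quantity.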
adapter_config.json ADDED
@@ -0,0 +1,39 @@
+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "microsoft/Phi-3.5-mini-instruct",
+  "bias": "none",
+  "corda_config": null,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": [
+    "classifier",
+    "score"
+  ],
+  "peft_type": "LORA",
+  "r": 8,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "gate_up_proj",
+    "down_proj",
+    "qkv_proj",
+    "o_proj"
+  ],
+  "task_type": "SEQ_CLS",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_rslora": false
+}
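This config pins down the adapter's size: each adapted linear of shape `d_in × d_out` gains a LoRA pair contributing `r * (d_in + d_out)` parameters, and the `score` head in `modules_to_save` is trained in full. A back-of-the-envelope count, assuming Phi-3.5-mini's dimensions (hidden 3072, intermediate 8192, 32 layers, fused qkv and gate/up projections) as logged in the base model's config:

```python
# Rough trainable-parameter count for this adapter. The layer dimensions below
# are assumptions taken from the Phi3Config dump in run_experiment.log.
hidden, inter, layers, r, num_labels = 3072, 8192, 32, 8, 6

# (d_in, d_out) of each LoRA target module within one decoder layer.
targets = {
    "qkv_proj": (hidden, 3 * hidden),     # fused q/k/v projection (n_kv_heads == n_heads)
    "o_proj": (hidden, hidden),
    "gate_up_proj": (hidden, 2 * inter),  # fused gate + up projection
    "down_proj": (inter, hidden),
}

# LoRA pair A (r × d_in) plus B (d_out × r) per adapted linear, per layer.
lora_params = layers * sum(r * (d_in + d_out) for d_in, d_out in targets.values())
score_head = hidden * num_labels  # "score" is fully trained via modules_to_save
total = lora_params + score_head
print(total)  # 12601344 — matches "Number of trainable parameters" in the log
```

The result agrees exactly with the 12,601,344 trainable parameters reported by the Trainer in `run_experiment.log`.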
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:50de80c24a7858e5f5c700dc2599f9c6ce2987af8129d99367bb8e42fb6aac2d
+size 50402728
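What got committed here is not the weights themselves but a Git LFS pointer file (spec v1): three `key value` lines giving the spec version, the SHA-256 of the real blob, and its byte size. A minimal sketch of reading such a pointer:

```python
# Minimal sketch: parse a Git LFS pointer file (spec v1) into a dict.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # first space separates key from value
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:50de80c24a7858e5f5c700dc2599f9c6ce2987af8129d99367bb8e42fb6aac2d
size 50402728"""

info = parse_lfs_pointer(pointer)
print(int(info["size"]) / 1e6)  # 50.402728 — the adapter weights are ~50 MB
```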
run_experiment.log ADDED
@@ -0,0 +1,1884 @@
+[2025-03-24 19:26:35,748][__main__][INFO] - cache_dir: /media/data/tmp
+dataset:
+  name: kamel-usp/aes_enem_dataset
+  split: JBCS2025
+training_params:
+  seed: 42
+  num_train_epochs: 20
+  logging_steps: 100
+  metric_for_best_model: QWK
+  bf16: true
+post_training_results:
+  model_path: /workspace/jbcs2025/outputs/2025-03-23/23-16-55
+experiments:
+  model:
+    name: microsoft/Phi-3.5-mini-instruct
+    type: phi35_classification_lora
+    num_labels: 6
+    output_dir: ./results/phi35-balanced/C2
+    logging_dir: ./logs/phi35-balanced/C2
+    best_model_dir: ./results/phi35-balanced/C2/best_model
+    lora_r: 8
+    lora_dropout: 0.05
+    lora_alpha: 16
+    lora_target_modules: all-linear
+  dataset:
+    grade_index: 1
+  training_id: phi35-balanced-C2
+  training_params:
+    weight_decay: 0.01
+    warmup_ratio: 0.1
+    learning_rate: 5.0e-05
+    train_batch_size: 2
+    eval_batch_size: 16
+    gradient_accumulation_steps: 8
+    gradient_checkpointing: false
+
+[2025-03-24 19:26:35,750][__main__][INFO] - Starting the Fine Tuning training process.
+[2025-03-24 19:26:41,879][transformers.tokenization_utils_base][INFO] - loading file tokenizer.model from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/tokenizer.model
+[2025-03-24 19:26:41,879][transformers.tokenization_utils_base][INFO] - loading file tokenizer.json from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/tokenizer.json
+[2025-03-24 19:26:41,879][transformers.tokenization_utils_base][INFO] - loading file added_tokens.json from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/added_tokens.json
+[2025-03-24 19:26:41,879][transformers.tokenization_utils_base][INFO] - loading file special_tokens_map.json from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/special_tokens_map.json
+[2025-03-24 19:26:41,879][transformers.tokenization_utils_base][INFO] - loading file tokenizer_config.json from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/tokenizer_config.json
+[2025-03-24 19:26:41,879][transformers.tokenization_utils_base][INFO] - loading file chat_template.jinja from cache at None
+[2025-03-24 19:26:41,954][transformers.tokenization_utils_base][INFO] - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
+[2025-03-24 19:26:41,960][__main__][INFO] - Tokenizer function parameters- Padding:longest; Truncation: False
+[2025-03-24 19:26:42,733][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
+[2025-03-24 19:26:42,734][transformers.configuration_utils][INFO] - Model config Phi3Config {
+  "architectures": ["Phi3ForCausalLM"],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "auto_map": {
+    "AutoConfig": "microsoft/Phi-3.5-mini-instruct--configuration_phi3.Phi3Config",
+    "AutoModelForCausalLM": "microsoft/Phi-3.5-mini-instruct--modeling_phi3.Phi3ForCausalLM"
+  },
+  "bos_token_id": 1,
+  "embd_pdrop": 0.0,
+  "eos_token_id": 32000,
+  "hidden_act": "silu",
+  "hidden_size": 3072,
+  "id2label": {"0": 0, "1": 40, "2": 80, "3": 120, "4": 160, "5": 200},
+  "initializer_range": 0.02,
+  "intermediate_size": 8192,
+  "label2id": {"0": 0, "40": 1, "80": 2, "120": 3, "160": 4, "200": 5},
+  "max_position_embeddings": 131072,
+  "model_type": "phi3",
+  "num_attention_heads": 32,
+  "num_hidden_layers": 32,
+  "num_key_value_heads": 32,
+  "original_max_position_embeddings": 4096,
+  "pad_token_id": 32000,
+  "partial_rotary_factor": 1.0,
+  "resid_pdrop": 0.0,
+  "rms_norm_eps": 1e-05,
+  "rope_scaling": {
+    "long_factor": [
+      1.0800000429153442, 1.1100000143051147, 1.1399999856948853, 1.340000033378601,
+      1.5899999141693115, 1.600000023841858, 1.6200000047683716, 2.620000123977661,
+      3.2300000190734863, 3.2300000190734863, 4.789999961853027, 7.400000095367432,
+      7.700000286102295, 9.09000015258789, 12.199999809265137, 17.670000076293945,
+      24.46000099182129, 28.57000160217285, 30.420001983642578, 30.840002059936523,
+      32.590003967285156, 32.93000411987305, 42.320003509521484, 44.96000289916992,
+      50.340003967285156, 50.45000457763672, 57.55000305175781, 57.93000411987305,
+      58.21000289916992, 60.1400032043457, 62.61000442504883, 62.62000274658203,
+      62.71000289916992, 63.1400032043457, 63.1400032043457, 63.77000427246094,
+      63.93000411987305, 63.96000289916992, 63.970001220703125, 64.02999877929688,
+      64.06999969482422, 64.08000183105469, 64.12000274658203, 64.41000366210938,
+      64.4800033569336, 64.51000213623047, 64.52999877929688, 64.83999633789062
+    ],
+    "short_factor": [
+      1.0, 1.0199999809265137, 1.0299999713897705, 1.0299999713897705,
+      1.0499999523162842, 1.0499999523162842, 1.0499999523162842, 1.0499999523162842,
+      1.0499999523162842, 1.0699999332427979, 1.0999999046325684, 1.1099998950958252,
+      1.1599998474121094, 1.1599998474121094, 1.1699998378753662, 1.2899998426437378,
+      1.339999794960022, 1.679999828338623, 1.7899998426437378, 1.8199998140335083,
+      1.8499997854232788, 1.8799997568130493, 1.9099997282028198, 1.9399996995925903,
+      1.9899996519088745, 2.0199997425079346, 2.0199997425079346, 2.0199997425079346,
+      2.0199997425079346, 2.0199997425079346, 2.0199997425079346, 2.0299997329711914,
+      2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914,
+      2.0299997329711914, 2.0299997329711914, 2.0299997329711914, 2.0299997329711914,
+      2.0799996852874756, 2.0899996757507324, 2.189999580383301, 2.2199995517730713,
+      2.5899994373321533, 2.729999542236328, 2.749999523162842, 2.8399994373321533
+    ],
+    "type": "longrope"
+  },
+  "rope_theta": 10000.0,
+  "sliding_window": 262144,
+  "tie_word_embeddings": false,
+  "torch_dtype": "bfloat16",
+  "transformers_version": "4.50.0",
+  "use_cache": true,
+  "vocab_size": 32064
+}
+
+[2025-03-24 19:26:42,735][transformers.modeling_utils][INFO] - loading weights file model.safetensors from cache at /media/data/tmp/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/model.safetensors.index.json
+[2025-03-24 19:26:42,735][transformers.modeling_utils][INFO] - Will use torch_dtype=torch.bfloat16 as defined in model's config object
+[2025-03-24 19:26:42,735][transformers.modeling_utils][INFO] - Instantiating Phi3ForSequenceClassification model under default dtype torch.bfloat16.
+[2025-03-24 19:27:09,890][transformers.modeling_utils][INFO] - Some weights of the model checkpoint at microsoft/Phi-3.5-mini-instruct were not used when initializing Phi3ForSequenceClassification: ['lm_head.weight']
+- This IS expected if you are initializing Phi3ForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+- This IS NOT expected if you are initializing Phi3ForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+[2025-03-24 19:27:09,891][transformers.modeling_utils][WARNING] - Some weights of Phi3ForSequenceClassification were not initialized from the model checkpoint at microsoft/Phi-3.5-mini-instruct and are newly initialized: ['score.weight']
+You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
+[2025-03-24 19:27:39,707][__main__][INFO] - None
+[2025-03-24 19:27:39,709][transformers.training_args][INFO] - PyTorch: setting up devices
+[2025-03-24 19:27:39,749][__main__][INFO] - Total steps: 620. Number of warmup steps: 62
+[2025-03-24 19:27:39,758][transformers.trainer][INFO] - You have loaded a model on multiple GPUs. `is_model_parallel` attribute will be force-set to `True` to avoid any unexpected behavior such as device placement mismatching.
+[2025-03-24 19:27:39,849][transformers.trainer][INFO] - Using auto half precision backend
+[2025-03-24 19:27:39,850][transformers.trainer][WARNING] - No label_names provided for model class `PeftModelForSequenceClassification`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.
+[2025-03-24 19:27:39,857][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-24 19:27:39,868][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-24 19:27:39,869][transformers.trainer][INFO] - Num examples = 132
+[2025-03-24 19:27:39,869][transformers.trainer][INFO] - Batch size = 16
+[2025-03-24 19:27:58,784][transformers][INFO] - {'accuracy': 0.24242424242424243, 'RMSE': 66.332495807108, 'QWK': -0.03813155386082001, 'HDIV': 0.20454545454545459, 'Macro_F1': 0.10504184527454583, 'Micro_F1': 0.24242424242424243, 'Weighted_F1': 0.1534017455634112, 'Macro_F1_(ignoring_nan)': np.float64(0.17506974212424306)}
+[2025-03-24 19:27:58,788][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-24 19:27:59,088][transformers.trainer][INFO] - The following columns in the training set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-24 19:27:59,114][transformers.trainer][INFO] - ***** Running training *****
+[2025-03-24 19:27:59,114][transformers.trainer][INFO] - Num examples = 500
+[2025-03-24 19:27:59,114][transformers.trainer][INFO] - Num Epochs = 20
+[2025-03-24 19:27:59,114][transformers.trainer][INFO] - Instantaneous batch size per device = 2
+[2025-03-24 19:27:59,114][transformers.trainer][INFO] - Total train batch size (w. parallel, distributed & accumulation) = 16
+[2025-03-24 19:27:59,114][transformers.trainer][INFO] - Gradient Accumulation steps = 8
+[2025-03-24 19:27:59,114][transformers.trainer][INFO] - Total optimization steps = 620
+[2025-03-24 19:27:59,116][transformers.trainer][INFO] - Number of trainable parameters = 12,601,344
+[2025-03-24 19:33:57,142][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-24 19:33:57,144][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-24 19:33:57,144][transformers.trainer][INFO] - Num examples = 132
+[2025-03-24 19:33:57,144][transformers.trainer][INFO] - Batch size = 16
+[2025-03-24 19:34:15,625][transformers][INFO] - {'accuracy': 0.4621212121212121, 'RMSE': 49.11335065052284, 'QWK': 0.04061358655953251, 'HDIV': 0.007575757575757569, 'Macro_F1': 0.1404692650765949, 'Micro_F1': 0.4621212121212121, 'Weighted_F1': 0.3043671150128393, 'Macro_F1_(ignoring_nan)': np.float64(0.3511731626914873)}
+[2025-03-24 19:34:15,626][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-24 19:34:15,629][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-32
+[2025-03-24 19:34:16,447][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
+[2025-03-24 19:40:22,418][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-24 19:40:22,421][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-24 19:40:22,421][transformers.trainer][INFO] - Num examples = 132
+[2025-03-24 19:40:22,421][transformers.trainer][INFO] - Batch size = 16
+[2025-03-24 19:40:40,827][transformers][INFO] - {'accuracy': 0.4393939393939394, 'RMSE': 53.143601912576685, 'QWK': 0.2852760736196319, 'HDIV': 0.045454545454545414, 'Macro_F1': 0.2148417214126595, 'Micro_F1': 0.4393939393939394, 'Weighted_F1': 0.3616045380805515, 'Macro_F1_(ignoring_nan)': np.float64(0.26855215176582437)}
+[2025-03-24 19:40:40,828][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-24 19:40:40,830][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-64
+[2025-03-24 19:40:41,379][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
+[2025-03-24 19:40:49,514][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-32] due to args.save_total_limit
+[2025-03-24 19:46:47,318][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+[2025-03-24 19:46:47,320][transformers.trainer][INFO] -
+***** Running Evaluation *****
+[2025-03-24 19:46:47,320][transformers.trainer][INFO] - Num examples = 132
+[2025-03-24 19:46:47,320][transformers.trainer][INFO] - Batch size = 16
+[2025-03-24 19:47:05,629][transformers][INFO] - {'accuracy': 0.4318181818181818, 'RMSE': 52.68545886444927, 'QWK': 0.33076514346439956, 'HDIV': 0.045454545454545414, 'Macro_F1': 0.2661131957473421, 'Micro_F1': 0.4318181818181818, 'Weighted_F1': 0.39607733052855004, 'Macro_F1_(ignoring_nan)': np.float64(0.4435219929122369)}
+[2025-03-24 19:47:05,630][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+[2025-03-24 19:47:05,632][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-96
+[2025-03-24 19:47:06,539][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
+ 1.8199998140335083,
637
+ 1.8499997854232788,
638
+ 1.8799997568130493,
639
+ 1.9099997282028198,
640
+ 1.9399996995925903,
641
+ 1.9899996519088745,
642
+ 2.0199997425079346,
643
+ 2.0199997425079346,
644
+ 2.0199997425079346,
645
+ 2.0199997425079346,
646
+ 2.0199997425079346,
647
+ 2.0199997425079346,
648
+ 2.0299997329711914,
649
+ 2.0299997329711914,
650
+ 2.0299997329711914,
651
+ 2.0299997329711914,
652
+ 2.0299997329711914,
653
+ 2.0299997329711914,
654
+ 2.0299997329711914,
655
+ 2.0299997329711914,
656
+ 2.0299997329711914,
657
+ 2.0799996852874756,
658
+ 2.0899996757507324,
659
+ 2.189999580383301,
660
+ 2.2199995517730713,
661
+ 2.5899994373321533,
662
+ 2.729999542236328,
663
+ 2.749999523162842,
664
+ 2.8399994373321533
665
+ ],
666
+ "type": "longrope"
667
+ },
668
+ "rope_theta": 10000.0,
669
+ "sliding_window": 262144,
670
+ "tie_word_embeddings": false,
671
+ "torch_dtype": "bfloat16",
672
+ "transformers_version": "4.50.0",
673
+ "use_cache": true,
674
+ "vocab_size": 32064
675
+ }
676
+
677
+ [2025-03-24 19:47:15,012][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-64] due to args.save_total_limit
678
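The checkpoint deletions logged above are the Trainer's rotation policy: with `save_total_limit` set, older checkpoints are pruned after each save, and the best checkpoint is protected from pruning when `load_best_model_at_end=True`. A hedged sketch of the relevant `TrainingArguments` fields (illustrative values only, not the exact run config):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="results/phi35-balanced/C2",
    per_device_eval_batch_size=16,   # matches "Batch size = 16" above
    eval_strategy="epoch",
    save_strategy="epoch",
    save_total_limit=1,           # prune older checkpoints, as logged above
    load_best_model_at_end=True,  # best checkpoint survives the pruning
    metric_for_best_model="QWK",  # assumption: the run's selection metric
)
```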
+ [2025-03-24 19:53:13,045][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
679
+ [2025-03-24 19:53:13,048][transformers.trainer][INFO] -
680
+ ***** Running Evaluation *****
681
+ [2025-03-24 19:53:13,048][transformers.trainer][INFO] - Num examples = 132
682
+ [2025-03-24 19:53:13,048][transformers.trainer][INFO] - Batch size = 16
683
+ [2025-03-24 19:53:31,528][transformers][INFO] - {'accuracy': 0.49242424242424243, 'RMSE': 47.60952285695233, 'QWK': 0.12331297059241364, 'HDIV': 0.022727272727272707, 'Macro_F1': 0.18550724637681157, 'Micro_F1': 0.49242424242424243, 'Weighted_F1': 0.36736934563021517, 'Macro_F1_(ignoring_nan)': np.float64(0.46376811594202894)}
684
+ [2025-03-24 19:53:31,528][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
685
+ [2025-03-24 19:53:31,531][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-128
686
+ [2025-03-24 19:53:31,974][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
687
+ [2025-03-24 19:59:37,807][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
827
+ [2025-03-24 19:59:37,809][transformers.trainer][INFO] -
828
+ ***** Running Evaluation *****
829
+ [2025-03-24 19:59:37,809][transformers.trainer][INFO] - Num examples = 132
830
+ [2025-03-24 19:59:37,809][transformers.trainer][INFO] - Batch size = 16
831
+ [2025-03-24 19:59:56,127][transformers][INFO] - {'accuracy': 0.38636363636363635, 'RMSE': 58.981250230796896, 'QWK': 0.35724465558194773, 'HDIV': 0.037878787878787845, 'Macro_F1': 0.21283275639401986, 'Micro_F1': 0.38636363636363635, 'Weighted_F1': 0.33311875189695395, 'Macro_F1_(ignoring_nan)': np.float64(0.3547212606566998)}
832
+ [2025-03-24 19:59:56,128][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
833
+ [2025-03-24 19:59:56,131][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-160
834
+ [2025-03-24 19:59:57,072][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
835
+ [2025-03-24 20:00:05,315][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-96] due to args.save_total_limit
975
+ [2025-03-24 20:00:05,332][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-128] due to args.save_total_limit
976
+ [2025-03-24 20:06:02,991][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
977
+ [2025-03-24 20:06:02,994][transformers.trainer][INFO] -
978
+ ***** Running Evaluation *****
979
+ [2025-03-24 20:06:02,994][transformers.trainer][INFO] - Num examples = 132
980
+ [2025-03-24 20:06:02,994][transformers.trainer][INFO] - Batch size = 16
981
+ [2025-03-24 20:06:21,333][transformers][INFO] - {'accuracy': 0.42424242424242425, 'RMSE': 48.36728170358491, 'QWK': 0.2587290502793297, 'HDIV': 0.022727272727272707, 'Macro_F1': 0.1805673137741449, 'Micro_F1': 0.42424242424242425, 'Weighted_F1': 0.36980931781690796, 'Macro_F1_(ignoring_nan)': np.float64(0.2708509706612173)}
982
+ [2025-03-24 20:06:21,334][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
983
+ [2025-03-24 20:06:21,337][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-192
984
+ [2025-03-24 20:06:21,788][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
985
+ [2025-03-24 20:12:27,944][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
1125
+ [2025-03-24 20:12:27,946][transformers.trainer][INFO] -
1126
+ ***** Running Evaluation *****
1127
+ [2025-03-24 20:12:27,946][transformers.trainer][INFO] - Num examples = 132
1128
+ [2025-03-24 20:12:27,946][transformers.trainer][INFO] - Batch size = 16
1129
+ [2025-03-24 20:12:46,237][transformers][INFO] - {'accuracy': 0.3560606060606061, 'RMSE': 51.75700801618925, 'QWK': 0.33579234972677596, 'HDIV': 0.05303030303030298, 'Macro_F1': 0.19786302390851182, 'Micro_F1': 0.3560606060606061, 'Weighted_F1': 0.3335142643286444, 'Macro_F1_(ignoring_nan)': np.float64(0.2967945358627677)}
1130
+ [2025-03-24 20:12:46,237][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
1131
+ [2025-03-24 20:12:46,240][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-224
1132
+ [2025-03-24 20:12:46,833][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
1133
+ [2025-03-24 20:12:55,009][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-192] due to args.save_total_limit
1273
+ [2025-03-24 20:18:52,733][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
1274
+ [2025-03-24 20:18:52,736][transformers.trainer][INFO] -
1275
+ ***** Running Evaluation *****
1276
+ [2025-03-24 20:18:52,736][transformers.trainer][INFO] - Num examples = 132
1277
+ [2025-03-24 20:18:52,736][transformers.trainer][INFO] - Batch size = 16
1278
+ [2025-03-24 20:19:11,023][transformers][INFO] - {'accuracy': 0.3333333333333333, 'RMSE': 61.00173867368143, 'QWK': 0.3459651387992253, 'HDIV': 0.06060606060606055, 'Macro_F1': 0.26468434343434344, 'Micro_F1': 0.3333333333333333, 'Weighted_F1': 0.33276419498010407, 'Macro_F1_(ignoring_nan)': np.float64(0.33085542929292927)}
1279
+ [2025-03-24 20:19:11,024][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
1280
+ [2025-03-24 20:19:11,026][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-256
1281
+ [2025-03-24 20:19:11,512][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
1282
+ [2025-03-24 20:19:19,615][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-224] due to args.save_total_limit
1422
+ [2025-03-24 20:25:17,239][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
1423
+ [2025-03-24 20:25:17,242][transformers.trainer][INFO] -
1424
+ ***** Running Evaluation *****
1425
+ [2025-03-24 20:25:17,242][transformers.trainer][INFO] - Num examples = 132
1426
+ [2025-03-24 20:25:17,242][transformers.trainer][INFO] - Batch size = 16
1427
+ [2025-03-24 20:25:35,922][transformers][INFO] - {'accuracy': 0.4166666666666667, 'RMSE': 55.37749241945383, 'QWK': 0.3101710319755432, 'HDIV': 0.045454545454545414, 'Macro_F1': 0.23327141209785254, 'Micro_F1': 0.4166666666666667, 'Weighted_F1': 0.3928450028485377, 'Macro_F1_(ignoring_nan)': np.float64(0.34990711814677883)}
1428
+ [2025-03-24 20:25:35,923][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
1429
+ [2025-03-24 20:25:35,926][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-288
1430
+ [2025-03-24 20:25:36,379][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
+ [2025-03-24 20:25:44,605][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-256] due to args.save_total_limit
+ [2025-03-24 20:31:42,515][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+ [2025-03-24 20:31:42,518][transformers.trainer][INFO] -
+ ***** Running Evaluation *****
+ [2025-03-24 20:31:42,518][transformers.trainer][INFO] - Num examples = 132
+ [2025-03-24 20:31:42,518][transformers.trainer][INFO] - Batch size = 16
+ [2025-03-24 20:32:01,004][transformers][INFO] - {'accuracy': 0.45454545454545453, 'RMSE': 53.143601912576685, 'QWK': 0.24462127910403786, 'HDIV': 0.045454545454545414, 'Macro_F1': 0.27681543857724816, 'Micro_F1': 0.45454545454545453, 'Weighted_F1': 0.42073346116794536, 'Macro_F1_(ignoring_nan)': np.float64(0.3460192982215602)}
+ [2025-03-24 20:32:01,005][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+ [2025-03-24 20:32:01,008][transformers.trainer][INFO] - Saving model checkpoint to /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-320
+ [2025-03-24 20:32:01,457][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
+ [2025-03-24 20:32:09,598][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-288] due to args.save_total_limit
+ [2025-03-24 20:32:09,615][transformers.trainer][INFO] -
+
+ Training completed. Do not forget to share your model on huggingface.co/models =)
+
+
+ [2025-03-24 20:32:09,616][transformers.trainer][INFO] - Loading best model from /workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-160 (score: 0.35724465558194773).
+ [2025-03-24 20:32:31,185][transformers.trainer][INFO] - Deleting older checkpoint [/workspace/jbcs2025/outputs/2025-03-24/19-26-35/results/phi35-balanced/C2/checkpoint-320] due to args.save_total_limit
+ [2025-03-24 20:32:31,204][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+ [2025-03-24 20:32:31,207][transformers.trainer][INFO] -
+ ***** Running Evaluation *****
+ [2025-03-24 20:32:31,207][transformers.trainer][INFO] - Num examples = 132
+ [2025-03-24 20:32:31,207][transformers.trainer][INFO] - Batch size = 16
+ [2025-03-24 20:32:49,532][transformers][INFO] - {'accuracy': 0.38636363636363635, 'RMSE': 58.981250230796896, 'QWK': 0.35724465558194773, 'HDIV': 0.037878787878787845, 'Macro_F1': 0.21283275639401986, 'Micro_F1': 0.38636363636363635, 'Weighted_F1': 0.33311875189695395, 'Macro_F1_(ignoring_nan)': np.float64(0.3547212606566998)}
+ [2025-03-24 20:32:49,535][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+ [2025-03-24 20:32:49,536][__main__][INFO] - Training completed successfully.
+ [2025-03-24 20:32:49,537][__main__][INFO] - Running on Test
+ [2025-03-24 20:32:49,537][transformers.trainer][INFO] - The following columns in the evaluation set don't have a corresponding argument in `PeftModelForSequenceClassification.forward` and have been ignored: id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades. If id_prompt, essay_text, id, reference, prompt, essay_year, supporting_text, grades are not expected by `PeftModelForSequenceClassification.forward`, you can safely ignore this message.
+ [2025-03-24 20:32:49,539][transformers.trainer][INFO] -
+ ***** Running Evaluation *****
+ [2025-03-24 20:32:49,539][transformers.trainer][INFO] - Num examples = 138
+ [2025-03-24 20:32:49,539][transformers.trainer][INFO] - Batch size = 16
+ [2025-03-24 20:33:09,361][transformers][INFO] - {'accuracy': 0.36231884057971014, 'RMSE': 65.31972647421809, 'QWK': 0.3441810010847668, 'HDIV': 0.07246376811594202, 'Macro_F1': 0.24235405087851392, 'Micro_F1': 0.36231884057971014, 'Weighted_F1': 0.32848489320136964, 'Macro_F1_(ignoring_nan)': np.float64(0.3635310763177709)}
+ [2025-03-24 20:33:09,362][tensorboardX.summary][INFO] - Summary name eval/Macro_F1_(ignoring_nan) is illegal; using eval/Macro_F1__ignoring_nan_ instead.
+ [2025-03-24 20:33:09,364][transformers.trainer][INFO] - Saving model checkpoint to ./results/phi35-balanced/C2/best_model
+ [2025-03-24 20:33:09,802][transformers.configuration_utils][INFO] - loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--microsoft--Phi-3.5-mini-instruct/snapshots/3145e03a9fd4cdd7cd953c34d9bbf7ad606122ca/config.json
+ [2025-03-24 20:33:17,890][__main__][INFO] - Fine Tuning Finished.
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ea7627393a71896e8eef3332b659e9813e5ec55ab39db76b3f30af4e328bea50
+ size 5432