jacobpol committed on
Commit 462e674 · verified · Parent(s): 439f24d

Delete commodity_lora/checkpoint-800

commodity_lora/checkpoint-800/README.md DELETED
@@ -1,207 +0,0 @@
- ---
- base_model: Qwen/Qwen3-4B-Instruct-2507
- library_name: peft
- pipeline_tag: text-generation
- tags:
- - base_model:adapter:Qwen/Qwen3-4B-Instruct-2507
- - lora
- - transformers
- ---
-
- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-
-
-
- ## Model Details
-
- ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
-
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
-
- ### Model Sources [optional]
-
- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
-
- ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ### Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
- ### Out-of-Scope Use
-
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]
-
- ## Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-
- ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]
-
- ## Training Details
-
- ### Training Data
-
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]
-
- ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
-
-
- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
-
- ## Evaluation
-
- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]
-
- ### Results
-
- [More Information Needed]
-
- #### Summary
-
-
-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- [More Information Needed]
-
- ### Compute Infrastructure
-
- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
- #### Software
-
- [More Information Needed]
-
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]
-
- ## Model Card Contact
-
- [More Information Needed]
- ### Framework versions
-
- - PEFT 0.18.1
 
commodity_lora/checkpoint-800/adapter_config.json DELETED
@@ -1,46 +0,0 @@
- {
- "alora_invocation_tokens": null,
- "alpha_pattern": {},
- "arrow_config": null,
- "auto_mapping": null,
- "base_model_name_or_path": "Qwen/Qwen3-4B-Instruct-2507",
- "bias": "none",
- "corda_config": null,
- "ensure_weight_tying": false,
- "eva_config": null,
- "exclude_modules": null,
- "fan_in_fan_out": false,
- "inference_mode": true,
- "init_lora_weights": true,
- "layer_replication": null,
- "layers_pattern": null,
- "layers_to_transform": null,
- "loftq_config": {},
- "lora_alpha": 16,
- "lora_bias": false,
- "lora_dropout": 0.05,
- "megatron_config": null,
- "megatron_core": "megatron.core",
- "modules_to_save": null,
- "peft_type": "LORA",
- "peft_version": "0.18.1",
- "qalora_group_size": 16,
- "r": 64,
- "rank_pattern": {},
- "revision": null,
- "target_modules": [
- "down_proj",
- "v_proj",
- "k_proj",
- "q_proj",
- "up_proj",
- "o_proj",
- "gate_proj"
- ],
- "target_parameters": null,
- "task_type": "CAUSAL_LM",
- "trainable_token_indices": null,
- "use_dora": false,
- "use_qalora": false,
- "use_rslora": false
- }
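The deleted adapter_config.json above pins down the LoRA geometry: rank 64, alpha 16, dropout 0.05, applied to all seven attention and MLP projections. A minimal stdlib-only sketch (the inlined `config_text` is a trimmed excerpt of the file above, not a verbatim copy) shows how the effective LoRA scaling factor alpha / r falls out of those values:

```python
import json

# Trimmed excerpt of the deleted adapter_config.json shown above.
config_text = """
{
  "base_model_name_or_path": "Qwen/Qwen3-4B-Instruct-2507",
  "peft_type": "LORA",
  "r": 64,
  "lora_alpha": 16,
  "lora_dropout": 0.05,
  "target_modules": ["down_proj", "v_proj", "k_proj", "q_proj",
                     "up_proj", "o_proj", "gate_proj"],
  "task_type": "CAUSAL_LM"
}
"""

cfg = json.loads(config_text)
# Standard (non-rsLoRA) scaling: the adapter update is scaled by alpha / r.
scaling = cfg["lora_alpha"] / cfg["r"]
print(scaling)  # 0.25
```

With `use_rslora` false in the config, this alpha / r = 0.25 is the multiplier PEFT applies to the low-rank update at inference time.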
 
commodity_lora/checkpoint-800/adapter_model.safetensors DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:283c67b762b2ff71e967897e2f604fc75f16d8bf324cf7b7c841d4ded038daff
- size 528550256
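The large binaries in this commit are stored as git-LFS pointer files, so the diff shows only the three-line pointer (version, oid, size), not the weights themselves. A small sketch parses the pointer for adapter_model.safetensors reproduced above:

```python
# Git-LFS pointer for the deleted adapter_model.safetensors, copied from the
# diff above. Each line is a "key value" pair.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:283c67b762b2ff71e967897e2f604fc75f16d8bf324cf7b7c841d4ded038daff
size 528550256
"""

fields = dict(line.split(" ", 1) for line in pointer.strip().splitlines())
algo, digest = fields["oid"].split(":", 1)
size_mb = int(fields["size"]) / 1e6  # size is in bytes
print(algo, round(size_mb, 1))  # sha256 528.6
```

So the deleted adapter weights were roughly 529 MB, consistent with rank-64 adapters over all seven projection matrices of a 4B-parameter base model.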
 
commodity_lora/checkpoint-800/added_tokens.json DELETED
@@ -1,28 +0,0 @@
- {
- "</think>": 151668,
- "</tool_call>": 151658,
- "</tool_response>": 151666,
- "<think>": 151667,
- "<tool_call>": 151657,
- "<tool_response>": 151665,
- "<|box_end|>": 151649,
- "<|box_start|>": 151648,
- "<|endoftext|>": 151643,
- "<|file_sep|>": 151664,
- "<|fim_middle|>": 151660,
- "<|fim_pad|>": 151662,
- "<|fim_prefix|>": 151659,
- "<|fim_suffix|>": 151661,
- "<|im_end|>": 151645,
- "<|im_start|>": 151644,
- "<|image_pad|>": 151655,
- "<|object_ref_end|>": 151647,
- "<|object_ref_start|>": 151646,
- "<|quad_end|>": 151651,
- "<|quad_start|>": 151650,
- "<|repo_name|>": 151663,
- "<|video_pad|>": 151656,
- "<|vision_end|>": 151653,
- "<|vision_pad|>": 151654,
- "<|vision_start|>": 151652
- }
 
commodity_lora/checkpoint-800/merges.txt DELETED
The diff for this file is too large to render. See raw diff
 
commodity_lora/checkpoint-800/optimizer.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:ae2335beab13d0b5d6269116281c42fba0d4ec19de464fe6e54cb74c216923b5
- size 1057390923
 
commodity_lora/checkpoint-800/rng_state.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:66ec09f675a6aa8635b98da1729c86cc6f7857c570c02142bc0d3915d10737da
- size 14645
 
commodity_lora/checkpoint-800/scheduler.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:1165dc2ed4d844ab3ea5cdb45650f45d9a46d7bbdac2b8690ed3351cb6e73aac
- size 1465
 
commodity_lora/checkpoint-800/special_tokens_map.json DELETED
@@ -1,31 +0,0 @@
- {
- "additional_special_tokens": [
- "<|im_start|>",
- "<|im_end|>",
- "<|object_ref_start|>",
- "<|object_ref_end|>",
- "<|box_start|>",
- "<|box_end|>",
- "<|quad_start|>",
- "<|quad_end|>",
- "<|vision_start|>",
- "<|vision_end|>",
- "<|vision_pad|>",
- "<|image_pad|>",
- "<|video_pad|>"
- ],
- "eos_token": {
- "content": "<|im_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- },
- "pad_token": {
- "content": "<|endoftext|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false
- }
- }
 
commodity_lora/checkpoint-800/tokenizer.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:2f1298e298f2fe0059aba46f037697a339ccba45a1908780ce8ca14b45582f23
- size 11422753
 
commodity_lora/checkpoint-800/tokenizer_config.json DELETED
@@ -1,240 +0,0 @@
- {
- "add_bos_token": false,
- "add_prefix_space": false,
- "added_tokens_decoder": {
- "151643": {
- "content": "<|endoftext|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151644": {
- "content": "<|im_start|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151645": {
- "content": "<|im_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151646": {
- "content": "<|object_ref_start|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151647": {
- "content": "<|object_ref_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151648": {
- "content": "<|box_start|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151649": {
- "content": "<|box_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151650": {
- "content": "<|quad_start|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151651": {
- "content": "<|quad_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151652": {
- "content": "<|vision_start|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151653": {
- "content": "<|vision_end|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151654": {
- "content": "<|vision_pad|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151655": {
- "content": "<|image_pad|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151656": {
- "content": "<|video_pad|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": true
- },
- "151657": {
- "content": "<tool_call>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151658": {
- "content": "</tool_call>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151659": {
- "content": "<|fim_prefix|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151660": {
- "content": "<|fim_middle|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151661": {
- "content": "<|fim_suffix|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151662": {
- "content": "<|fim_pad|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151663": {
- "content": "<|repo_name|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151664": {
- "content": "<|file_sep|>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151665": {
- "content": "<tool_response>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151666": {
- "content": "</tool_response>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151667": {
- "content": "<think>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- },
- "151668": {
- "content": "</think>",
- "lstrip": false,
- "normalized": false,
- "rstrip": false,
- "single_word": false,
- "special": false
- }
- },
- "additional_special_tokens": [
- "<|im_start|>",
- "<|im_end|>",
- "<|object_ref_start|>",
- "<|object_ref_end|>",
- "<|box_start|>",
- "<|box_end|>",
- "<|quad_start|>",
- "<|quad_end|>",
- "<|vision_start|>",
- "<|vision_end|>",
- "<|vision_pad|>",
- "<|image_pad|>",
- "<|video_pad|>"
- ],
- "bos_token": null,
- "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0].role == 'system' %}\n        {{- messages[0].content + '\\n\\n' }}\n    {%- endif %}\n    {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0].role == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if message.content is string %}\n        {%- set content = message.content %}\n    {%- else %}\n        {%- set content = '' %}\n    {%- endif %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n        {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role + '\\n' + content }}\n        {%- if message.tool_calls %}\n            {%- for tool_call in message.tool_calls %}\n                {%- if (loop.first and content) or (not loop.first) %}\n                    {{- '\\n' }}\n                {%- endif %}\n                {%- if tool_call.function %}\n                    {%- set tool_call = tool_call.function %}\n                {%- endif %}\n                {{- '<tool_call>\\n{\"name\": \"' }}\n                {{- tool_call.name }}\n                {{- '\", \"arguments\": ' }}\n                {%- if tool_call.arguments is string %}\n                    {{- tool_call.arguments }}\n                {%- else %}\n                    {{- tool_call.arguments | tojson }}\n                {%- endif %}\n                {{- '}\\n</tool_call>' }}\n            {%- endfor %}\n        {%- endif %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}",
- "clean_up_tokenization_spaces": false,
- "eos_token": "<|im_end|>",
- "errors": "replace",
- "extra_special_tokens": {},
- "model_max_length": 1010000,
- "pad_token": "<|endoftext|>",
- "split_special_tokens": false,
- "tokenizer_class": "Qwen2Tokenizer",
- "unk_token": null
- }
 
commodity_lora/checkpoint-800/trainer_state.json DELETED
@@ -1,694 +0,0 @@
- {
- "best_global_step": null,
- "best_metric": null,
- "best_model_checkpoint": null,
- "epoch": 1.4710211591536337,
- "eval_steps": 200,
- "global_step": 800,
- "is_hyper_param_search": false,
- "is_local_process_zero": true,
- "is_world_process_zero": true,
- "log_history": [
- {
- "epoch": 0.01839926402943882,
- "grad_norm": 0.8415894508361816,
- "learning_rate": 5.4545454545454546e-05,
- "loss": 0.7074,
- "step": 10
- },
- {
- "epoch": 0.03679852805887764,
- "grad_norm": 0.291018009185791,
- "learning_rate": 0.00011515151515151516,
- "loss": 0.3417,
- "step": 20
- },
- {
- "epoch": 0.05519779208831647,
- "grad_norm": 0.1942419707775116,
- "learning_rate": 0.00017575757575757578,
- "loss": 0.1873,
- "step": 30
- },
- {
- "epoch": 0.07359705611775529,
- "grad_norm": 0.100527323782444,
- "learning_rate": 0.00019998397847281548,
- "loss": 0.1741,
- "step": 40
- },
- {
- "epoch": 0.09199632014719411,
- "grad_norm": 0.18655213713645935,
- "learning_rate": 0.0001998860877308941,
- "loss": 0.1396,
- "step": 50
- },
- {
- "epoch": 0.11039558417663294,
- "grad_norm": 0.18256014585494995,
- "learning_rate": 0.0001996992941167792,
- "loss": 0.1354,
- "step": 60
- },
- {
- "epoch": 0.12879484820607176,
- "grad_norm": 0.20178623497486115,
- "learning_rate": 0.0001994237638847428,
- "loss": 0.1199,
- "step": 70
- },
- {
- "epoch": 0.14719411223551057,
- "grad_norm": 0.14032356441020966,
- "learning_rate": 0.00019905974226842613,
- "loss": 0.139,
- "step": 80
- },
- {
- "epoch": 0.1655933762649494,
- "grad_norm": 0.24278348684310913,
- "learning_rate": 0.00019860755326257126,
- "loss": 0.1095,
- "step": 90
- },
- {
- "epoch": 0.18399264029438822,
- "grad_norm": 0.10004690289497375,
- "learning_rate": 0.000198067599334652,
- "loss": 0.1247,
- "step": 100
- },
- {
- "epoch": 0.20239190432382706,
- "grad_norm": 0.1411207616329193,
- "learning_rate": 0.0001974403610666606,
- "loss": 0.1187,
- "step": 110
- },
- {
- "epoch": 0.22079116835326587,
- "grad_norm": 0.20963186025619507,
- "learning_rate": 0.00019672639672736945,
- "loss": 0.1159,
- "step": 120
- },
- {
- "epoch": 0.23919043238270468,
- "grad_norm": 0.12421368062496185,
- "learning_rate": 0.00019592634177544805,
- "loss": 0.1119,
- "step": 130
- },
- {
- "epoch": 0.2575896964121435,
- "grad_norm": 0.3507830798625946,
- "learning_rate": 0.0001950409082938776,
- "loss": 0.109,
- "step": 140
- },
- {
- "epoch": 0.27598896044158233,
- "grad_norm": 0.12496884167194366,
- "learning_rate": 0.000194070884356167,
- "loss": 0.097,
- "step": 150
- },
- {
- "epoch": 0.29438822447102114,
- "grad_norm": 0.17028333246707916,
- "learning_rate": 0.00019301713332493386,
- "loss": 0.085,
- "step": 160
- },
- {
- "epoch": 0.31278748850046,
- "grad_norm": 0.269582062959671,
- "learning_rate": 0.00019188059308347476,
- "loss": 0.1003,
- "step": 170
- },
- {
- "epoch": 0.3311867525298988,
- "grad_norm": 0.14604653418064117,
- "learning_rate": 0.00019066227520100952,
- "loss": 0.1159,
- "step": 180
- },
- {
- "epoch": 0.34958601655933763,
- "grad_norm": 0.09594413638114929,
- "learning_rate": 0.00018936326403234125,
- "loss": 0.1082,
- "step": 190
- },
- {
- "epoch": 0.36798528058877644,
- "grad_norm": 0.13661913573741913,
- "learning_rate": 0.00018798471575273443,
- "loss": 0.1092,
- "step": 200
- },
- {
- "epoch": 0.36798528058877644,
- "eval_loss": 0.059110600501298904,
- "eval_runtime": 16.5945,
- "eval_samples_per_second": 4.399,
- "eval_steps_per_second": 2.23,
- "step": 200
- },
- {
- "eval_ner_f1": 0.0,
- "step": 200
- },
- {
- "eval_ner_precision": 0.0,
- "step": 200
- },
- {
- "eval_ner_recall": 0.0,
- "step": 200
- },
- {
- "eval_ner_f1_commodity": 0.0,
- "step": 200
- },
- {
- "epoch": 0.38638454461821525,
- "grad_norm": 0.1869058758020401,
- "learning_rate": 0.00018652785732886987,
- "loss": 0.0995,
- "step": 210
- },
- {
- "epoch": 0.4047838086476541,
- "grad_norm": 0.09933796525001526,
- "learning_rate": 0.00018499398542679187,
- "loss": 0.0961,
- "step": 220
- },
- {
- "epoch": 0.42318307267709293,
- "grad_norm": 0.16164903342723846,
- "learning_rate": 0.0001833844652578203,
- "loss": 0.1147,
- "step": 230
- },
- {
- "epoch": 0.44158233670653174,
- "grad_norm": 0.059403154999017715,
- "learning_rate": 0.0001817007293634545,
- "loss": 0.0792,
- "step": 240
- },
- {
- "epoch": 0.45998160073597055,
- "grad_norm": 0.18221575021743774,
- "learning_rate": 0.00017994427634035015,
- "loss": 0.0785,
- "step": 250
- },
- {
- "epoch": 0.47838086476540936,
- "grad_norm": 0.0980747789144516,
- "learning_rate": 0.00017811666950650446,
- "loss": 0.0945,
- "step": 260
- },
- {
- "epoch": 0.49678012879484823,
- "grad_norm": 0.10701967030763626,
- "learning_rate": 0.00017621953550983675,
- "loss": 0.086,
- "step": 270
- },
- {
- "epoch": 0.515179392824287,
- "grad_norm": 0.14636573195457458,
- "learning_rate": 0.00017425456288040235,
- "loss": 0.0824,
- "step": 280
- },
- {
- "epoch": 0.5335786568537259,
- "grad_norm": 0.20424717664718628,
- "learning_rate": 0.00017222350052752881,
- "loss": 0.083,
- "step": 290
- },
- {
- "epoch": 0.5519779208831647,
- "grad_norm": 0.18571391701698303,
- "learning_rate": 0.00017012815618321257,
- "loss": 0.0917,
- "step": 300
- },
- {
- "epoch": 0.5703771849126035,
- "grad_norm": 0.145646870136261,
- "learning_rate": 0.00016797039479315992,
- "loss": 0.1055,
- "step": 310
- },
- {
- "epoch": 0.5887764489420423,
- "grad_norm": 0.13823704421520233,
- "learning_rate": 0.0001657521368569064,
- "loss": 0.0816,
- "step": 320
- },
- {
- "epoch": 0.6071757129714811,
- "grad_norm": 0.10453420132398605,
- "learning_rate": 0.00016347535671848998,
- "loss": 0.086,
- "step": 330
- },
- {
- "epoch": 0.62557497700092,
- "grad_norm": 0.1144270971417427,
- "learning_rate": 0.00016114208080920123,
- "loss": 0.0873,
- "step": 340
- },
- {
- "epoch": 0.6439742410303588,
- "grad_norm": 0.25295525789260864,
- "learning_rate": 0.0001587543858439727,
- "loss": 0.0959,
- "step": 350
- },
- {
- "epoch": 0.6623735050597976,
- "grad_norm": 0.20557653903961182,
- "learning_rate": 0.00015631439697301464,
- "loss": 0.0884,
- "step": 360
- },
- {
- "epoch": 0.6807727690892365,
- "grad_norm": 0.13607051968574524,
- "learning_rate": 0.00015382428589034037,
- "loss": 0.1075,
- "step": 370
- },
- {
- "epoch": 0.6991720331186753,
- "grad_norm": 0.0677134320139885,
- "learning_rate": 0.00015128626890086646,
- "loss": 0.0853,
- "step": 380
- },
- {
- "epoch": 0.7175712971481141,
- "grad_norm": 0.13079920411109924,
- "learning_rate": 0.0001487026049478068,
- "loss": 0.0883,
- "step": 390
- },
- {
- "epoch": 0.7359705611775529,
- "grad_norm": 0.2688474953174591,
- "learning_rate": 0.00014607559360211686,
- "loss": 0.098,
- "step": 400
- },
- {
- "epoch": 0.7359705611775529,
- "eval_loss": 0.09841074794530869,
- "eval_runtime": 16.3162,
- "eval_samples_per_second": 4.474,
- "eval_steps_per_second": 2.268,
- "step": 400
- },
- {
- "eval_ner_f1": 0.1904761904761905,
- "step": 400
- },
- {
- "eval_ner_precision": 0.10526315789473684,
- "step": 400
- },
- {
- "eval_ner_recall": 1.0,
- "step": 400
- },
- {
- "eval_ner_f1_commodity": 0.19999999999999998,
- "step": 400
- },
- {
- "eval_ner_f1_organization": 0.0,
- "step": 400
- },
- {
- "epoch": 0.7543698252069917,
- "grad_norm": 0.09586778283119202,
- "learning_rate": 0.00014340757301577788,
- "loss": 0.0798,
- "step": 410
- },
- {
- "epoch": 0.7727690892364305,
- "grad_norm": 0.10425524413585663,
- "learning_rate": 0.0001407009178407417,
- "loss": 0.089,
- "step": 420
- },
- {
- "epoch": 0.7911683532658693,
- "grad_norm": 0.13633865118026733,
- "learning_rate": 0.00013795803711538966,
- "loss": 0.1046,
- "step": 430
- },
- {
- "epoch": 0.8095676172953082,
- "grad_norm": 0.10460830479860306,
- "learning_rate": 0.00013518137212038554,
- "loss": 0.0748,
- "step": 440
- },
- {
- "epoch": 0.827966881324747,
- "grad_norm": 0.11895473301410675,
- "learning_rate": 0.00013237339420583212,
- "loss": 0.0917,
- "step": 450
- },
- {
- "epoch": 0.8463661453541859,
- "grad_norm": 0.10228480398654938,
- "learning_rate": 0.00012953660259166412,
- "loss": 0.0754,
- "step": 460
- },
- {
- "epoch": 0.8647654093836247,
- "grad_norm": 0.15550141036510468,
- "learning_rate": 0.00012667352214323614,
- "loss": 0.0871,
- "step": 470
- },
- {
- "epoch": 0.8831646734130635,
- "grad_norm": 0.07745213061571121,
- "learning_rate": 0.0001237867011240848,
- "loss": 0.0826,
- "step": 480
- },
- {
- "epoch": 0.9015639374425023,
- "grad_norm": 0.20176838338375092,
- "learning_rate": 0.00012087870892786588,
- "loss": 0.0849,
- "step": 490
- },
- {
- "epoch": 0.9199632014719411,
- "grad_norm": 0.14597876369953156,
- "learning_rate": 0.00011795213379148436,
- "loss": 0.1032,
- "step": 500
- },
- {
- "epoch": 0.9383624655013799,
- "grad_norm": 0.13878890872001648,
- "learning_rate": 0.0001150095804914534,
- "loss": 0.0856,
- "step": 510
- },
- {
- "epoch": 0.9567617295308187,
- "grad_norm": 0.1053905189037323,
- "learning_rate": 0.0001120536680255323,
- "loss": 0.0786,
- "step": 520
- },
- {
- "epoch": 0.9751609935602575,
- "grad_norm": 0.2514699399471283,
- "learning_rate": 0.00010908702728170705,
- "loss": 0.0906,
- "step": 530
- },
- {
- "epoch": 0.9935602575896965,
- "grad_norm": 0.1553167700767517,
- "learning_rate": 0.00010611229869658785,
- "loss": 0.0857,
- "step": 540
- },
- {
- "epoch": 1.0110395584176632,
- "grad_norm": 0.09016945213079453,
- "learning_rate": 0.00010313212990530803,
- "loss": 0.0775,
- "step": 550
- },
- {
- "epoch": 1.029438822447102,
- "grad_norm": 0.11983165889978409,
- "learning_rate": 0.00010014917338501618,
- "loss": 0.0704,
- "step": 560
- },
- {
- "epoch": 1.0478380864765409,
- "grad_norm": 0.10803485661745071,
- "learning_rate": 9.716608409405842e-05,
- "loss": 0.0652,
- "step": 570
- },
- {
- "epoch": 1.0662373505059797,
- "grad_norm": 0.08216193318367004,
- "learning_rate": 9.418551710895243e-05,
- "loss": 0.0453,
- "step": 580
- },
- {
- "epoch": 1.0846366145354185,
- "grad_norm": 0.21697314083576202,
- "learning_rate": 9.121012526125626e-05,
- "loss": 0.0554,
- "step": 590
- },
- {
- "epoch": 1.1030358785648575,
- "grad_norm": 0.09792917966842651,
- "learning_rate": 8.824255677643518e-05,
- "loss": 0.0711,
- "step": 600
- },
- {
- "epoch": 1.1030358785648575,
- "eval_loss": 0.07825087010860443,
- "eval_runtime": 16.4391,
- "eval_samples_per_second": 4.441,
- "eval_steps_per_second": 2.251,
- "step": 600
- },
- {
- "eval_ner_f1": 0.19999999999999998,
- "step": 600
- },
- {
- "eval_ner_precision": 0.1111111111111111,
- "step": 600
- },
- {
- "eval_ner_recall": 1.0,
- "step": 600
- },
- {
- "eval_ner_f1_commodity": 0.19999999999999998,
- "step": 600
- },
- {
- "epoch": 1.1214351425942963,
- "grad_norm": 0.12685388326644897,
- "learning_rate": 8.528545291682838e-05,
- "loss": 0.0538,
- "step": 610
- },
- {
- "epoch": 1.1398344066237351,
- "grad_norm": 0.04482072964310646,
- "learning_rate": 8.2341445630813e-05,
- "loss": 0.058,
- "step": 620
- },
- {
- "epoch": 1.158233670653174,
- "grad_norm": 0.10546564310789108,
- "learning_rate": 7.941315521025775e-05,
- "loss": 0.0612,
- "step": 630
- },
- {
- "epoch": 1.1766329346826128,
- "grad_norm": 0.14681607484817505,
- "learning_rate": 7.650318795835179e-05,
- "loss": 0.0574,
- "step": 640
- },
- {
- "epoch": 1.1950321987120516,
- "grad_norm": 0.10861357301473618,
- "learning_rate": 7.361413386988378e-05,
- "loss": 0.0556,
- "step": 650
- },
- {
- "epoch": 1.2134314627414904,
- "grad_norm": 0.07452739030122757,
- "learning_rate": 7.074856432603628e-05,
- "loss": 0.0586,
- "step": 660
- },
- {
- "epoch": 1.2318307267709292,
- "grad_norm": 0.09327604621648788,
- "learning_rate": 6.790902980574685e-05,
- "loss": 0.0669,
- "step": 670
- },
- {
- "epoch": 1.250229990800368,
- "grad_norm": 0.15776699781417847,
- "learning_rate": 6.509805761567336e-05,
- "loss": 0.0543,
- "step": 680
- },
- {
- "epoch": 1.2686292548298068,
- "grad_norm": 0.1362597495317459,
- "learning_rate": 6.231814964078327e-05,
- "loss": 0.062,
- "step": 690
- },
- {
- "epoch": 1.2870285188592456,
- "grad_norm": 0.1402318924665451,
- "learning_rate": 5.957178011756952e-05,
- "loss": 0.0684,
- "step": 700
- },
- {
- "epoch": 1.3054277828886844,
- "grad_norm": 0.09126363694667816,
- "learning_rate": 5.6861393431874675e-05,
582
- "loss": 0.066,
583
- "step": 710
584
- },
585
- {
586
- "epoch": 1.3238270469181233,
587
- "grad_norm": 0.10861663520336151,
588
- "learning_rate": 5.418940194328344e-05,
589
- "loss": 0.0595,
590
- "step": 720
591
- },
592
- {
593
- "epoch": 1.342226310947562,
594
- "grad_norm": 0.10028150677680969,
595
- "learning_rate": 5.1558183838019755e-05,
596
- "loss": 0.0526,
597
- "step": 730
598
- },
599
- {
600
- "epoch": 1.3606255749770009,
601
- "grad_norm": 0.12535513937473297,
602
- "learning_rate": 4.897008101226002e-05,
603
- "loss": 0.0544,
604
- "step": 740
605
- },
606
- {
607
- "epoch": 1.3790248390064397,
608
- "grad_norm": 0.0817708894610405,
609
- "learning_rate": 4.6427396987745555e-05,
610
- "loss": 0.067,
611
- "step": 750
612
- },
613
- {
614
- "epoch": 1.3974241030358785,
615
- "grad_norm": 0.12001190334558487,
616
- "learning_rate": 4.3932394861550106e-05,
617
- "loss": 0.0577,
618
- "step": 760
619
- },
620
- {
621
- "epoch": 1.4158233670653173,
622
- "grad_norm": 0.13876648247241974,
623
- "learning_rate": 4.148729529182736e-05,
624
- "loss": 0.0678,
625
- "step": 770
626
- },
627
- {
628
- "epoch": 1.4342226310947561,
629
- "grad_norm": 0.07272978127002716,
630
- "learning_rate": 3.909427452133016e-05,
631
- "loss": 0.0654,
632
- "step": 780
633
- },
634
- {
635
- "epoch": 1.452621895124195,
636
- "grad_norm": 0.11736124753952026,
637
- "learning_rate": 3.675546244046228e-05,
638
- "loss": 0.0683,
639
- "step": 790
640
- },
641
- {
642
- "epoch": 1.4710211591536337,
643
- "grad_norm": 0.09826095402240753,
644
- "learning_rate": 3.447294069158481e-05,
645
- "loss": 0.0502,
646
- "step": 800
647
- },
648
- {
649
- "epoch": 1.4710211591536337,
650
- "eval_loss": 0.060979194939136505,
651
- "eval_runtime": 16.5781,
652
- "eval_samples_per_second": 4.403,
653
- "eval_steps_per_second": 2.232,
654
- "step": 800
655
- },
656
- {
657
- "eval_ner_f1": 0.2857142857142857,
658
- "step": 800
659
- },
660
- {
661
- "eval_ner_precision": 0.16666666666666666,
662
- "step": 800
663
- },
664
- {
665
- "eval_ner_recall": 1.0,
666
- "step": 800
667
- },
668
- {
669
- "eval_ner_f1_commodity": 0.2857142857142857,
670
- "step": 800
671
- }
672
- ],
673
- "logging_steps": 10,
674
- "max_steps": 1086,
675
- "num_input_tokens_seen": 0,
676
- "num_train_epochs": 2,
677
- "save_steps": 200,
678
- "stateful_callbacks": {
679
- "TrainerControl": {
680
- "args": {
681
- "should_epoch_stop": false,
682
- "should_evaluate": false,
683
- "should_log": false,
684
- "should_save": true,
685
- "should_training_stop": false
686
- },
687
- "attributes": {}
688
- }
689
- },
690
- "total_flos": 1.254286399655854e+17,
691
- "train_batch_size": 2,
692
- "trial_name": null,
693
- "trial_params": null
694
- }
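The `eval_ner_*` entries in the deleted trainer state are internally consistent with the standard definition of F1 as the harmonic mean of precision and recall; a minimal sketch checking the two logged evaluations (the `f1` helper below is illustrative, not part of the training code):

```python
import math

def f1(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# step 600: precision = 1/9 ≈ 0.1111, recall = 1.0 → F1 ≈ 0.2
assert math.isclose(f1(1 / 9, 1.0), 0.2)

# step 800: precision = 1/6 ≈ 0.1667, recall = 1.0 → F1 ≈ 0.2857
assert math.isclose(f1(1 / 6, 1.0), 2 / 7)
```

Recall staying at 1.0 while precision climbs between checkpoints suggests the adapter over-predicts commodity entities and is gradually tightening, which matches the falling `eval_loss`.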
commodity_lora/checkpoint-800/training_args.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:e204976f075a0ecfdeaf56eff02c4003fc485d15458b651f4d8199e48cedadd5
- size 5713
commodity_lora/checkpoint-800/vocab.json DELETED
The diff for this file is too large to render. See raw diff