jacobpol committed on
Commit 9522cf5 · verified · 1 Parent(s): 51bf0d3

Delete person_lora/checkpoint-1224

person_lora/checkpoint-1224/README.md DELETED
@@ -1,207 +0,0 @@
- ---
- base_model: Qwen/Qwen3-4B-Instruct-2507
- library_name: peft
- pipeline_tag: text-generation
- tags:
- - base_model:adapter:Qwen/Qwen3-4B-Instruct-2507
- - lora
- - transformers
- ---
- 
- # Model Card for Model ID
- 
- <!-- Provide a quick summary of what the model is/does. -->
- 
- 
- 
- ## Model Details
- 
- ### Model Description
- 
- <!-- Provide a longer summary of what this model is. -->
- 
- 
- 
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
- 
- ### Model Sources [optional]
- 
- <!-- Provide the basic links for the model. -->
- 
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]
- 
- ## Uses
- 
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
- 
- ### Direct Use
- 
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
- 
- [More Information Needed]
- 
- ### Downstream Use [optional]
- 
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
- 
- [More Information Needed]
- 
- ### Out-of-Scope Use
- 
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
- 
- [More Information Needed]
- 
- ## Bias, Risks, and Limitations
- 
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
- 
- [More Information Needed]
- 
- ### Recommendations
- 
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
- 
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
- 
- ## How to Get Started with the Model
- 
- Use the code below to get started with the model.
- 
- [More Information Needed]
- 
- ## Training Details
- 
- ### Training Data
- 
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
- 
- [More Information Needed]
- 
- ### Training Procedure
- 
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
- 
- #### Preprocessing [optional]
- 
- [More Information Needed]
- 
- 
- #### Training Hyperparameters
- 
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
- 
- #### Speeds, Sizes, Times [optional]
- 
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
- 
- [More Information Needed]
- 
- ## Evaluation
- 
- <!-- This section describes the evaluation protocols and provides the results. -->
- 
- ### Testing Data, Factors & Metrics
- 
- #### Testing Data
- 
- <!-- This should link to a Dataset Card if possible. -->
- 
- [More Information Needed]
- 
- #### Factors
- 
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
- 
- [More Information Needed]
- 
- #### Metrics
- 
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
- 
- [More Information Needed]
- 
- ### Results
- 
- [More Information Needed]
- 
- #### Summary
- 
- 
- 
- ## Model Examination [optional]
- 
- <!-- Relevant interpretability work for the model goes here -->
- 
- [More Information Needed]
- 
- ## Environmental Impact
- 
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
- 
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- 
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
- 
- ## Technical Specifications [optional]
- 
- ### Model Architecture and Objective
- 
- [More Information Needed]
- 
- ### Compute Infrastructure
- 
- [More Information Needed]
- 
- #### Hardware
- 
- [More Information Needed]
- 
- #### Software
- 
- [More Information Needed]
- 
- ## Citation [optional]
- 
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
- 
- **BibTeX:**
- 
- [More Information Needed]
- 
- **APA:**
- 
- [More Information Needed]
- 
- ## Glossary [optional]
- 
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
- 
- [More Information Needed]
- 
- ## More Information [optional]
- 
- [More Information Needed]
- 
- ## Model Card Authors [optional]
- 
- [More Information Needed]
- 
- ## Model Card Contact
- 
- [More Information Needed]
- ### Framework versions
- 
- - PEFT 0.18.1

person_lora/checkpoint-1224/adapter_config.json DELETED
@@ -1,46 +0,0 @@
- {
-   "alora_invocation_tokens": null,
-   "alpha_pattern": {},
-   "arrow_config": null,
-   "auto_mapping": null,
-   "base_model_name_or_path": "Qwen/Qwen3-4B-Instruct-2507",
-   "bias": "none",
-   "corda_config": null,
-   "ensure_weight_tying": false,
-   "eva_config": null,
-   "exclude_modules": null,
-   "fan_in_fan_out": false,
-   "inference_mode": true,
-   "init_lora_weights": true,
-   "layer_replication": null,
-   "layers_pattern": null,
-   "layers_to_transform": null,
-   "loftq_config": {},
-   "lora_alpha": 16,
-   "lora_bias": false,
-   "lora_dropout": 0.05,
-   "megatron_config": null,
-   "megatron_core": "megatron.core",
-   "modules_to_save": null,
-   "peft_type": "LORA",
-   "peft_version": "0.18.1",
-   "qalora_group_size": 16,
-   "r": 64,
-   "rank_pattern": {},
-   "revision": null,
-   "target_modules": [
-     "o_proj",
-     "v_proj",
-     "q_proj",
-     "gate_proj",
-     "k_proj",
-     "up_proj",
-     "down_proj"
-   ],
-   "target_parameters": null,
-   "task_type": "CAUSAL_LM",
-   "trainable_token_indices": null,
-   "use_dora": false,
-   "use_qalora": false,
-   "use_rslora": false
- }

person_lora/checkpoint-1224/adapter_model.safetensors DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:9e462d70d1737df420c59a95a85eecb7a98820e8686f9394e7b35e1332b259cc
- size 528550256

person_lora/checkpoint-1224/added_tokens.json DELETED
@@ -1,28 +0,0 @@
- {
-   "</think>": 151668,
-   "</tool_call>": 151658,
-   "</tool_response>": 151666,
-   "<think>": 151667,
-   "<tool_call>": 151657,
-   "<tool_response>": 151665,
-   "<|box_end|>": 151649,
-   "<|box_start|>": 151648,
-   "<|endoftext|>": 151643,
-   "<|file_sep|>": 151664,
-   "<|fim_middle|>": 151660,
-   "<|fim_pad|>": 151662,
-   "<|fim_prefix|>": 151659,
-   "<|fim_suffix|>": 151661,
-   "<|im_end|>": 151645,
-   "<|im_start|>": 151644,
-   "<|image_pad|>": 151655,
-   "<|object_ref_end|>": 151647,
-   "<|object_ref_start|>": 151646,
-   "<|quad_end|>": 151651,
-   "<|quad_start|>": 151650,
-   "<|repo_name|>": 151663,
-   "<|video_pad|>": 151656,
-   "<|vision_end|>": 151653,
-   "<|vision_pad|>": 151654,
-   "<|vision_start|>": 151652
- }

person_lora/checkpoint-1224/merges.txt DELETED
The diff for this file is too large to render. See raw diff
 
person_lora/checkpoint-1224/optimizer.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:b9889e82104ffb5dbd1e64acdbdcb75b0ccf7d682b4fc7b15b57005d0a66d166
- size 1057390923

person_lora/checkpoint-1224/rng_state.pth DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:43ea5b16363545b09472729b7073fe5c4a5944f59878facbc92d227389518462
- size 14645

person_lora/checkpoint-1224/scheduler.pt DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:6a956ea83f52d68346214fb4c1400f4629bb44b9f3494ea151b52b5e43015afd
- size 1465

person_lora/checkpoint-1224/special_tokens_map.json DELETED
@@ -1,31 +0,0 @@
- {
-   "additional_special_tokens": [
-     "<|im_start|>",
-     "<|im_end|>",
-     "<|object_ref_start|>",
-     "<|object_ref_end|>",
-     "<|box_start|>",
-     "<|box_end|>",
-     "<|quad_start|>",
-     "<|quad_end|>",
-     "<|vision_start|>",
-     "<|vision_end|>",
-     "<|vision_pad|>",
-     "<|image_pad|>",
-     "<|video_pad|>"
-   ],
-   "eos_token": {
-     "content": "<|im_end|>",
-     "lstrip": false,
-     "normalized": false,
-     "rstrip": false,
-     "single_word": false
-   },
-   "pad_token": {
-     "content": "<|endoftext|>",
-     "lstrip": false,
-     "normalized": false,
-     "rstrip": false,
-     "single_word": false
-   }
- }

person_lora/checkpoint-1224/tokenizer.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:2f1298e298f2fe0059aba46f037697a339ccba45a1908780ce8ca14b45582f23
- size 11422753

person_lora/checkpoint-1224/tokenizer_config.json DELETED
@@ -1,240 +0,0 @@
- {
-   "add_bos_token": false,
-   "add_prefix_space": false,
-   "added_tokens_decoder": {
-     "151643": {
-       "content": "<|endoftext|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151644": {
-       "content": "<|im_start|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151645": {
-       "content": "<|im_end|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151646": {
-       "content": "<|object_ref_start|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151647": {
-       "content": "<|object_ref_end|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151648": {
-       "content": "<|box_start|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151649": {
-       "content": "<|box_end|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151650": {
-       "content": "<|quad_start|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151651": {
-       "content": "<|quad_end|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151652": {
-       "content": "<|vision_start|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151653": {
-       "content": "<|vision_end|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151654": {
-       "content": "<|vision_pad|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151655": {
-       "content": "<|image_pad|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151656": {
-       "content": "<|video_pad|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": true
-     },
-     "151657": {
-       "content": "<tool_call>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151658": {
-       "content": "</tool_call>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151659": {
-       "content": "<|fim_prefix|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151660": {
-       "content": "<|fim_middle|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151661": {
-       "content": "<|fim_suffix|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151662": {
-       "content": "<|fim_pad|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151663": {
-       "content": "<|repo_name|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151664": {
-       "content": "<|file_sep|>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151665": {
-       "content": "<tool_response>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151666": {
-       "content": "</tool_response>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151667": {
-       "content": "<think>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     },
-     "151668": {
-       "content": "</think>",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false,
-       "special": false
-     }
-   },
-   "additional_special_tokens": [
-     "<|im_start|>",
-     "<|im_end|>",
-     "<|object_ref_start|>",
-     "<|object_ref_end|>",
-     "<|box_start|>",
-     "<|box_end|>",
-     "<|quad_start|>",
-     "<|quad_end|>",
-     "<|vision_start|>",
-     "<|vision_end|>",
-     "<|vision_pad|>",
-     "<|image_pad|>",
-     "<|video_pad|>"
-   ],
-   "bos_token": null,
-   "chat_template": "{%- if tools %}\n    {{- '<|im_start|>system\\n' }}\n    {%- if messages[0].role == 'system' %}\n        {{- messages[0].content + '\\n\\n' }}\n    {%- endif %}\n    {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n    {%- for tool in tools %}\n        {{- \"\\n\" }}\n        {{- tool | tojson }}\n    {%- endfor %}\n    {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n    {%- if messages[0].role == 'system' %}\n        {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n    {%- endif %}\n{%- endif %}\n{%- for message in messages %}\n    {%- if message.content is string %}\n        {%- set content = message.content %}\n    {%- else %}\n        {%- set content = '' %}\n    {%- endif %}\n    {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n        {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n    {%- elif message.role == \"assistant\" %}\n        {{- '<|im_start|>' + message.role + '\\n' + content }}\n        {%- if message.tool_calls %}\n            {%- for tool_call in message.tool_calls %}\n                {%- if (loop.first and content) or (not loop.first) %}\n                    {{- '\\n' }}\n                {%- endif %}\n                {%- if tool_call.function %}\n                    {%- set tool_call = tool_call.function %}\n                {%- endif %}\n                {{- '<tool_call>\\n{\"name\": \"' }}\n                {{- tool_call.name }}\n                {{- '\", \"arguments\": ' }}\n                {%- if tool_call.arguments is string %}\n                    {{- tool_call.arguments }}\n                {%- else %}\n                    {{- tool_call.arguments | tojson }}\n                {%- endif %}\n                {{- '}\\n</tool_call>' }}\n            {%- endfor %}\n        {%- endif %}\n        {{- '<|im_end|>\\n' }}\n    {%- elif message.role == \"tool\" %}\n        {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n            {{- '<|im_start|>user' }}\n        {%- endif %}\n        {{- '\\n<tool_response>\\n' }}\n        {{- content }}\n        {{- '\\n</tool_response>' }}\n        {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n            {{- '<|im_end|>\\n' }}\n        {%- endif %}\n    {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n    {{- '<|im_start|>assistant\\n' }}\n{%- endif %}",
-   "clean_up_tokenization_spaces": false,
-   "eos_token": "<|im_end|>",
-   "errors": "replace",
-   "extra_special_tokens": {},
-   "model_max_length": 1010000,
-   "pad_token": "<|endoftext|>",
-   "split_special_tokens": false,
-   "tokenizer_class": "Qwen2Tokenizer",
-   "unk_token": null
- }

person_lora/checkpoint-1224/trainer_state.json DELETED
@@ -1,1036 +0,0 @@
- {
-   "best_global_step": null,
-   "best_metric": null,
-   "best_model_checkpoint": null,
-   "epoch": 4.0,
-   "eval_steps": 200,
-   "global_step": 1224,
-   "is_hyper_param_search": false,
-   "is_local_process_zero": true,
-   "is_world_process_zero": true,
-   "log_history": [
-     {
-       "epoch": 0.032679738562091505,
-       "grad_norm": 1.453621506690979,
-       "learning_rate": 4.8648648648648654e-05,
-       "loss": 0.6269,
-       "step": 10
-     },
-     {
-       "epoch": 0.06535947712418301,
-       "grad_norm": 0.7020823955535889,
-       "learning_rate": 0.0001027027027027027,
-       "loss": 0.2545,
-       "step": 20
-     },
-     {
-       "epoch": 0.09803921568627451,
-       "grad_norm": 0.2977686822414398,
-       "learning_rate": 0.00015675675675675676,
-       "loss": 0.1214,
-       "step": 30
-     },
-     {
-       "epoch": 0.13071895424836602,
-       "grad_norm": 0.11059613525867462,
-       "learning_rate": 0.00019999859903498856,
-       "loss": 0.1248,
-       "step": 40
-     },
-     {
-       "epoch": 0.16339869281045752,
-       "grad_norm": 0.20021583139896393,
-       "learning_rate": 0.00019994956938114075,
-       "loss": 0.1142,
-       "step": 50
-     },
-     {
-       "epoch": 0.19607843137254902,
-       "grad_norm": 0.21602283418178558,
-       "learning_rate": 0.00019983053072583596,
-       "loss": 0.1006,
-       "step": 60
-     },
-     {
-       "epoch": 0.22875816993464052,
-       "grad_norm": 0.1741652488708496,
-       "learning_rate": 0.00019964156644889706,
-       "loss": 0.098,
-       "step": 70
-     },
-     {
-       "epoch": 0.26143790849673204,
-       "grad_norm": 0.20578978955745697,
-       "learning_rate": 0.0001993828089090768,
-       "loss": 0.1017,
-       "step": 80
-     },
-     {
-       "epoch": 0.29411764705882354,
-       "grad_norm": 0.152154341340065,
-       "learning_rate": 0.00019905443935134791,
-       "loss": 0.0716,
-       "step": 90
-     },
-     {
-       "epoch": 0.32679738562091504,
-       "grad_norm": 0.20485801994800568,
-       "learning_rate": 0.00019865668777995147,
-       "loss": 0.0839,
-       "step": 100
-     },
-     {
-       "epoch": 0.35947712418300654,
-       "grad_norm": 0.3825615644454956,
-       "learning_rate": 0.0001981898327972918,
-       "loss": 0.0907,
-       "step": 110
-     },
-     {
-       "epoch": 0.39215686274509803,
-       "grad_norm": 0.34921425580978394,
-       "learning_rate": 0.00019765420140879135,
-       "loss": 0.0598,
-       "step": 120
-     },
-     {
-       "epoch": 0.42483660130718953,
-       "grad_norm": 0.3151702582836151,
-       "learning_rate": 0.00019705016879384201,
-       "loss": 0.0975,
-       "step": 130
-     },
-     {
-       "epoch": 0.45751633986928103,
-       "grad_norm": 0.1447100192308426,
-       "learning_rate": 0.00019637815804301315,
-       "loss": 0.0702,
-       "step": 140
-     },
-     {
-       "epoch": 0.49019607843137253,
-       "grad_norm": 0.09320889413356781,
-       "learning_rate": 0.00019563863986170077,
-       "loss": 0.0687,
-       "step": 150
-     },
-     {
-       "epoch": 0.5228758169934641,
-       "grad_norm": 0.3998013436794281,
-       "learning_rate": 0.000194832132240425,
-       "loss": 0.0715,
-       "step": 160
-     },
-     {
-       "epoch": 0.5555555555555556,
-       "grad_norm": 0.0680910125374794,
-       "learning_rate": 0.00019395920009200723,
-       "loss": 0.0739,
-       "step": 170
-     },
-     {
-       "epoch": 0.5882352941176471,
-       "grad_norm": 0.158641055226326,
-       "learning_rate": 0.00019302045485588068,
-       "loss": 0.0582,
-       "step": 180
-     },
-     {
-       "epoch": 0.6209150326797386,
-       "grad_norm": 0.1919500231742859,
-       "learning_rate": 0.00019201655406981164,
-       "loss": 0.0741,
-       "step": 190
-     },
-     {
-       "epoch": 0.6535947712418301,
-       "grad_norm": 0.18826788663864136,
-       "learning_rate": 0.00019094820090933195,
-       "loss": 0.0618,
-       "step": 200
-     },
-     {
-       "epoch": 0.6535947712418301,
-       "eval_loss": 0.09742313623428345,
-       "eval_runtime": 17.1018,
-       "eval_samples_per_second": 4.269,
-       "eval_steps_per_second": 2.164,
-       "step": 200
-     },
-     {
-       "eval_ner_f1": 0.0,
-       "step": 200
-     },
-     {
-       "eval_ner_precision": 0.0,
-       "step": 200
-     },
-     {
-       "eval_ner_recall": 0.0,
-       "step": 200
-     },
-     {
-       "eval_ner_f1_person": 0.0,
-       "step": 200
-     },
-     {
-       "epoch": 0.6862745098039216,
-       "grad_norm": 0.2360253632068634,
-       "learning_rate": 0.00018981614369520405,
-       "loss": 0.0579,
-       "step": 210
-     },
-     {
-       "epoch": 0.7189542483660131,
-       "grad_norm": 0.1330314427614212,
-       "learning_rate": 0.00018862117536926496,
-       "loss": 0.0675,
-       "step": 220
-     },
-     {
-       "epoch": 0.7516339869281046,
-       "grad_norm": 0.16936901211738586,
-       "learning_rate": 0.0001873641329390154,
-       "loss": 0.0529,
-       "step": 230
-     },
-     {
-       "epoch": 0.7843137254901961,
-       "grad_norm": 0.08676150441169739,
-       "learning_rate": 0.00018604589689134372,
-       "loss": 0.0678,
-       "step": 240
-     },
-     {
-       "epoch": 0.8169934640522876,
-       "grad_norm": 0.07660745829343796,
-       "learning_rate": 0.00018466739057579462,
-       "loss": 0.0669,
-       "step": 250
-     },
-     {
-       "epoch": 0.8496732026143791,
-       "grad_norm": 0.12025874108076096,
-       "learning_rate": 0.00018322957955781526,
-       "loss": 0.0744,
-       "step": 260
-     },
-     {
-       "epoch": 0.8823529411764706,
-       "grad_norm": 0.2834235429763794,
-       "learning_rate": 0.00018173347094243146,
-       "loss": 0.0576,
-       "step": 270
-     },
-     {
-       "epoch": 0.9150326797385621,
-       "grad_norm": 0.11870580911636353,
-       "learning_rate": 0.0001801801126688278,
-       "loss": 0.0632,
-       "step": 280
-     },
-     {
-       "epoch": 0.9477124183006536,
-       "grad_norm": 0.16607636213302612,
-       "learning_rate": 0.00017857059277632563,
-       "loss": 0.0608,
-       "step": 290
-     },
-     {
-       "epoch": 0.9803921568627451,
-       "grad_norm": 0.11665856838226318,
-       "learning_rate": 0.0001769060386422733,
-       "loss": 0.038,
-       "step": 300
-     },
-     {
-       "epoch": 1.0130718954248366,
-       "grad_norm": 0.11913657933473587,
-       "learning_rate": 0.00017518761619238234,
-       "loss": 0.0434,
-       "step": 310
-     },
-     {
-       "epoch": 1.0457516339869282,
-       "grad_norm": 0.07901415973901749,
-       "learning_rate": 0.0001734165290840626,
-       "loss": 0.0486,
-       "step": 320
-     },
-     {
-       "epoch": 1.0784313725490196,
-       "grad_norm": 0.17951519787311554,
-       "learning_rate": 0.00017159401786332864,
-       "loss": 0.0296,
-       "step": 330
-     },
-     {
-       "epoch": 1.1111111111111112,
-       "grad_norm": 0.2891804575920105,
-       "learning_rate": 0.00016972135909586742,
-       "loss": 0.0337,
-       "step": 340
-     },
-     {
-       "epoch": 1.1437908496732025,
-       "grad_norm": 0.14063680171966553,
-       "learning_rate": 0.00016779986447287677,
-       "loss": 0.0414,
-       "step": 350
-     },
-     {
-       "epoch": 1.1764705882352942,
-       "grad_norm": 0.07126651704311371,
-       "learning_rate": 0.00016583087989229997,
-       "loss": 0.0301,
-       "step": 360
-     },
-     {
-       "epoch": 1.2091503267973855,
-       "grad_norm": 0.10456930100917816,
-       "learning_rate": 0.00016381578451610062,
-       "loss": 0.028,
-       "step": 370
-     },
-     {
-       "epoch": 1.2418300653594772,
-       "grad_norm": 0.03307747840881348,
-       "learning_rate": 0.00016175598980423797,
-       "loss": 0.0391,
-       "step": 380
-     },
-     {
-       "epoch": 1.2745098039215685,
-       "grad_norm": 0.23047687113285065,
-       "learning_rate": 0.00015965293852601944,
-       "loss": 0.0546,
-       "step": 390
-     },
-     {
-       "epoch": 1.3071895424836601,
-       "grad_norm": 0.0811915472149849,
-       "learning_rate": 0.00015750810374952226,
-       "loss": 0.0404,
-       "step": 400
-     },
-     {
-       "epoch": 1.3071895424836601,
-       "eval_loss": 0.08959854394197464,
-       "eval_runtime": 17.3406,
-       "eval_samples_per_second": 4.21,
-       "eval_steps_per_second": 2.134,
-       "step": 400
-     },
-     {
-       "eval_ner_f1": 0.5098039215686274,
-       "step": 400
-     },
-     {
-       "eval_ner_precision": 0.65,
-       "step": 400
-     },
-     {
-       "eval_ner_recall": 0.41935483870967744,
-       "step": 400
-     },
-     {
-       "eval_ner_f1_commodity": 0.0,
-       "step": 400
-     },
-     {
-       "eval_ner_f1_person": 0.5306122448979592,
-       "step": 400
-     },
-     {
-       "epoch": 1.3398692810457518,
-       "grad_norm": 0.04820709675550461,
-       "learning_rate": 0.00015532298780979336,
-       "loss": 0.0393,
-       "step": 410
-     },
-     {
-       "epoch": 1.3725490196078431,
-       "grad_norm": 0.1625761240720749,
-       "learning_rate": 0.0001530991212565484,
-       "loss": 0.0389,
-       "step": 420
-     },
-     {
-       "epoch": 1.4052287581699345,
-       "grad_norm": 0.12835846841335297,
-       "learning_rate": 0.00015083806178210892,
-       "loss": 0.0282,
-       "step": 430
-     },
-     {
-       "epoch": 1.4379084967320261,
-       "grad_norm": 0.15881434082984924,
-       "learning_rate": 0.00014854139313032726,
-       "loss": 0.0606,
-       "step": 440
-     },
-     {
-       "epoch": 1.4705882352941178,
-       "grad_norm": 0.24892982840538025,
-       "learning_rate": 0.00014621072398726356,
-       "loss": 0.0376,
-       "step": 450
-     },
-     {
-       "epoch": 1.5032679738562091,
-       "grad_norm": 0.034560319036245346,
-       "learning_rate": 0.00014384768685439273,
-       "loss": 0.0282,
-       "step": 460
-     },
-     {
-       "epoch": 1.5359477124183005,
-       "grad_norm": 0.1848367601633072,
-       "learning_rate": 0.0001414539369051298,
-       "loss": 0.0411,
-       "step": 470
-     },
-     {
-       "epoch": 1.5686274509803921,
-       "grad_norm": 0.3837202489376068,
-       "learning_rate": 0.0001390311508254747,
-       "loss": 0.0341,
-       "step": 480
-     },
-     {
-       "epoch": 1.6013071895424837,
-       "grad_norm": 0.10172578692436218,
-       "learning_rate": 0.0001365810256395891,
-       "loss": 0.0325,
-       "step": 490
-     },
-     {
-       "epoch": 1.6339869281045751,
-       "grad_norm": 0.18793442845344543,
-       "learning_rate": 0.000134105277521127,
-       "loss": 0.0357,
-       "step": 500
-     },
-     {
-       "epoch": 1.6666666666666665,
-       "grad_norm": 0.15017642080783844,
-       "learning_rate": 0.0001316056405911527,
-       "loss": 0.0232,
-       "step": 510
-     },
-     {
-       "epoch": 1.6993464052287581,
-       "grad_norm": 0.06392411887645721,
-       "learning_rate": 0.0001290838657034874,
-       "loss": 0.029,
-       "step": 520
-     },
-     {
-       "epoch": 1.7320261437908497,
-       "grad_norm": 0.0888790488243103,
-       "learning_rate": 0.00012654171921833534,
-       "loss": 0.0428,
-       "step": 530
-     },
-     {
-       "epoch": 1.7647058823529411,
-       "grad_norm": 0.06441206485033035,
-       "learning_rate": 0.00012398098176504872,
-       "loss": 0.0336,
-       "step": 540
-     },
-     {
-       "epoch": 1.7973856209150327,
-       "grad_norm": 0.16848251223564148,
-       "learning_rate": 0.00012140344699489797,
-       "loss": 0.0511,
-       "step": 550
-     },
-     {
-       "epoch": 1.8300653594771243,
-       "grad_norm": 0.07297755777835846,
-       "learning_rate": 0.00011881092032472073,
-       "loss": 0.0284,
-       "step": 560
-     },
-     {
-       "epoch": 1.8627450980392157,
-       "grad_norm": 0.09729588031768799,
-       "learning_rate": 0.00011620521767232988,
-       "loss": 0.0406,
-       "step": 570
-     },
-     {
-       "epoch": 1.8954248366013071,
-       "grad_norm": 0.038536667823791504,
-       "learning_rate": 0.00011358816418456624,
-       "loss": 0.0354,
-       "step": 580
-     },
-     {
-       "epoch": 1.9281045751633987,
-       "grad_norm": 0.20749039947986603,
-       "learning_rate": 0.00011096159295888646,
-       "loss": 0.0241,
-       "step": 590
-     },
-     {
-       "epoch": 1.9607843137254903,
-       "grad_norm": 0.13462506234645844,
-       "learning_rate": 0.00010832734375938269,
-       "loss": 0.0291,
-       "step": 600
-     },
-     {
-       "epoch": 1.9607843137254903,
-       "eval_loss": 0.09001053869724274,
-       "eval_runtime": 17.3619,
-       "eval_samples_per_second": 4.205,
-       "eval_steps_per_second": 2.131,
-       "step": 600
-     },
-     {
-       "eval_ner_f1": 0.4090909090909091,
-       "step": 600
-     },
-     {
-       "eval_ner_precision": 0.6923076923076923,
-       "step": 600
-     },
-     {
-       "eval_ner_recall": 0.2903225806451613,
-       "step": 600
-     },
-     {
-       "eval_ner_f1_person": 0.4090909090909091,
-       "step": 600
-     },
-     {
-       "epoch": 1.9934640522875817,
-       "grad_norm": 0.15671618282794952,
-       "learning_rate": 0.00010568726172813193,
-       "loss": 0.0257,
-       "step": 610
-     },
-     {
-       "epoch": 2.026143790849673,
-       "grad_norm": 0.03700106590986252,
-       "learning_rate": 0.00010304319609277888,
-       "loss": 0.0175,
-       "step": 620
-     },
-     {
-       "epoch": 2.0588235294117645,
-       "grad_norm": 0.05351804941892624,
-       "learning_rate": 0.00010039699887125678,
-       "loss": 0.0206,
-       "step": 630
-     },
-     {
-       "epoch": 2.0915032679738563,
-       "grad_norm": 0.17226466536521912,
-       "learning_rate": 9.77505235745541e-05,
-       "loss": 0.0148,
-       "step": 640
-     },
-     {
-       "epoch": 2.1241830065359477,
-       "grad_norm": 0.03601714223623276,
-       "learning_rate": 9.510562390843513e-05,
-       "loss": 0.0171,
-       "step": 650
-     },
-     {
-       "epoch": 2.156862745098039,
-       "grad_norm": 0.034734755754470825,
-       "learning_rate": 9.246415247502437e-05,
-       "loss": 0.0264,
-       "step": 660
-     },
-     {
-       "epoch": 2.189542483660131,
-       "grad_norm": 0.023934612050652504,
-       "learning_rate": 8.982795947516392e-05,
-       "loss": 0.0227,
-       "step": 670
-     },
-     {
-       "epoch": 2.2222222222222223,
-       "grad_norm": 0.07543577253818512,
-       "learning_rate": 8.719889141245256e-05,
-       "loss": 0.0183,
-       "step": 680
-     },
-     {
-       "epoch": 2.2549019607843137,
-       "grad_norm": 0.04695666953921318,
-       "learning_rate": 8.457878979987507e-05,
-       "loss": 0.0119,
-       "step": 690
-     },
-     {
-       "epoch": 2.287581699346405,
-       "grad_norm": 0.03636680543422699,
-       "learning_rate": 8.196948986992666e-05,
-       "loss": 0.0151,
-       "step": 700
-     },
-     {
-       "epoch": 2.3202614379084965,
-       "grad_norm": 0.28800758719444275,
-       "learning_rate": 7.937281928913688e-05,
582
- "loss": 0.0282,
583
- "step": 710
584
- },
585
- {
586
- "epoch": 2.3529411764705883,
587
- "grad_norm": 0.2814720869064331,
588
- "learning_rate": 7.67905968778928e-05,
589
- "loss": 0.0107,
590
- "step": 720
591
- },
592
- {
593
- "epoch": 2.3856209150326797,
594
- "grad_norm": 0.06004492938518524,
595
- "learning_rate": 7.42246313364587e-05,
596
- "loss": 0.0141,
597
- "step": 730
598
- },
599
- {
600
- "epoch": 2.418300653594771,
601
- "grad_norm": 0.48177215456962585,
602
- "learning_rate": 7.167671997808405e-05,
603
- "loss": 0.0148,
604
- "step": 740
605
- },
606
- {
607
- "epoch": 2.450980392156863,
608
- "grad_norm": 0.07847360521554947,
609
- "learning_rate": 6.914864747008762e-05,
610
- "loss": 0.0153,
611
- "step": 750
612
- },
613
- {
614
- "epoch": 2.4836601307189543,
615
- "grad_norm": 0.03728066757321358,
616
- "learning_rate": 6.664218458379933e-05,
617
- "loss": 0.0212,
618
- "step": 760
619
- },
620
- {
621
- "epoch": 2.5163398692810457,
622
- "grad_norm": 0.055430226027965546,
623
- "learning_rate": 6.415908695423534e-05,
624
- "loss": 0.0195,
625
- "step": 770
626
- },
627
- {
628
- "epoch": 2.549019607843137,
629
- "grad_norm": 0.3753442168235779,
630
- "learning_rate": 6.170109385037545e-05,
631
- "loss": 0.0143,
632
- "step": 780
633
- },
634
- {
635
- "epoch": 2.581699346405229,
636
- "grad_norm": 0.11030200123786926,
637
- "learning_rate": 5.926992695690378e-05,
638
- "loss": 0.0148,
639
- "step": 790
640
- },
641
- {
642
- "epoch": 2.6143790849673203,
643
- "grad_norm": 0.07732052356004715,
644
- "learning_rate": 5.68672891682664e-05,
645
- "loss": 0.0204,
646
- "step": 800
647
- },
648
- {
649
- "epoch": 2.6143790849673203,
650
- "eval_loss": 0.09645482152700424,
651
- "eval_runtime": 16.7055,
652
- "eval_samples_per_second": 4.37,
653
- "eval_steps_per_second": 2.215,
654
- "step": 800
655
- },
656
- {
657
- "eval_ner_f1": 0.2564102564102564,
658
- "step": 800
659
- },
660
- {
661
- "eval_ner_precision": 0.625,
662
- "step": 800
663
- },
664
- {
665
- "eval_ner_recall": 0.16129032258064516,
666
- "step": 800
667
- },
668
- {
669
- "eval_ner_f1_person": 0.2564102564102564,
670
- "step": 800
671
- },
672
- {
673
- "epoch": 2.6470588235294117,
674
- "grad_norm": 0.040318384766578674,
675
- "learning_rate": 5.449486339589043e-05,
676
- "loss": 0.0199,
677
- "step": 810
678
- },
679
- {
680
- "epoch": 2.6797385620915035,
681
- "grad_norm": 0.11425317078828812,
682
- "learning_rate": 5.215431138939999e-05,
683
- "loss": 0.0215,
684
- "step": 820
685
- },
686
- {
687
- "epoch": 2.712418300653595,
688
- "grad_norm": 0.07913585007190704,
689
- "learning_rate": 4.984727257265509e-05,
690
- "loss": 0.0094,
691
- "step": 830
692
- },
693
- {
694
- "epoch": 2.7450980392156863,
695
- "grad_norm": 0.11035473644733429,
696
- "learning_rate": 4.757536289542798e-05,
697
- "loss": 0.0183,
698
- "step": 840
699
- },
700
- {
701
- "epoch": 2.7777777777777777,
702
- "grad_norm": 0.04927445575594902,
703
- "learning_rate": 4.534017370152218e-05,
704
- "loss": 0.0181,
705
- "step": 850
706
- },
707
- {
708
- "epoch": 2.810457516339869,
709
- "grad_norm": 0.043610814958810806,
710
- "learning_rate": 4.314327061412656e-05,
711
- "loss": 0.0123,
712
- "step": 860
713
- },
714
- {
715
- "epoch": 2.843137254901961,
716
- "grad_norm": 0.038343094289302826,
717
- "learning_rate": 4.0986192439184864e-05,
718
- "loss": 0.0119,
719
- "step": 870
720
- },
721
- {
722
- "epoch": 2.8758169934640523,
723
- "grad_norm": 0.13254104554653168,
724
- "learning_rate": 3.88704500875498e-05,
725
- "loss": 0.0144,
726
- "step": 880
727
- },
728
- {
729
- "epoch": 2.9084967320261437,
730
- "grad_norm": 0.13977399468421936,
731
- "learning_rate": 3.679752551667541e-05,
732
- "loss": 0.0085,
733
- "step": 890
734
- },
735
- {
736
- "epoch": 2.9411764705882355,
737
- "grad_norm": 0.020304501056671143,
738
- "learning_rate": 3.4768870692590147e-05,
739
- "loss": 0.016,
740
- "step": 900
741
- },
742
- {
743
- "epoch": 2.973856209150327,
744
- "grad_norm": 0.06801599264144897,
745
- "learning_rate": 3.278590657287713e-05,
746
- "loss": 0.0111,
747
- "step": 910
748
- },
749
- {
750
- "epoch": 3.0065359477124183,
751
- "grad_norm": 0.06201297789812088,
752
- "learning_rate": 3.08500221113738e-05,
753
- "loss": 0.014,
754
- "step": 920
755
- },
756
- {
757
- "epoch": 3.0392156862745097,
758
- "grad_norm": 0.04338189586997032,
759
- "learning_rate": 2.8962573285288695e-05,
760
- "loss": 0.0088,
761
- "step": 930
762
- },
763
- {
764
- "epoch": 3.0718954248366015,
765
- "grad_norm": 0.009708147495985031,
766
- "learning_rate": 2.712488214541642e-05,
767
- "loss": 0.0057,
768
- "step": 940
769
- },
770
- {
771
- "epoch": 3.104575163398693,
772
- "grad_norm": 0.04377015307545662,
773
- "learning_rate": 2.5338235890115902e-05,
774
- "loss": 0.011,
775
- "step": 950
776
- },
777
- {
778
- "epoch": 3.1372549019607843,
779
- "grad_norm": 0.09793521463871002,
780
- "learning_rate": 2.360388596370122e-05,
781
- "loss": 0.0112,
782
- "step": 960
783
- },
784
- {
785
- "epoch": 3.1699346405228757,
786
- "grad_norm": 0.01356339082121849,
787
- "learning_rate": 2.1923047179875654e-05,
788
- "loss": 0.0044,
789
- "step": 970
790
- },
791
- {
792
- "epoch": 3.2026143790849675,
793
- "grad_norm": 0.02072557434439659,
794
- "learning_rate": 2.0296896870823766e-05,
795
- "loss": 0.0055,
796
- "step": 980
797
- },
798
- {
799
- "epoch": 3.235294117647059,
800
- "grad_norm": 0.09715542942285538,
801
- "learning_rate": 1.8726574062557012e-05,
802
- "loss": 0.0105,
803
- "step": 990
804
- },
805
- {
806
- "epoch": 3.2679738562091503,
807
- "grad_norm": 0.03253450244665146,
808
- "learning_rate": 1.721317867709057e-05,
809
- "loss": 0.0024,
810
- "step": 1000
811
- },
812
- {
813
- "epoch": 3.2679738562091503,
814
- "eval_loss": 0.1047215685248375,
815
- "eval_runtime": 16.8699,
816
- "eval_samples_per_second": 4.327,
817
- "eval_steps_per_second": 2.193,
818
- "step": 1000
819
- },
820
- {
821
- "eval_ner_f1": 0.05714285714285715,
822
- "step": 1000
823
- },
824
- {
825
- "eval_ner_precision": 0.25,
826
- "step": 1000
827
- },
828
- {
829
- "eval_ner_recall": 0.03225806451612903,
830
- "step": 1000
831
- },
832
- {
833
- "eval_ner_f1_person": 0.05714285714285715,
834
- "step": 1000
835
- },
836
- {
837
- "epoch": 3.3006535947712417,
838
- "grad_norm": 0.09732872247695923,
839
- "learning_rate": 1.5757770762010438e-05,
840
- "loss": 0.0089,
841
- "step": 1010
842
- },
843
- {
844
- "epoch": 3.3333333333333335,
845
- "grad_norm": 0.032012488692998886,
846
- "learning_rate": 1.4361369747970311e-05,
847
- "loss": 0.0077,
848
- "step": 1020
849
- },
850
- {
851
- "epoch": 3.366013071895425,
852
- "grad_norm": 0.08777438849210739,
853
- "learning_rate": 1.3024953734638168e-05,
854
- "loss": 0.0027,
855
- "step": 1030
856
- },
857
- {
858
- "epoch": 3.3986928104575163,
859
- "grad_norm": 0.028469018638134003,
860
- "learning_rate": 1.1749458805592983e-05,
861
- "loss": 0.0077,
862
- "step": 1040
863
- },
864
- {
865
- "epoch": 3.431372549019608,
866
- "grad_norm": 0.06986037641763687,
867
- "learning_rate": 1.0535778372651317e-05,
868
- "loss": 0.0049,
869
- "step": 1050
870
- },
871
- {
872
- "epoch": 3.4640522875816995,
873
- "grad_norm": 0.008913267403841019,
874
- "learning_rate": 9.384762550083037e-06,
875
- "loss": 0.0079,
876
- "step": 1060
877
- },
878
- {
879
- "epoch": 3.496732026143791,
880
- "grad_norm": 0.0693899616599083,
881
- "learning_rate": 8.297217559154535e-06,
882
- "loss": 0.0049,
883
- "step": 1070
884
- },
885
- {
886
- "epoch": 3.5294117647058822,
887
- "grad_norm": 0.13099117577075958,
888
- "learning_rate": 7.273905163416395e-06,
889
- "loss": 0.0053,
890
- "step": 1080
891
- },
892
- {
893
- "epoch": 3.5620915032679736,
894
- "grad_norm": 0.009806470945477486,
895
- "learning_rate": 6.315542135131381e-06,
896
- "loss": 0.004,
897
- "step": 1090
898
- },
899
- {
900
- "epoch": 3.5947712418300655,
901
- "grad_norm": 0.18246600031852722,
902
- "learning_rate": 5.422799753216023e-06,
903
- "loss": 0.0116,
904
- "step": 1100
905
- },
906
- {
907
- "epoch": 3.627450980392157,
908
- "grad_norm": 0.013247921131551266,
909
- "learning_rate": 4.596303333047891e-06,
910
- "loss": 0.0069,
911
- "step": 1110
912
- },
913
- {
914
- "epoch": 3.6601307189542482,
915
- "grad_norm": 0.03266795352101326,
916
- "learning_rate": 3.836631788467671e-06,
917
- "loss": 0.0053,
918
- "step": 1120
919
- },
920
- {
921
- "epoch": 3.69281045751634,
922
- "grad_norm": 0.0331471785902977,
923
- "learning_rate": 3.1443172262828223e-06,
924
- "loss": 0.0031,
925
- "step": 1130
926
- },
927
- {
928
- "epoch": 3.7254901960784315,
929
- "grad_norm": 0.03891368955373764,
930
- "learning_rate": 2.519844573556984e-06,
931
- "loss": 0.0084,
932
- "step": 1140
933
- },
934
- {
935
- "epoch": 3.758169934640523,
936
- "grad_norm": 0.055439382791519165,
937
- "learning_rate": 1.963651237946107e-06,
938
- "loss": 0.0076,
939
- "step": 1150
940
- },
941
- {
942
- "epoch": 3.7908496732026142,
943
- "grad_norm": 0.04643230885267258,
944
- "learning_rate": 1.4761268013191553e-06,
945
- "loss": 0.0052,
946
- "step": 1160
947
- },
948
- {
949
- "epoch": 3.8235294117647056,
950
- "grad_norm": 0.006929585710167885,
951
- "learning_rate": 1.0576127468781783e-06,
952
- "loss": 0.0082,
953
- "step": 1170
954
- },
955
- {
956
- "epoch": 3.8562091503267975,
957
- "grad_norm": 0.06278964877128601,
958
- "learning_rate": 7.084022199686513e-07,
959
- "loss": 0.0071,
960
- "step": 1180
961
- },
962
- {
963
- "epoch": 3.888888888888889,
964
- "grad_norm": 0.04118943214416504,
965
- "learning_rate": 4.2873982274781453e-07,
966
- "loss": 0.0144,
967
- "step": 1190
968
- },
969
- {
970
- "epoch": 3.9215686274509802,
971
- "grad_norm": 0.014042374677956104,
972
- "learning_rate": 2.1882144285477746e-07,
973
- "loss": 0.0029,
974
- "step": 1200
975
- },
976
- {
977
- "epoch": 3.9215686274509802,
978
- "eval_loss": 0.10614956170320511,
979
- "eval_runtime": 18.3168,
980
- "eval_samples_per_second": 3.985,
981
- "eval_steps_per_second": 2.02,
982
- "step": 1200
983
- },
984
- {
985
- "eval_ner_f1": 0.05714285714285715,
986
- "step": 1200
987
- },
988
- {
989
- "eval_ner_precision": 0.25,
990
- "step": 1200
991
- },
992
- {
993
- "eval_ner_recall": 0.03225806451612903,
994
- "step": 1200
995
- },
996
- {
997
- "eval_ner_f1_person": 0.05714285714285715,
998
- "step": 1200
999
- },
1000
- {
1001
- "epoch": 3.954248366013072,
1002
- "grad_norm": 0.16615094244480133,
1003
- "learning_rate": 7.87941162023076e-08,
1004
- "loss": 0.0037,
1005
- "step": 1210
1006
- },
1007
- {
1008
- "epoch": 3.9869281045751634,
1009
- "grad_norm": 0.019119175150990486,
1010
- "learning_rate": 8.755923986480952e-09,
1011
- "loss": 0.0078,
1012
- "step": 1220
1013
- }
1014
- ],
1015
- "logging_steps": 10,
1016
- "max_steps": 1224,
1017
- "num_input_tokens_seen": 0,
1018
- "num_train_epochs": 4,
1019
- "save_steps": 200,
1020
- "stateful_callbacks": {
1021
- "TrainerControl": {
1022
- "args": {
1023
- "should_epoch_stop": false,
1024
- "should_evaluate": false,
1025
- "should_log": false,
1026
- "should_save": true,
1027
- "should_training_stop": true
1028
- },
1029
- "attributes": {}
1030
- }
1031
- },
1032
- "total_flos": 2.033937038405929e+17,
1033
- "train_batch_size": 2,
1034
- "trial_name": null,
1035
- "trial_params": null
1036
- }
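The deleted log entries above record `eval_ner_precision`, `eval_ner_recall`, and `eval_ner_f1` as separate items at each eval step. As a sanity check on the logged values (a minimal sketch using only numbers copied from the entries above; the helper name `f1` is ours), the F1 figures match the harmonic mean of the logged precision and recall:

```python
# Sanity check: each logged eval_ner_f1 should equal the harmonic
# mean of the logged eval_ner_precision and eval_ner_recall.

def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, logged F1) copied from the eval entries above.
logged = [
    (0.6923076923076923, 0.2903225806451613, 0.4090909090909091),   # step 600
    (0.625, 0.16129032258064516, 0.2564102564102564),               # step 800
    (0.25, 0.03225806451612903, 0.05714285714285715),               # steps 1000/1200
]

for p, r, expected in logged:
    assert abs(f1(p, r) - expected) < 1e-12
```

Note that while training loss keeps falling (0.029 at step 520 down to ~0.004-0.008 by step 1200), `eval_loss` rises and `eval_ner_f1` drops from 0.41 to 0.057 over the same span, so the later checkpoints in this deleted run were overfitting.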
person_lora/checkpoint-1224/training_args.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:822f3da171fb89395f6d9447d0736b055ce3407cae8df6d7d4b3e34103c10c62
- size 5713
person_lora/checkpoint-1224/vocab.json DELETED
The diff for this file is too large to render. See raw diff