lucianodelcorro committed on Oct 18, 2024

Commit

a4981b5

1 Parent(s): cd6606b

Upload LoRa adapters

This view is limited to 50 files because the commit contains too many changes.

Files changed (50)

fine-tuned-model_full/README.md +202 -0
fine-tuned-model_full/adapter_config.json +31 -0
fine-tuned-model_full/adapter_model.safetensors +3 -0
fine-tuned-model_full/special_tokens_map.json +24 -0
fine-tuned-model_full/tokenizer.model +3 -0
fine-tuned-model_full/tokenizer_config.json +0 -0
fine-tuned-model_full/training_args.bin +3 -0
full_final-tesis/checkpoint-1000/README.md +202 -0
full_final-tesis/checkpoint-1000/adapter_config.json +31 -0
full_final-tesis/checkpoint-1000/adapter_model.safetensors +3 -0
full_final-tesis/checkpoint-1000/optimizer.pt +3 -0
full_final-tesis/checkpoint-1000/rng_state.pth +3 -0
full_final-tesis/checkpoint-1000/scheduler.pt +3 -0
full_final-tesis/checkpoint-1000/trainer_state.json +103 -0
full_final-tesis/checkpoint-1000/training_args.bin +3 -0
full_final-tesis/checkpoint-10000/README.md +202 -0
full_final-tesis/checkpoint-10000/adapter_config.json +31 -0
full_final-tesis/checkpoint-10000/adapter_model.safetensors +3 -0
full_final-tesis/checkpoint-10000/optimizer.pt +3 -0
full_final-tesis/checkpoint-10000/rng_state.pth +3 -0
full_final-tesis/checkpoint-10000/scheduler.pt +3 -0
full_final-tesis/checkpoint-10000/trainer_state.json +741 -0
full_final-tesis/checkpoint-10000/training_args.bin +3 -0
full_final-tesis/checkpoint-11000/README.md +202 -0
full_final-tesis/checkpoint-11000/adapter_config.json +31 -0
full_final-tesis/checkpoint-11000/adapter_model.safetensors +3 -0
full_final-tesis/checkpoint-11000/optimizer.pt +3 -0
full_final-tesis/checkpoint-11000/rng_state.pth +3 -0
full_final-tesis/checkpoint-11000/scheduler.pt +3 -0
full_final-tesis/checkpoint-11000/trainer_state.json +811 -0
full_final-tesis/checkpoint-11000/training_args.bin +3 -0
full_final-tesis/checkpoint-12000/README.md +202 -0
full_final-tesis/checkpoint-12000/adapter_config.json +31 -0
full_final-tesis/checkpoint-12000/adapter_model.safetensors +3 -0
full_final-tesis/checkpoint-12000/optimizer.pt +3 -0
full_final-tesis/checkpoint-12000/rng_state.pth +3 -0
full_final-tesis/checkpoint-12000/scheduler.pt +3 -0
full_final-tesis/checkpoint-12000/trainer_state.json +881 -0
full_final-tesis/checkpoint-12000/training_args.bin +3 -0
full_final-tesis/checkpoint-13000/README.md +202 -0
full_final-tesis/checkpoint-13000/adapter_config.json +31 -0
full_final-tesis/checkpoint-13000/adapter_model.safetensors +3 -0
full_final-tesis/checkpoint-13000/optimizer.pt +3 -0
full_final-tesis/checkpoint-13000/rng_state.pth +3 -0
full_final-tesis/checkpoint-13000/scheduler.pt +3 -0
full_final-tesis/checkpoint-13000/trainer_state.json +951 -0
full_final-tesis/checkpoint-13000/training_args.bin +3 -0
full_final-tesis/checkpoint-14000/README.md +202 -0
full_final-tesis/checkpoint-14000/adapter_config.json +31 -0
full_final-tesis/checkpoint-14000/adapter_model.safetensors +3 -0

fine-tuned-model_full/README.md ADDED

	@@ -0,0 +1,202 @@

+---
+base_model: mistralai/Mistral-7B-Instruct-v0.3
+library_name: peft
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.12.1.dev0

fine-tuned-model_full/adapter_config.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-Instruct-v0.3",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_dropout": 0.1,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "q_proj",
+    "v_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "use_dora": false,
+  "use_rslora": false
+}
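
The `lora_alpha`, `r`, and `use_rslora` fields above jointly determine how strongly the adapter update is scaled. A minimal sketch, assuming the usual LoRA convention in which the low-rank update is multiplied by `lora_alpha / r` (or `lora_alpha / sqrt(r)` when rank-stabilized LoRA is enabled). The helper name `lora_scaling` is hypothetical and not part of this commit:

```python
import math

# Subset of the fields from adapter_config.json shown above.
adapter_config = {"lora_alpha": 16, "r": 64, "use_rslora": False}


def lora_scaling(config: dict) -> float:
    """Return the multiplier applied to the low-rank update (hypothetical helper)."""
    if config.get("use_rslora", False):
        # Rank-stabilized variant: divide by sqrt(r) instead of r.
        return config["lora_alpha"] / math.sqrt(config["r"])
    return config["lora_alpha"] / config["r"]


scaling = lora_scaling(adapter_config)
print(scaling)  # 0.25
```

With `lora_alpha` = 16 and `r` = 64 the update is scaled by 0.25, so the adapter contributes a comparatively damped correction to each targeted projection.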

fine-tuned-model_full/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b341c2afff7f043fbceac15611a66fef95529761d6a43b5e8c26912acb72334f
+size 218138576
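
The pointer's `size` can be sanity-checked against the adapter configuration. The sketch below is an estimate under stated assumptions, not something the commit confirms: it assumes the Mistral-7B shapes (hidden size 4096, 32 decoder layers, key/value projections of width 1024) and float32 storage. Each targeted linear layer with input width `d_in` and output width `d_out` adds `r * (d_in + d_out)` LoRA parameters.

```python
R = 64                 # "r" from adapter_config.json
NUM_LAYERS = 32        # assumption: Mistral-7B decoder depth
HIDDEN = 4096          # assumption: Mistral-7B hidden size
KV_DIM = 1024          # assumption: 8 key/value heads of dimension 128

# (input width, output width) for each module in "target_modules".
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_param_count(r: int, num_layers: int, targets: dict) -> int:
    """Count LoRA parameters: each module adds A (r x d_in) and B (d_out x r)."""
    per_layer = sum(r * (d_in + d_out) for d_in, d_out in targets.values())
    return per_layer * num_layers


params = lora_param_count(R, NUM_LAYERS, TARGETS)
payload_bytes = params * 4          # assumption: float32, 4 bytes per value
reported_size = 218138576           # "size" from the LFS pointer above

print(params)                         # 54525952
print(payload_bytes)                  # 218103808
print(reported_size - payload_bytes)  # 34768
```

Under these assumptions the tensor payload accounts for all but 34,768 bytes of the reported size; the remainder is plausibly the safetensors header, though the stored dtype is not stated anywhere in the diff.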

fine-tuned-model_full/special_tokens_map.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "bos_token": {
+    "content": "<s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "eos_token": {
+    "content": "</s>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
+  "pad_token": "</s>",
+  "unk_token": {
+    "content": "<unk>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  }
+}
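
Because `pad_token` is the same string as `eos_token` (`</s>`), padding cannot be told apart from a genuine end-of-sequence token by token id alone. One common way to handle this when building training labels is to mask by attention mask rather than by id. The sketch below illustrates that idea only; the commit does not show how labels were actually built, the token ids are made up, and `build_labels` is a hypothetical helper. The value -100 is the index ignored by PyTorch's cross-entropy loss.

```python
IGNORE_INDEX = -100  # positions with this label do not contribute to the loss


def build_labels(input_ids: list, attention_mask: list) -> list:
    """Keep real tokens (including a real EOS) and ignore padded positions."""
    return [
        token if mask == 1 else IGNORE_INDEX
        for token, mask in zip(input_ids, attention_mask)
    ]


# Illustrative ids only: the fourth token is a real EOS, the last two are padding
# that reuses the same id because pad_token == eos_token.
input_ids = [1, 733, 4138, 2, 2, 2]
attention_mask = [1, 1, 1, 1, 0, 0]

labels = build_labels(input_ids, attention_mask)
print(labels)  # [1, 733, 4138, 2, -100, -100]
```

Masking by id instead would also drop the real EOS at position four, which can leave a fine-tuned model less inclined to stop generating.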

fine-tuned-model_full/tokenizer.model ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:37f00374dea48658ee8f5d0f21895b9bc55cb0103939607c8185bfd1c6ca1f89
+size 587404
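
The three `+` lines above are a Git LFS pointer rather than the file itself: a spec version, a content hash, and a byte count. A minimal sketch of reading such a pointer from diff text follows; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer, tolerating a leading '+'."""
    fields = {}
    for raw_line in text.splitlines():
        line = raw_line.lstrip("+").strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer_text = """\
+version https://git-lfs.github.com/spec/v1
+oid sha256:37f00374dea48658ee8f5d0f21895b9bc55cb0103939607c8185bfd1c6ca1f89
+size 587404
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algorithm"], pointer["size"])  # sha256 587404
```

Comparing `digest` values across files is a quick way to spot identical blobs; for example, the two `training_args.bin` pointers shown in this commit carry the same `oid`.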

fine-tuned-model_full/tokenizer_config.json ADDED

The diff for this file is too large to render.

fine-tuned-model_full/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:58a8dc2e9cd30fabd0bf191d8b86269c1d642655415287312de73d5ceda9dace
+size 5176

full_final-tesis/checkpoint-1000/README.md ADDED

	@@ -0,0 +1,202 @@

+---
+base_model: mistralai/Mistral-7B-Instruct-v0.3
+library_name: peft
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.12.1.dev0

full_final-tesis/checkpoint-1000/adapter_config.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-Instruct-v0.3",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_dropout": 0.1,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "q_proj",
+    "v_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "use_dora": false,
+  "use_rslora": false
+}

full_final-tesis/checkpoint-1000/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:94bc318eefca0c521da287658c012c9e25046e90f41f60352af2f52d2bcf1e5c
+size 218138576

full_final-tesis/checkpoint-1000/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7046a34e49718c1a343b0562f9bb8e3ec1aca3fedd4779c8245d1e0b043a5896
+size 109570132

full_final-tesis/checkpoint-1000/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c93e4cee2e47d9e8839562818f8f565ae7c9f4f4a66d703680c357470a08df2
+size 14244

full_final-tesis/checkpoint-1000/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:088c4075160551b6162ebaa297c626ab21346432717fbf07b1bd5ea438049151
+size 1064

full_final-tesis/checkpoint-1000/trainer_state.json ADDED

	@@ -0,0 +1,103 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.1137182062848262,
+  "eval_steps": 500,
+  "global_step": 1000,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.01137182062848262,
+      "grad_norm": 0.28454989194869995,
+      "learning_rate": 1.5e-05,
+      "loss": 0.8172,
+      "step": 100
+    },
+    {
+      "epoch": 0.02274364125696524,
+      "grad_norm": 0.3702525198459625,
+      "learning_rate": 1.499946406987345e-05,
+      "loss": 0.3711,
+      "step": 200
+    },
+    {
+      "epoch": 0.03411546188544786,
+      "grad_norm": 0.42058178782463074,
+      "learning_rate": 1.4997856356086094e-05,
+      "loss": 0.3401,
+      "step": 300
+    },
+    {
+      "epoch": 0.04548728251393048,
+      "grad_norm": 0.4536089599132538,
+      "learning_rate": 1.4995177088403865e-05,
+      "loss": 0.3276,
+      "step": 400
+    },
+    {
+      "epoch": 0.0568591031424131,
+      "grad_norm": 0.3354353904724121,
+      "learning_rate": 1.4991426649733503e-05,
+      "loss": 0.3191,
+      "step": 500
+    },
+    {
+      "epoch": 0.06823092377089572,
+      "grad_norm": 0.3974071741104126,
+      "learning_rate": 1.4986605576067824e-05,
+      "loss": 0.3138,
+      "step": 600
+    },
+    {
+      "epoch": 0.07960274439937834,
+      "grad_norm": 0.3995005786418915,
+      "learning_rate": 1.4980714556409132e-05,
+      "loss": 0.3027,
+      "step": 700
+    },
+    {
+      "epoch": 0.09097456502786096,
+      "grad_norm": 0.35224294662475586,
+      "learning_rate": 1.4973754432670731e-05,
+      "loss": 0.2784,
+      "step": 800
+    },
+    {
+      "epoch": 0.10234638565634357,
+      "grad_norm": 0.28719601035118103,
+      "learning_rate": 1.4965726199556621e-05,
+      "loss": 0.2737,
+      "step": 900
+    },
+    {
+      "epoch": 0.1137182062848262,
+      "grad_norm": 0.32742446660995483,
+      "learning_rate": 1.4956631004419335e-05,
+      "loss": 0.2722,
+      "step": 1000
+    }
+  ],
+  "logging_steps": 100,
+  "max_steps": 26379,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 3,
+  "save_steps": 1000,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 5.677318766592e+17,
+  "train_batch_size": 11,
+  "trial_name": null,
+  "trial_params": null
+}
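
The `learning_rate` values in `log_history` are consistent with a cosine decay from a peak of 1.5e-05 over `max_steps` = 26379 after 100 warmup steps. That schedule is an inference from the logged numbers, not something the diff states (the actual settings live in `training_args.bin`), so the constants below are assumptions. A short sketch that reproduces three of the logged values:

```python
import math

PEAK_LR = 1.5e-05     # learning_rate logged at step 100
WARMUP_STEPS = 100    # assumption inferred from the log, not stated in the diff
MAX_STEPS = 26379     # "max_steps" from trainer_state.json


def cosine_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then half-cosine decay to zero at MAX_STEPS."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, MAX_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values copied from log_history above.
logged = {
    100: 1.5e-05,
    200: 1.499946406987345e-05,
    1000: 1.4956631004419335e-05,
}

for step, lr in logged.items():
    print(step, lr, cosine_lr(step))
```

If the reconstructed and logged columns agree, the same function can be used to estimate the learning rate at any later checkpoint without opening its `scheduler.pt`.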

full_final-tesis/checkpoint-1000/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:58a8dc2e9cd30fabd0bf191d8b86269c1d642655415287312de73d5ceda9dace
+size 5176

full_final-tesis/checkpoint-10000/README.md ADDED

	@@ -0,0 +1,202 @@

+---
+base_model: mistralai/Mistral-7B-Instruct-v0.3
+library_name: peft
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.12.1.dev0

full_final-tesis/checkpoint-10000/adapter_config.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-Instruct-v0.3",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_dropout": 0.1,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "q_proj",
+    "v_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "use_dora": false,
+  "use_rslora": false
+}

full_final-tesis/checkpoint-10000/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:97524790c360feec7a8b1cb4899a87d378fb04046caafb47ebcd0e35b087d771
+size 218138576

full_final-tesis/checkpoint-10000/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2024f8f84c757ea4ab3418ff1f36c8e7ac54cdd0b31a83072d11c5bd20d7aa02
+size 109570132

full_final-tesis/checkpoint-10000/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:63d45f29e7850d2bc663b2a65f53ccac45f1f37ecfacd2a33e0596e57f96894c
+size 14244

full_final-tesis/checkpoint-10000/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c4ca5506a9c0540d488239b240f97d144a8133bc40f0c45b0c85cf908430c716
+size 1064

full_final-tesis/checkpoint-10000/trainer_state.json ADDED

	@@ -0,0 +1,741 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 1.137182062848262,
+  "eval_steps": 500,
+  "global_step": 10000,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.01137182062848262,
+      "grad_norm": 0.28454989194869995,
+      "learning_rate": 1.5e-05,
+      "loss": 0.8172,
+      "step": 100
+    },
+    {
+      "epoch": 0.02274364125696524,
+      "grad_norm": 0.3702525198459625,
+      "learning_rate": 1.499946406987345e-05,
+      "loss": 0.3711,
+      "step": 200
+    },
+    {
+      "epoch": 0.03411546188544786,
+      "grad_norm": 0.42058178782463074,
+      "learning_rate": 1.4997856356086094e-05,
+      "loss": 0.3401,
+      "step": 300
+    },
+    {
+      "epoch": 0.04548728251393048,
+      "grad_norm": 0.4536089599132538,
+      "learning_rate": 1.4995177088403865e-05,
+      "loss": 0.3276,
+      "step": 400
+    },
+    {
+      "epoch": 0.0568591031424131,
+      "grad_norm": 0.3354353904724121,
+      "learning_rate": 1.4991426649733503e-05,
+      "loss": 0.3191,
+      "step": 500
+    },
+    {
+      "epoch": 0.06823092377089572,
+      "grad_norm": 0.3974071741104126,
+      "learning_rate": 1.4986605576067824e-05,
+      "loss": 0.3138,
+      "step": 600
+    },
+    {
+      "epoch": 0.07960274439937834,
+      "grad_norm": 0.3995005786418915,
+      "learning_rate": 1.4980714556409132e-05,
+      "loss": 0.3027,
+      "step": 700
+    },
+    {
+      "epoch": 0.09097456502786096,
+      "grad_norm": 0.35224294662475586,
+      "learning_rate": 1.4973754432670731e-05,
+      "loss": 0.2784,
+      "step": 800
+    },
+    {
+      "epoch": 0.10234638565634357,
+      "grad_norm": 0.28719601035118103,
+      "learning_rate": 1.4965726199556621e-05,
+      "loss": 0.2737,
+      "step": 900
+    },
+    {
+      "epoch": 0.1137182062848262,
+      "grad_norm": 0.32742446660995483,
+      "learning_rate": 1.4956631004419335e-05,
+      "loss": 0.2722,
+      "step": 1000
+    },
+    {
+      "epoch": 0.1250900269133088,
+      "grad_norm": 0.44312119483947754,
+      "learning_rate": 1.4946470147095961e-05,
+      "loss": 0.2709,
+      "step": 1100
+    },
+    {
+      "epoch": 0.13646184754179144,
+      "grad_norm": 0.40551960468292236,
+      "learning_rate": 1.4935245079722374e-05,
+      "loss": 0.2683,
+      "step": 1200
+    },
+    {
+      "epoch": 0.14783366817027407,
+      "grad_norm": 0.38514164090156555,
+      "learning_rate": 1.4922957406525721e-05,
+      "loss": 0.2667,
+      "step": 1300
+    },
+    {
+      "epoch": 0.15920548879875668,
+      "grad_norm": 0.4830244481563568,
+      "learning_rate": 1.4909608883595135e-05,
+      "loss": 0.2644,
+      "step": 1400
+    },
+    {
+      "epoch": 0.1705773094272393,
+      "grad_norm": 0.5327072739601135,
+      "learning_rate": 1.489520141863077e-05,
+      "loss": 0.265,
+      "step": 1500
+    },
+    {
+      "epoch": 0.1819491300557219,
+      "grad_norm": 0.3729603588581085,
+      "learning_rate": 1.4879737070671164e-05,
+      "loss": 0.2625,
+      "step": 1600
+    },
+    {
+      "epoch": 0.19332095068420455,
+      "grad_norm": 0.3948141634464264,
+      "learning_rate": 1.4863218049798972e-05,
+      "loss": 0.2632,
+      "step": 1700
+    },
+    {
+      "epoch": 0.20469277131268715,
+      "grad_norm": 0.42538341879844666,
+      "learning_rate": 1.4845646716825118e-05,
+      "loss": 0.2628,
+      "step": 1800
+    },
+    {
+      "epoch": 0.21606459194116978,
+      "grad_norm": 0.39089226722717285,
+      "learning_rate": 1.4827025582951387e-05,
+      "loss": 0.2604,
+      "step": 1900
+    },
+    {
+      "epoch": 0.2274364125696524,
+      "grad_norm": 0.44530946016311646,
+      "learning_rate": 1.4807357309411546e-05,
+      "loss": 0.2595,
+      "step": 2000
+    },
+    {
+      "epoch": 0.23880823319813502,
+      "grad_norm": 0.5049150586128235,
+      "learning_rate": 1.4786644707091018e-05,
+      "loss": 0.2602,
+      "step": 2100
+    },
+    {
+      "epoch": 0.2501800538266176,
+      "grad_norm": 0.4439440369606018,
+      "learning_rate": 1.4764890736125158e-05,
+      "loss": 0.2591,
+      "step": 2200
+    },
+    {
+      "epoch": 0.26155187445510025,
+      "grad_norm": 0.3611554801464081,
+      "learning_rate": 1.4742098505476209e-05,
+      "loss": 0.2577,
+      "step": 2300
+    },
+    {
+      "epoch": 0.2729236950835829,
+      "grad_norm": 0.5148597359657288,
+      "learning_rate": 1.4718271272488986e-05,
+      "loss": 0.2572,
+      "step": 2400
+    },
+    {
+      "epoch": 0.2842955157120655,
+      "grad_norm": 0.4029059410095215,
+      "learning_rate": 1.4693412442425354e-05,
+      "loss": 0.2572,
+      "step": 2500
+    },
+    {
+      "epoch": 0.29566733634054815,
+      "grad_norm": 0.3884616196155548,
+      "learning_rate": 1.4667525567977561e-05,
+      "loss": 0.2555,
+      "step": 2600
+    },
+    {
+      "epoch": 0.3070391569690307,
+      "grad_norm": 0.32239723205566406,
+      "learning_rate": 1.4640614348760517e-05,
+      "loss": 0.2551,
+      "step": 2700
+    },
+    {
+      "epoch": 0.31841097759751336,
+      "grad_norm": 0.36961373686790466,
+      "learning_rate": 1.4612682630783053e-05,
+      "loss": 0.256,
+      "step": 2800
+    },
+    {
+      "epoch": 0.329782798225996,
+      "grad_norm": 0.44517841935157776,
+      "learning_rate": 1.4583734405898277e-05,
+      "loss": 0.2548,
+      "step": 2900
+    },
+    {
+      "epoch": 0.3411546188544786,
+      "grad_norm": 0.36591875553131104,
+      "learning_rate": 1.4553773811233073e-05,
+      "loss": 0.2544,
+      "step": 3000
+    },
+    {
+      "epoch": 0.3525264394829612,
+      "grad_norm": 0.39926204085350037,
+      "learning_rate": 1.4522805128596852e-05,
+      "loss": 0.2551,
+      "step": 3100
+    },
+    {
+      "epoch": 0.3638982601114438,
+      "grad_norm": 0.4806652367115021,
+      "learning_rate": 1.4490832783869617e-05,
+      "loss": 0.2539,
+      "step": 3200
+    },
+    {
+      "epoch": 0.37527008073992646,
+      "grad_norm": 0.37937217950820923,
+      "learning_rate": 1.4457861346369439e-05,
+      "loss": 0.2549,
+      "step": 3300
+    },
+    {
+      "epoch": 0.3866419013684091,
+      "grad_norm": 0.32821956276893616,
+      "learning_rate": 1.4423895528199423e-05,
+      "loss": 0.2531,
+      "step": 3400
+    },
+    {
+      "epoch": 0.3980137219968917,
+      "grad_norm": 0.4085944890975952,
+      "learning_rate": 1.4388940183574303e-05,
+      "loss": 0.2522,
+      "step": 3500
+    },
+    {
+      "epoch": 0.4093855426253743,
+      "grad_norm": 0.42145103216171265,
+      "learning_rate": 1.4353000308126683e-05,
+      "loss": 0.2525,
+      "step": 3600
+    },
+    {
+      "epoch": 0.42075736325385693,
+      "grad_norm": 0.39684563875198364,
+      "learning_rate": 1.4316081038193093e-05,
+      "loss": 0.2512,
+      "step": 3700
+    },
+    {
+      "epoch": 0.43212918388233956,
+      "grad_norm": 0.40946340560913086,
+      "learning_rate": 1.4278187650079938e-05,
+      "loss": 0.2513,
+      "step": 3800
+    },
+    {
+      "epoch": 0.4435010045108222,
+      "grad_norm": 0.36055612564086914,
+      "learning_rate": 1.4239325559309426e-05,
+      "loss": 0.2508,
+      "step": 3900
+    },
+    {
+      "epoch": 0.4548728251393048,
+      "grad_norm": 0.33412662148475647,
+      "learning_rate": 1.4199500319845618e-05,
+      "loss": 0.2521,
+      "step": 4000
+    },
+    {
+      "epoch": 0.4662446457677874,
+      "grad_norm": 0.435330867767334,
+      "learning_rate": 1.415871762330068e-05,
+      "loss": 0.2502,
+      "step": 4100
+    },
+    {
+      "epoch": 0.47761646639627003,
+      "grad_norm": 0.349936306476593,
+      "learning_rate": 1.4116983298121471e-05,
+      "loss": 0.25,
+      "step": 4200
+    },
+    {
+      "epoch": 0.48898828702475267,
+      "grad_norm": 0.4145605266094208,
+      "learning_rate": 1.407430330875657e-05,
+      "loss": 0.2486,
+      "step": 4300
+    },
+    {
+      "epoch": 0.5003601076532352,
+      "grad_norm": 0.37989580631256104,
+      "learning_rate": 1.4030683754803873e-05,
+      "loss": 0.2493,
+      "step": 4400
+    },
+    {
+      "epoch": 0.5117319282817179,
+      "grad_norm": 0.31419941782951355,
+      "learning_rate": 1.3986130870138861e-05,
+      "loss": 0.249,
+      "step": 4500
+    },
+    {
+      "epoch": 0.5231037489102005,
+      "grad_norm": 0.34506115317344666,
+      "learning_rate": 1.3940651022023705e-05,
+      "loss": 0.2505,
+      "step": 4600
+    },
+    {
+      "epoch": 0.5344755695386831,
+      "grad_norm": 0.3678961396217346,
+      "learning_rate": 1.3894250710197268e-05,
+      "loss": 0.2478,
+      "step": 4700
+    },
+    {
+      "epoch": 0.5458473901671658,
+      "grad_norm": 0.3394433557987213,
+      "learning_rate": 1.3846936565946217e-05,
+      "loss": 0.2486,
+      "step": 4800
+    },
+    {
+      "epoch": 0.5572192107956484,
+      "grad_norm": 0.3915737569332123,
+      "learning_rate": 1.3798715351157302e-05,
+      "loss": 0.2478,
+      "step": 4900
+    },
+    {
+      "epoch": 0.568591031424131,
+      "grad_norm": 0.34704986214637756,
+      "learning_rate": 1.3749593957350986e-05,
+      "loss": 0.2485,
+      "step": 5000
+    },
+    {
+      "epoch": 0.5799628520526137,
+      "grad_norm": 0.3951033353805542,
+      "learning_rate": 1.369957940469655e-05,
+      "loss": 0.2483,
+      "step": 5100
+    },
+    {
+      "epoch": 0.5913346726810963,
+      "grad_norm": 0.3708275556564331,
+      "learning_rate": 1.3648678841008805e-05,
+      "loss": 0.2455,
+      "step": 5200
+    },
+    {
+      "epoch": 0.6027064933095788,
+      "grad_norm": 0.3068831264972687,
+      "learning_rate": 1.3596899540726558e-05,
+      "loss": 0.2477,
+      "step": 5300
+    },
+    {
+      "epoch": 0.6140783139380614,
+      "grad_norm": 0.3610328137874603,
+      "learning_rate": 1.3544248903872996e-05,
+      "loss": 0.2475,
+      "step": 5400
+    },
+    {
+      "epoch": 0.6254501345665441,
+      "grad_norm": 0.4052928388118744,
+      "learning_rate": 1.3490734454998117e-05,
+      "loss": 0.2457,
+      "step": 5500
+    },
+    {
+      "epoch": 0.6368219551950267,
+      "grad_norm": 0.36340612173080444,
+      "learning_rate": 1.3436363842103345e-05,
+      "loss": 0.2469,
+      "step": 5600
+    },
+    {
+      "epoch": 0.6481937758235093,
+      "grad_norm": 0.33248743414878845,
+      "learning_rate": 1.3381144835548534e-05,
+      "loss": 0.2466,
+      "step": 5700
+    },
+    {
+      "epoch": 0.659565596451992,
+      "grad_norm": 0.32214176654815674,
+      "learning_rate": 1.3325085326941464e-05,
+      "loss": 0.2457,
+      "step": 5800
+    },
+    {
+      "epoch": 0.6709374170804746,
+      "grad_norm": 0.45882582664489746,
+      "learning_rate": 1.3268193328010013e-05,
+      "loss": 0.2436,
+      "step": 5900
+    },
+    {
+      "epoch": 0.6823092377089572,
+      "grad_norm": 0.3200742304325104,
+      "learning_rate": 1.321047696945716e-05,
+      "loss": 0.2459,
+      "step": 6000
+    },
+    {
+      "epoch": 0.6936810583374399,
+      "grad_norm": 0.3391638696193695,
+      "learning_rate": 1.3151944499799003e-05,
+      "loss": 0.2461,
+      "step": 6100
+    },
+    {
+      "epoch": 0.7050528789659224,
+      "grad_norm": 0.34882161021232605,
+      "learning_rate": 1.3092604284185901e-05,
+      "loss": 0.2455,
+      "step": 6200
+    },
+    {
+      "epoch": 0.716424699594405,
+      "grad_norm": 0.31918397545814514,
+      "learning_rate": 1.3032464803206998e-05,
+      "loss": 0.2438,
+      "step": 6300
+    },
+    {
+      "epoch": 0.7277965202228877,
+      "grad_norm": 0.308662474155426,
+      "learning_rate": 1.2971534651678194e-05,
+      "loss": 0.2451,
+      "step": 6400
+    },
+    {
+      "epoch": 0.7391683408513703,
+      "grad_norm": 0.36811548471450806,
+      "learning_rate": 1.2909822537413848e-05,
+      "loss": 0.2448,
+      "step": 6500
+    },
+    {
+      "epoch": 0.7505401614798529,
+      "grad_norm": 0.3129611611366272,
+      "learning_rate": 1.2847337279982274e-05,
+      "loss": 0.2441,
+      "step": 6600
+    },
+    {
+      "epoch": 0.7619119821083356,
+      "grad_norm": 0.31695765256881714,
+      "learning_rate": 1.2784087809445326e-05,
+      "loss": 0.2434,
+      "step": 6700
+    },
+    {
+      "epoch": 0.7732838027368182,
+      "grad_norm": 0.3517361879348755,
+      "learning_rate": 1.2720083165082133e-05,
+      "loss": 0.2444,
+      "step": 6800
+    },
+    {
+      "epoch": 0.7846556233653008,
+      "grad_norm": 0.3190363943576813,
+      "learning_rate": 1.2655332494097267e-05,
+      "loss": 0.2452,
+      "step": 6900
+    },
+    {
+      "epoch": 0.7960274439937834,
+      "grad_norm": 0.3391929864883423,
+      "learning_rate": 1.258984505031348e-05,
+      "loss": 0.243,
+      "step": 7000
+    },
+    {
+      "epoch": 0.8073992646222661,
+      "grad_norm": 0.3427666127681732,
+      "learning_rate": 1.2523630192849175e-05,
+      "loss": 0.2436,
+      "step": 7100
+    },
+    {
+      "epoch": 0.8187710852507486,
+      "grad_norm": 0.2746070921421051,
+      "learning_rate": 1.2456697384780872e-05,
+      "loss": 0.2428,
+      "step": 7200
+    },
+    {
+      "epoch": 0.8301429058792312,
+      "grad_norm": 0.3383966088294983,
+      "learning_rate": 1.2389056191790781e-05,
+      "loss": 0.2431,
+      "step": 7300
+    },
+    {
+      "epoch": 0.8415147265077139,
+      "grad_norm": 0.3772919774055481,
+      "learning_rate": 1.2320716280799739e-05,
+      "loss": 0.2445,
+      "step": 7400
+    },
+    {
+      "epoch": 0.8528865471361965,
+      "grad_norm": 0.3445940911769867,
+      "learning_rate": 1.2251687418585649e-05,
+      "loss": 0.2425,
+      "step": 7500
+    },
+    {
+      "epoch": 0.8642583677646791,
+      "grad_norm": 0.3363341689109802,
+      "learning_rate": 1.2181979470387674e-05,
+      "loss": 0.2424,
+      "step": 7600
+    },
+    {
+      "epoch": 0.8756301883931618,
+      "grad_norm": 0.3137916922569275,
+      "learning_rate": 1.2111602398496347e-05,
+      "loss": 0.2407,
+      "step": 7700
+    },
+    {
+      "epoch": 0.8870020090216444,
+      "grad_norm": 0.34211379289627075,
+      "learning_rate": 1.2040566260829813e-05,
+      "loss": 0.2425,
+      "step": 7800
+    },
+    {
+      "epoch": 0.898373829650127,
+      "grad_norm": 0.38583695888519287,
+      "learning_rate": 1.1968881209496406e-05,
+      "loss": 0.2422,
+      "step": 7900
+    },
+    {
+      "epoch": 0.9097456502786097,
+      "grad_norm": 0.33044618368148804,
+      "learning_rate": 1.189655748934376e-05,
+      "loss": 0.242,
+      "step": 8000
+    },
+    {
+      "epoch": 0.9211174709070922,
+      "grad_norm": 0.34866610169410706,
+      "learning_rate": 1.1823605436494677e-05,
+      "loss": 0.2408,
+      "step": 8100
+    },
+    {
+      "epoch": 0.9324892915355748,
+      "grad_norm": 0.3553544878959656,
+      "learning_rate": 1.175003547686993e-05,
+      "loss": 0.2419,
+      "step": 8200
+    },
+    {
+      "epoch": 0.9438611121640574,
+      "grad_norm": 0.32875463366508484,
+      "learning_rate": 1.1675858124698262e-05,
+      "loss": 0.2413,
+      "step": 8300
+    },
+    {
+      "epoch": 0.9552329327925401,
+      "grad_norm": 0.2906413972377777,
+      "learning_rate": 1.1601083981013732e-05,
+      "loss": 0.243,
+      "step": 8400
+    },
+    {
+      "epoch": 0.9666047534210227,
+      "grad_norm": 0.3169972598552704,
+      "learning_rate": 1.1525723732140687e-05,
+      "loss": 0.24,
+      "step": 8500
+    },
+    {
+      "epoch": 0.9779765740495053,
+      "grad_norm": 0.29645681381225586,
+      "learning_rate": 1.1449788148166514e-05,
+      "loss": 0.2416,
+      "step": 8600
+    },
+    {
+      "epoch": 0.989348394677988,
+      "grad_norm": 0.31245288252830505,
+      "learning_rate": 1.1373288081402454e-05,
+      "loss": 0.2416,
+      "step": 8700
+    },
+    {
+      "epoch": 0.9999241878624768,
+      "eval_loss": 0.240670308470726,
+      "eval_runtime": 8249.7109,
+      "eval_samples_per_second": 15.075,
+      "eval_steps_per_second": 1.371,
+      "step": 8793
+    },
+    {
+      "epoch": 1.0007202153064705,
+      "grad_norm": 0.3536455035209656,
+      "learning_rate": 1.1296234464832622e-05,
+      "loss": 0.2409,
+      "step": 8800
+    },
+    {
+      "epoch": 1.0120920359349532,
+      "grad_norm": 0.3137208819389343,
+      "learning_rate": 1.1218638310551549e-05,
+      "loss": 0.2403,
+      "step": 8900
+    },
+    {
+      "epoch": 1.0234638565634357,
+      "grad_norm": 0.3500869870185852,
+      "learning_rate": 1.1140510708190381e-05,
+      "loss": 0.2403,
+      "step": 9000
+    },
+    {
+      "epoch": 1.0348356771919185,
+      "grad_norm": 0.36828961968421936,
+      "learning_rate": 1.1061862823331999e-05,
+      "loss": 0.2407,
+      "step": 9100
+    },
+    {
+      "epoch": 1.046207497820401,
+      "grad_norm": 0.31094878911972046,
+      "learning_rate": 1.098270589591531e-05,
+      "loss": 0.2382,
+      "step": 9200
+    },
+    {
+      "epoch": 1.0575793184488838,
+      "grad_norm": 0.34195277094841003,
+      "learning_rate": 1.0903051238628875e-05,
+      "loss": 0.2401,
+      "step": 9300
+    },
+    {
+      "epoch": 1.0689511390773663,
+      "grad_norm": 0.3708897531032562,
+      "learning_rate": 1.0822910235294182e-05,
+      "loss": 0.24,
+      "step": 9400
+    },
+    {
+      "epoch": 1.0803229597058488,
+      "grad_norm": 0.3012748062610626,
+      "learning_rate": 1.0742294339238709e-05,
+      "loss": 0.2399,
+      "step": 9500
+    },
+    {
+      "epoch": 1.0916947803343315,
+      "grad_norm": 0.30334606766700745,
+      "learning_rate": 1.0661215071659094e-05,
+      "loss": 0.2393,
+      "step": 9600
+    },
+    {
+      "epoch": 1.103066600962814,
+      "grad_norm": 0.30188488960266113,
+      "learning_rate": 1.0579684019974573e-05,
+      "loss": 0.238,
+      "step": 9700
+    },
+    {
+      "epoch": 1.1144384215912968,
+      "grad_norm": 0.3512589931488037,
+      "learning_rate": 1.0497712836170965e-05,
+      "loss": 0.2387,
+      "step": 9800
+    },
+    {
+      "epoch": 1.1258102422197793,
+      "grad_norm": 0.30085045099258423,
+      "learning_rate": 1.0415313235135456e-05,
+      "loss": 0.2382,
+      "step": 9900
+    },
+    {
+      "epoch": 1.137182062848262,
+      "grad_norm": 0.3223400413990021,
+      "learning_rate": 1.0332496992982332e-05,
+      "loss": 0.2395,
+      "step": 10000
+    }
+  ],
+  "logging_steps": 100,
+  "max_steps": 26379,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 3,
+  "save_steps": 1000,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 5.67724995060695e+18,
+  "train_batch_size": 11,
+  "trial_name": null,
+  "trial_params": null
+}
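
Note on the logged `learning_rate` values: they appear consistent with a cosine decay that starts from a peak of `1.5e-05` and runs to `max_steps` = 26379. The sketch below is an assumption-laden check, not the training script: the warmup length of 100 steps is inferred from the log (it is not stated anywhere in this diff), and `cosine_lr` is a hypothetical helper written for illustration.

```python
import math

PEAK_LR = 1.5e-05      # first logged learning_rate (step 100)
MAX_STEPS = 26379      # "max_steps" in trainer_state.json
WARMUP_STEPS = 100     # ASSUMPTION: inferred from the log, not stated in the diff


def cosine_lr(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS, max_steps=MAX_STEPS):
    """Linear warmup to peak_lr, then half-cosine decay towards zero at max_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Values copied from log_history in this checkpoint's trainer_state.json.
LOGGED = {
    1600: 1.4879737070671164e-05,
    5000: 1.3749593957350986e-05,
    10000: 1.0332496992982332e-05,
}

for step, logged in LOGGED.items():
    predicted = cosine_lr(step)
    print(f"step {step:>5}: predicted {predicted:.6e}, logged {logged:.6e}")
```

If the predicted and logged columns agree, the schedule assumption holds for this run; a mismatch would point to a different warmup length or scheduler type.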

full_final-tesis/checkpoint-10000/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:58a8dc2e9cd30fabd0bf191d8b86269c1d642655415287312de73d5ceda9dace
+size 5176

full_final-tesis/checkpoint-11000/README.md ADDED

	@@ -0,0 +1,202 @@

+---
+base_model: mistralai/Mistral-7B-Instruct-v0.3
+library_name: peft
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.12.1.dev0

full_final-tesis/checkpoint-11000/adapter_config.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-Instruct-v0.3",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_dropout": 0.1,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "q_proj",
+    "v_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "use_dora": false,
+  "use_rslora": false
+}

full_final-tesis/checkpoint-11000/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:27c272f900c3e9965dc2818076a1ccd2fd2ed7272a4e6c52ad4ed7f84d6fb370
+size 218138576
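
Note on the adapter size: the `adapter_config.json` above (`r` = 64, four attention projections) can be cross-checked against the 218138576-byte `adapter_model.safetensors` pointer. The sketch below is a rough estimate under stated assumptions: the layer count and projection shapes are the published Mistral-7B dimensions (32 layers, hidden size 4096, key/value projection width 1024), not values that appear in this diff, and 4 bytes per parameter (fp32 storage) is a guess that happens to fit the file size. `lora_param_count` is a hypothetical helper.

```python
# Values taken from adapter_config.json in this diff.
R = 64
LORA_ALPHA = 16
TARGET_MODULES = ["k_proj", "q_proj", "v_proj", "o_proj"]

# ASSUMPTIONS: Mistral-7B shapes, not stated in the diff.
NUM_LAYERS = 32
PROJ_SHAPES = {            # (in_features, out_features)
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),  # grouped-query attention: 8 KV heads x 128
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}


def lora_param_count(r=R, modules=TARGET_MODULES, layers=NUM_LAYERS):
    """Each adapted module adds A (r x in) and B (out x r): r * (in + out) weights."""
    per_layer = sum(r * (PROJ_SHAPES[m][0] + PROJ_SHAPES[m][1]) for m in modules)
    return per_layer * layers


params = lora_param_count()
fp32_bytes = params * 4                  # ASSUMPTION: weights stored as fp32
file_bytes = 218138576                   # size from the LFS pointer above
scaling = LORA_ALPHA / R                 # standard LoRA scaling (use_rslora is false)

print(f"LoRA parameters:        {params:,}")
print(f"Estimated fp32 payload: {fp32_bytes:,} bytes")
print(f"Unaccounted remainder:  {file_bytes - fp32_bytes:,} bytes")
print(f"LoRA scaling alpha/r:   {scaling}")
```

Under these assumptions the tensor payload accounts for nearly all of the file, and the small remainder is plausibly the safetensors header. The same arithmetic applies to every checkpoint in this commit, since they share one adapter configuration.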

full_final-tesis/checkpoint-11000/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:993a7695ecd273df8a7aa3fd99fbd8c8e642244f7fa597cb51d6ccc7542ce3dd
+size 109570132

full_final-tesis/checkpoint-11000/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:203ad205900d2f29a444e607c4ae2431640b5cb223c731e4d00bf07171438ad7
+size 14244

full_final-tesis/checkpoint-11000/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:babebb436e6ebfa5ef8d5b609c26add37a1bff1e6fd9d2e87ffe891e94a8490b
+size 1064

full_final-tesis/checkpoint-11000/trainer_state.json ADDED

	@@ -0,0 +1,811 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 1.2509002691330882,
+  "eval_steps": 500,
+  "global_step": 11000,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.01137182062848262,
+      "grad_norm": 0.28454989194869995,
+      "learning_rate": 1.5e-05,
+      "loss": 0.8172,
+      "step": 100
+    },
+    {
+      "epoch": 0.02274364125696524,
+      "grad_norm": 0.3702525198459625,
+      "learning_rate": 1.499946406987345e-05,
+      "loss": 0.3711,
+      "step": 200
+    },
+    {
+      "epoch": 0.03411546188544786,
+      "grad_norm": 0.42058178782463074,
+      "learning_rate": 1.4997856356086094e-05,
+      "loss": 0.3401,
+      "step": 300
+    },
+    {
+      "epoch": 0.04548728251393048,
+      "grad_norm": 0.4536089599132538,
+      "learning_rate": 1.4995177088403865e-05,
+      "loss": 0.3276,
+      "step": 400
+    },
+    {
+      "epoch": 0.0568591031424131,
+      "grad_norm": 0.3354353904724121,
+      "learning_rate": 1.4991426649733503e-05,
+      "loss": 0.3191,
+      "step": 500
+    },
+    {
+      "epoch": 0.06823092377089572,
+      "grad_norm": 0.3974071741104126,
+      "learning_rate": 1.4986605576067824e-05,
+      "loss": 0.3138,
+      "step": 600
+    },
+    {
+      "epoch": 0.07960274439937834,
+      "grad_norm": 0.3995005786418915,
+      "learning_rate": 1.4980714556409132e-05,
+      "loss": 0.3027,
+      "step": 700
+    },
+    {
+      "epoch": 0.09097456502786096,
+      "grad_norm": 0.35224294662475586,
+      "learning_rate": 1.4973754432670731e-05,
+      "loss": 0.2784,
+      "step": 800
+    },
+    {
+      "epoch": 0.10234638565634357,
+      "grad_norm": 0.28719601035118103,
+      "learning_rate": 1.4965726199556621e-05,
+      "loss": 0.2737,
+      "step": 900
+    },
+    {
+      "epoch": 0.1137182062848262,
+      "grad_norm": 0.32742446660995483,
+      "learning_rate": 1.4956631004419335e-05,
+      "loss": 0.2722,
+      "step": 1000
+    },
+    {
+      "epoch": 0.1250900269133088,
+      "grad_norm": 0.44312119483947754,
+      "learning_rate": 1.4946470147095961e-05,
+      "loss": 0.2709,
+      "step": 1100
+    },
+    {
+      "epoch": 0.13646184754179144,
+      "grad_norm": 0.40551960468292236,
+      "learning_rate": 1.4935245079722374e-05,
+      "loss": 0.2683,
+      "step": 1200
+    },
+    {
+      "epoch": 0.14783366817027407,
+      "grad_norm": 0.38514164090156555,
+      "learning_rate": 1.4922957406525721e-05,
+      "loss": 0.2667,
+      "step": 1300
+    },
+    {
+      "epoch": 0.15920548879875668,
+      "grad_norm": 0.4830244481563568,
+      "learning_rate": 1.4909608883595135e-05,
+      "loss": 0.2644,
+      "step": 1400
+    },
+    {
+      "epoch": 0.1705773094272393,
+      "grad_norm": 0.5327072739601135,
+      "learning_rate": 1.489520141863077e-05,
+      "loss": 0.265,
+      "step": 1500
+    },
+    {
+      "epoch": 0.1819491300557219,
+      "grad_norm": 0.3729603588581085,
+      "learning_rate": 1.4879737070671164e-05,
+      "loss": 0.2625,
+      "step": 1600
+    },
+    {
+      "epoch": 0.19332095068420455,
+      "grad_norm": 0.3948141634464264,
+      "learning_rate": 1.4863218049798972e-05,
+      "loss": 0.2632,
+      "step": 1700
+    },
+    {
+      "epoch": 0.20469277131268715,
+      "grad_norm": 0.42538341879844666,
+      "learning_rate": 1.4845646716825118e-05,
+      "loss": 0.2628,
+      "step": 1800
+    },
+    {
+      "epoch": 0.21606459194116978,
+      "grad_norm": 0.39089226722717285,
+      "learning_rate": 1.4827025582951387e-05,
+      "loss": 0.2604,
+      "step": 1900
+    },
+    {
+      "epoch": 0.2274364125696524,
+      "grad_norm": 0.44530946016311646,
+      "learning_rate": 1.4807357309411546e-05,
+      "loss": 0.2595,
+      "step": 2000
+    },
+    {
+      "epoch": 0.23880823319813502,
+      "grad_norm": 0.5049150586128235,
+      "learning_rate": 1.4786644707091018e-05,
+      "loss": 0.2602,
+      "step": 2100
+    },
+    {
+      "epoch": 0.2501800538266176,
+      "grad_norm": 0.4439440369606018,
+      "learning_rate": 1.4764890736125158e-05,
+      "loss": 0.2591,
+      "step": 2200
+    },
+    {
+      "epoch": 0.26155187445510025,
+      "grad_norm": 0.3611554801464081,
+      "learning_rate": 1.4742098505476209e-05,
+      "loss": 0.2577,
+      "step": 2300
+    },
+    {
+      "epoch": 0.2729236950835829,
+      "grad_norm": 0.5148597359657288,
+      "learning_rate": 1.4718271272488986e-05,
+      "loss": 0.2572,
+      "step": 2400
+    },
+    {
+      "epoch": 0.2842955157120655,
+      "grad_norm": 0.4029059410095215,
+      "learning_rate": 1.4693412442425354e-05,
+      "loss": 0.2572,
+      "step": 2500
+    },
+    {
+      "epoch": 0.29566733634054815,
+      "grad_norm": 0.3884616196155548,
+      "learning_rate": 1.4667525567977561e-05,
+      "loss": 0.2555,
+      "step": 2600
+    },
+    {
+      "epoch": 0.3070391569690307,
+      "grad_norm": 0.32239723205566406,
+      "learning_rate": 1.4640614348760517e-05,
+      "loss": 0.2551,
+      "step": 2700
+    },
+    {
+      "epoch": 0.31841097759751336,
+      "grad_norm": 0.36961373686790466,
+      "learning_rate": 1.4612682630783053e-05,
+      "loss": 0.256,
+      "step": 2800
+    },
+    {
+      "epoch": 0.329782798225996,
+      "grad_norm": 0.44517841935157776,
+      "learning_rate": 1.4583734405898277e-05,
+      "loss": 0.2548,
+      "step": 2900
+    },
+    {
+      "epoch": 0.3411546188544786,
+      "grad_norm": 0.36591875553131104,
+      "learning_rate": 1.4553773811233073e-05,
+      "loss": 0.2544,
+      "step": 3000
+    },
+    {
+      "epoch": 0.3525264394829612,
+      "grad_norm": 0.39926204085350037,
+      "learning_rate": 1.4522805128596852e-05,
+      "loss": 0.2551,
+      "step": 3100
+    },
+    {
+      "epoch": 0.3638982601114438,
+      "grad_norm": 0.4806652367115021,
+      "learning_rate": 1.4490832783869617e-05,
+      "loss": 0.2539,
+      "step": 3200
+    },
+    {
+      "epoch": 0.37527008073992646,
+      "grad_norm": 0.37937217950820923,
+      "learning_rate": 1.4457861346369439e-05,
+      "loss": 0.2549,
+      "step": 3300
+    },
+    {
+      "epoch": 0.3866419013684091,
+      "grad_norm": 0.32821956276893616,
+      "learning_rate": 1.4423895528199423e-05,
+      "loss": 0.2531,
+      "step": 3400
+    },
+    {
+      "epoch": 0.3980137219968917,
+      "grad_norm": 0.4085944890975952,
+      "learning_rate": 1.4388940183574303e-05,
+      "loss": 0.2522,
+      "step": 3500
+    },
+    {
+      "epoch": 0.4093855426253743,
+      "grad_norm": 0.42145103216171265,
+      "learning_rate": 1.4353000308126683e-05,
+      "loss": 0.2525,
+      "step": 3600
+    },
+    {
+      "epoch": 0.42075736325385693,
+      "grad_norm": 0.39684563875198364,
+      "learning_rate": 1.4316081038193093e-05,
+      "loss": 0.2512,
+      "step": 3700
+    },
+    {
+      "epoch": 0.43212918388233956,
+      "grad_norm": 0.40946340560913086,
+      "learning_rate": 1.4278187650079938e-05,
+      "loss": 0.2513,
+      "step": 3800
+    },
+    {
+      "epoch": 0.4435010045108222,
+      "grad_norm": 0.36055612564086914,
+      "learning_rate": 1.4239325559309426e-05,
+      "loss": 0.2508,
+      "step": 3900
+    },
+    {
+      "epoch": 0.4548728251393048,
+      "grad_norm": 0.33412662148475647,
+      "learning_rate": 1.4199500319845618e-05,
+      "loss": 0.2521,
+      "step": 4000
+    },
+    {
+      "epoch": 0.4662446457677874,
+      "grad_norm": 0.435330867767334,
+      "learning_rate": 1.415871762330068e-05,
+      "loss": 0.2502,
+      "step": 4100
+    },
+    {
+      "epoch": 0.47761646639627003,
+      "grad_norm": 0.349936306476593,
+      "learning_rate": 1.4116983298121471e-05,
+      "loss": 0.25,
+      "step": 4200
+    },
+    {
+      "epoch": 0.48898828702475267,
+      "grad_norm": 0.4145605266094208,
+      "learning_rate": 1.407430330875657e-05,
+      "loss": 0.2486,
+      "step": 4300
+    },
+    {
+      "epoch": 0.5003601076532352,
+      "grad_norm": 0.37989580631256104,
+      "learning_rate": 1.4030683754803873e-05,
+      "loss": 0.2493,
+      "step": 4400
+    },
+    {
+      "epoch": 0.5117319282817179,
+      "grad_norm": 0.31419941782951355,
+      "learning_rate": 1.3986130870138861e-05,
+      "loss": 0.249,
+      "step": 4500
+    },
+    {
+      "epoch": 0.5231037489102005,
+      "grad_norm": 0.34506115317344666,
+      "learning_rate": 1.3940651022023705e-05,
+      "loss": 0.2505,
+      "step": 4600
+    },
+    {
+      "epoch": 0.5344755695386831,
+      "grad_norm": 0.3678961396217346,
+      "learning_rate": 1.3894250710197268e-05,
+      "loss": 0.2478,
+      "step": 4700
+    },
+    {
+      "epoch": 0.5458473901671658,
+      "grad_norm": 0.3394433557987213,
+      "learning_rate": 1.3846936565946217e-05,
+      "loss": 0.2486,
+      "step": 4800
+    },
+    {
+      "epoch": 0.5572192107956484,
+      "grad_norm": 0.3915737569332123,
+      "learning_rate": 1.3798715351157302e-05,
+      "loss": 0.2478,
+      "step": 4900
+    },
+    {
+      "epoch": 0.568591031424131,
+      "grad_norm": 0.34704986214637756,
+      "learning_rate": 1.3749593957350986e-05,
+      "loss": 0.2485,
+      "step": 5000
+    },
+    {
+      "epoch": 0.5799628520526137,
+      "grad_norm": 0.3951033353805542,
+      "learning_rate": 1.369957940469655e-05,
+      "loss": 0.2483,
+      "step": 5100
+    },
+    {
+      "epoch": 0.5913346726810963,
+      "grad_norm": 0.3708275556564331,
+      "learning_rate": 1.3648678841008805e-05,
+      "loss": 0.2455,
+      "step": 5200
+    },
+    {
+      "epoch": 0.6027064933095788,
+      "grad_norm": 0.3068831264972687,
+      "learning_rate": 1.3596899540726558e-05,
+      "loss": 0.2477,
+      "step": 5300
+    },
+    {
+      "epoch": 0.6140783139380614,
+      "grad_norm": 0.3610328137874603,
+      "learning_rate": 1.3544248903872996e-05,
+      "loss": 0.2475,
+      "step": 5400
+    },
+    {
+      "epoch": 0.6254501345665441,
+      "grad_norm": 0.4052928388118744,
+      "learning_rate": 1.3490734454998117e-05,
+      "loss": 0.2457,
+      "step": 5500
+    },
+    {
+      "epoch": 0.6368219551950267,
+      "grad_norm": 0.36340612173080444,
+      "learning_rate": 1.3436363842103345e-05,
+      "loss": 0.2469,
+      "step": 5600
+    },
+    {
+      "epoch": 0.6481937758235093,
+      "grad_norm": 0.33248743414878845,
+      "learning_rate": 1.3381144835548534e-05,
+      "loss": 0.2466,
+      "step": 5700
+    },
+    {
+      "epoch": 0.659565596451992,
+      "grad_norm": 0.32214176654815674,
+      "learning_rate": 1.3325085326941464e-05,
+      "loss": 0.2457,
+      "step": 5800
+    },
+    {
+      "epoch": 0.6709374170804746,
+      "grad_norm": 0.45882582664489746,
+      "learning_rate": 1.3268193328010013e-05,
+      "loss": 0.2436,
+      "step": 5900
+    },
+    {
+      "epoch": 0.6823092377089572,
+      "grad_norm": 0.3200742304325104,
+      "learning_rate": 1.321047696945716e-05,
+      "loss": 0.2459,
+      "step": 6000
+    },
+    {
+      "epoch": 0.6936810583374399,
+      "grad_norm": 0.3391638696193695,
+      "learning_rate": 1.3151944499799003e-05,
+      "loss": 0.2461,
+      "step": 6100
+    },
+    {
+      "epoch": 0.7050528789659224,
+      "grad_norm": 0.34882161021232605,
+      "learning_rate": 1.3092604284185901e-05,
+      "loss": 0.2455,
+      "step": 6200
+    },
+    {
+      "epoch": 0.716424699594405,
+      "grad_norm": 0.31918397545814514,
+      "learning_rate": 1.3032464803206998e-05,
+      "loss": 0.2438,
+      "step": 6300
+    },
+    {
+      "epoch": 0.7277965202228877,
+      "grad_norm": 0.308662474155426,
+      "learning_rate": 1.2971534651678194e-05,
+      "loss": 0.2451,
+      "step": 6400
+    },
+    {
+      "epoch": 0.7391683408513703,
+      "grad_norm": 0.36811548471450806,
+      "learning_rate": 1.2909822537413848e-05,
+      "loss": 0.2448,
+      "step": 6500
+    },
+    {
+      "epoch": 0.7505401614798529,
+      "grad_norm": 0.3129611611366272,
+      "learning_rate": 1.2847337279982274e-05,
+      "loss": 0.2441,
+      "step": 6600
+    },
+    {
+      "epoch": 0.7619119821083356,
+      "grad_norm": 0.31695765256881714,
+      "learning_rate": 1.2784087809445326e-05,
+      "loss": 0.2434,
+      "step": 6700
+    },
+    {
+      "epoch": 0.7732838027368182,
+      "grad_norm": 0.3517361879348755,
+      "learning_rate": 1.2720083165082133e-05,
+      "loss": 0.2444,
+      "step": 6800
+    },
+    {
+      "epoch": 0.7846556233653008,
+      "grad_norm": 0.3190363943576813,
+      "learning_rate": 1.2655332494097267e-05,
+      "loss": 0.2452,
+      "step": 6900
+    },
+    {
+      "epoch": 0.7960274439937834,
+      "grad_norm": 0.3391929864883423,
+      "learning_rate": 1.258984505031348e-05,
+      "loss": 0.243,
+      "step": 7000
+    },
+    {
+      "epoch": 0.8073992646222661,
+      "grad_norm": 0.3427666127681732,
+      "learning_rate": 1.2523630192849175e-05,
+      "loss": 0.2436,
+      "step": 7100
+    },
+    {
+      "epoch": 0.8187710852507486,
+      "grad_norm": 0.2746070921421051,
+      "learning_rate": 1.2456697384780872e-05,
+      "loss": 0.2428,
+      "step": 7200
+    },
+    {
+      "epoch": 0.8301429058792312,
+      "grad_norm": 0.3383966088294983,
+      "learning_rate": 1.2389056191790781e-05,
+      "loss": 0.2431,
+      "step": 7300
+    },
+    {
+      "epoch": 0.8415147265077139,
+      "grad_norm": 0.3772919774055481,
+      "learning_rate": 1.2320716280799739e-05,
+      "loss": 0.2445,
+      "step": 7400
+    },
+    {
+      "epoch": 0.8528865471361965,
+      "grad_norm": 0.3445940911769867,
+      "learning_rate": 1.2251687418585649e-05,
+      "loss": 0.2425,
+      "step": 7500
+    },
+    {
+      "epoch": 0.8642583677646791,
+      "grad_norm": 0.3363341689109802,
+      "learning_rate": 1.2181979470387674e-05,
+      "loss": 0.2424,
+      "step": 7600
+    },
+    {
+      "epoch": 0.8756301883931618,
+      "grad_norm": 0.3137916922569275,
+      "learning_rate": 1.2111602398496347e-05,
+      "loss": 0.2407,
+      "step": 7700
+    },
+    {
+      "epoch": 0.8870020090216444,
+      "grad_norm": 0.34211379289627075,
+      "learning_rate": 1.2040566260829813e-05,
+      "loss": 0.2425,
+      "step": 7800
+    },
+    {
+      "epoch": 0.898373829650127,
+      "grad_norm": 0.38583695888519287,
+      "learning_rate": 1.1968881209496406e-05,
+      "loss": 0.2422,
+      "step": 7900
+    },
+    {
+      "epoch": 0.9097456502786097,
+      "grad_norm": 0.33044618368148804,
+      "learning_rate": 1.189655748934376e-05,
+      "loss": 0.242,
+      "step": 8000
+    },
+    {
+      "epoch": 0.9211174709070922,
+      "grad_norm": 0.34866610169410706,
+      "learning_rate": 1.1823605436494677e-05,
+      "loss": 0.2408,
+      "step": 8100
+    },
+    {
+      "epoch": 0.9324892915355748,
+      "grad_norm": 0.3553544878959656,
+      "learning_rate": 1.175003547686993e-05,
+      "loss": 0.2419,
+      "step": 8200
+    },
+    {
+      "epoch": 0.9438611121640574,
+      "grad_norm": 0.32875463366508484,
+      "learning_rate": 1.1675858124698262e-05,
+      "loss": 0.2413,
+      "step": 8300
+    },
+    {
+      "epoch": 0.9552329327925401,
+      "grad_norm": 0.2906413972377777,
+      "learning_rate": 1.1601083981013732e-05,
+      "loss": 0.243,
+      "step": 8400
+    },
+    {
+      "epoch": 0.9666047534210227,
+      "grad_norm": 0.3169972598552704,
+      "learning_rate": 1.1525723732140687e-05,
+      "loss": 0.24,
+      "step": 8500
+    },
+    {
+      "epoch": 0.9779765740495053,
+      "grad_norm": 0.29645681381225586,
+      "learning_rate": 1.1449788148166514e-05,
+      "loss": 0.2416,
+      "step": 8600
+    },
+    {
+      "epoch": 0.989348394677988,
+      "grad_norm": 0.31245288252830505,
+      "learning_rate": 1.1373288081402454e-05,
+      "loss": 0.2416,
+      "step": 8700
+    },
+    {
+      "epoch": 0.9999241878624768,
+      "eval_loss": 0.240670308470726,
+      "eval_runtime": 8249.7109,
+      "eval_samples_per_second": 15.075,
+      "eval_steps_per_second": 1.371,
+      "step": 8793
+    },
+    {
+      "epoch": 1.0007202153064705,
+      "grad_norm": 0.3536455035209656,
+      "learning_rate": 1.1296234464832622e-05,
+      "loss": 0.2409,
+      "step": 8800
+    },
+    {
+      "epoch": 1.0120920359349532,
+      "grad_norm": 0.3137208819389343,
+      "learning_rate": 1.1218638310551549e-05,
+      "loss": 0.2403,
+      "step": 8900
+    },
+    {
+      "epoch": 1.0234638565634357,
+      "grad_norm": 0.3500869870185852,
+      "learning_rate": 1.1140510708190381e-05,
+      "loss": 0.2403,
+      "step": 9000
+    },
+    {
+      "epoch": 1.0348356771919185,
+      "grad_norm": 0.36828961968421936,
+      "learning_rate": 1.1061862823331999e-05,
+      "loss": 0.2407,
+      "step": 9100
+    },
+    {
+      "epoch": 1.046207497820401,
+      "grad_norm": 0.31094878911972046,
+      "learning_rate": 1.098270589591531e-05,
+      "loss": 0.2382,
+      "step": 9200
+    },
+    {
+      "epoch": 1.0575793184488838,
+      "grad_norm": 0.34195277094841003,
+      "learning_rate": 1.0903051238628875e-05,
+      "loss": 0.2401,
+      "step": 9300
+    },
+    {
+      "epoch": 1.0689511390773663,
+      "grad_norm": 0.3708897531032562,
+      "learning_rate": 1.0822910235294182e-05,
+      "loss": 0.24,
+      "step": 9400
+    },
+    {
+      "epoch": 1.0803229597058488,
+      "grad_norm": 0.3012748062610626,
+      "learning_rate": 1.0742294339238709e-05,
+      "loss": 0.2399,
+      "step": 9500
+    },
+    {
+      "epoch": 1.0916947803343315,
+      "grad_norm": 0.30334606766700745,
+      "learning_rate": 1.0661215071659094e-05,
+      "loss": 0.2393,
+      "step": 9600
+    },
+    {
+      "epoch": 1.103066600962814,
+      "grad_norm": 0.30188488960266113,
+      "learning_rate": 1.0579684019974573e-05,
+      "loss": 0.238,
+      "step": 9700
+    },
+    {
+      "epoch": 1.1144384215912968,
+      "grad_norm": 0.3512589931488037,
+      "learning_rate": 1.0497712836170965e-05,
+      "loss": 0.2387,
+      "step": 9800
+    },
+    {
+      "epoch": 1.1258102422197793,
+      "grad_norm": 0.30085045099258423,
+      "learning_rate": 1.0415313235135456e-05,
+      "loss": 0.2382,
+      "step": 9900
+    },
+    {
+      "epoch": 1.137182062848262,
+      "grad_norm": 0.3223400413990021,
+      "learning_rate": 1.0332496992982332e-05,
+      "loss": 0.2395,
+      "step": 10000
+    },
+    {
+      "epoch": 1.1485538834767446,
+      "grad_norm": 0.3234786093235016,
+      "learning_rate": 1.0249275945370035e-05,
+      "loss": 0.2378,
+      "step": 10100
+    },
+    {
+      "epoch": 1.1599257041052273,
+      "grad_norm": 0.3373458981513977,
+      "learning_rate": 1.0165661985809653e-05,
+      "loss": 0.2375,
+      "step": 10200
+    },
+    {
+      "epoch": 1.1712975247337098,
+      "grad_norm": 0.31131288409233093,
+      "learning_rate": 1.0081667063965164e-05,
+      "loss": 0.2384,
+      "step": 10300
+    },
+    {
+      "epoch": 1.1826693453621924,
+      "grad_norm": 0.28763946890830994,
+      "learning_rate": 9.997303183945664e-06,
+      "loss": 0.2368,
+      "step": 10400
+    },
+    {
+      "epoch": 1.1940411659906751,
+      "grad_norm": 0.34528255462646484,
+      "learning_rate": 9.912582402589786e-06,
+      "loss": 0.2385,
+      "step": 10500
+    },
+    {
+      "epoch": 1.2054129866191576,
+      "grad_norm": 0.36961841583251953,
+      "learning_rate": 9.827516827742623e-06,
+      "loss": 0.2392,
+      "step": 10600
+    },
+    {
+      "epoch": 1.2167848072476404,
+      "grad_norm": 0.3676556348800659,
+      "learning_rate": 9.742118616525315e-06,
+      "loss": 0.2388,
+      "step": 10700
+    },
+    {
+      "epoch": 1.228156627876123,
+      "grad_norm": 0.31305503845214844,
+      "learning_rate": 9.656399973597634e-06,
+      "loss": 0.2386,
+      "step": 10800
+    },
+    {
+      "epoch": 1.2395284485046056,
+      "grad_norm": 0.3261184096336365,
+      "learning_rate": 9.570373149413758e-06,
+      "loss": 0.2388,
+      "step": 10900
+    },
+    {
+      "epoch": 1.2509002691330882,
+      "grad_norm": 0.2724168002605438,
+      "learning_rate": 9.484050438471485e-06,
+      "loss": 0.2382,
+      "step": 11000
+    }
+  ],
+  "logging_steps": 100,
+  "max_steps": 26379,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 3,
+  "save_steps": 1000,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 6.24498182726615e+18,
+  "train_batch_size": 11,
+  "trial_name": null,
+  "trial_params": null
+}
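Note on the logged learning rates: the `learning_rate` values in `log_history` look consistent with a cosine decay from a peak of 1.5e-05 over `max_steps` = 26379, following a linear warmup. The warmup length of 100 steps is an assumption inferred from the logged numbers; it is not recorded in `trainer_state.json`. The helper name `cosine_lr` is hypothetical. A minimal sketch that recomputes a few logged values under that assumption:

```python
import math

PEAK_LR = 1.5e-05     # value logged at step 100
WARMUP_STEPS = 100    # assumption: inferred from the log, not stored in trainer_state.json
MAX_STEPS = 26379     # "max_steps" in trainer_state.json


def cosine_lr(step: int) -> float:
    """Learning rate under linear warmup followed by half-cosine decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# A few (step, learning_rate) pairs copied from log_history above.
LOGGED = {
    200: 1.499946406987345e-05,
    5000: 1.3749593957350986e-05,
    11000: 9.484050438471485e-06,
}

for step, logged_lr in LOGGED.items():
    print(step, cosine_lr(step), logged_lr)
```

If the recomputed and logged columns agree, the remaining checkpoints can be sanity-checked the same way without loading `scheduler.pt`.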

full_final-tesis/checkpoint-11000/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:58a8dc2e9cd30fabd0bf191d8b86269c1d642655415287312de73d5ceda9dace
+size 5176

full_final-tesis/checkpoint-12000/README.md ADDED

	@@ -0,0 +1,202 @@

+---
+base_model: mistralai/Mistral-7B-Instruct-v0.3
+library_name: peft
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.12.1.dev0

full_final-tesis/checkpoint-12000/adapter_config.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-Instruct-v0.3",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_dropout": 0.1,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "q_proj",
+    "v_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "use_dora": false,
+  "use_rslora": false
+}

full_final-tesis/checkpoint-12000/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c8070441bf33824b8a76807504d41766e32f4e142c4a9a180479f3a6ebf02ee
+size 218138576
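Note on the adapter size: the `adapter_config.json` above sets `r` = 64 and `lora_alpha` = 16 on the four attention projections, which gives a LoRA scaling of `lora_alpha / r` since `use_rslora` is false. The sketch below estimates the adapter's parameter count and compares it with the LFS pointer size. The Mistral-7B dimensions (hidden size 4096, key/value projection width 1024, 32 layers) are assumptions taken from the base model's published configuration, not from this diff, and the fp32 storage is an inference from the file size.

```python
# Values from adapter_config.json in this commit.
R = 64
LORA_ALPHA = 16

# Assumed base-model dimensions (Mistral-7B), not stated anywhere in this diff.
HIDDEN = 4096
KV_DIM = 1024      # 8 key/value heads x 128 head dimension
NUM_LAYERS = 32

# (in_features, out_features) for each targeted projection.
PROJ_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """LoRA adds A (r x in_features) and B (out_features x r) per module."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in PROJ_SHAPES.values())
total_params = per_layer * NUM_LAYERS
fp32_bytes = total_params * 4
scaling = LORA_ALPHA / R

LFS_SIZE = 218138576  # "size" from the adapter_model.safetensors pointer
print(f"scaling={scaling}, params={total_params}, fp32 bytes={fp32_bytes}")
print(f"bytes not explained by weights: {LFS_SIZE - fp32_bytes}")
```

Under these assumptions the weights account for nearly all of the file, with the small remainder plausibly being the safetensors header.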

full_final-tesis/checkpoint-12000/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2ba966e446b9d0527b508b7ff7edfae24d2e9d580ce1a3b4a165fba73aa0931d
+size 109570132

full_final-tesis/checkpoint-12000/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:703318c694052c57b29614ac6820a566102d36ec671d7f5e397ab06100429e90
+size 14244

full_final-tesis/checkpoint-12000/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0d4b20ff287b276a2275f6b5a3f562373d17356381a0665a7a67ab517a6a50b3
+size 1064

full_final-tesis/checkpoint-12000/trainer_state.json ADDED

	@@ -0,0 +1,881 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 1.3646184754179145,
+  "eval_steps": 500,
+  "global_step": 12000,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.01137182062848262,
+      "grad_norm": 0.28454989194869995,
+      "learning_rate": 1.5e-05,
+      "loss": 0.8172,
+      "step": 100
+    },
+    {
+      "epoch": 0.02274364125696524,
+      "grad_norm": 0.3702525198459625,
+      "learning_rate": 1.499946406987345e-05,
+      "loss": 0.3711,
+      "step": 200
+    },
+    {
+      "epoch": 0.03411546188544786,
+      "grad_norm": 0.42058178782463074,
+      "learning_rate": 1.4997856356086094e-05,
+      "loss": 0.3401,
+      "step": 300
+    },
+    {
+      "epoch": 0.04548728251393048,
+      "grad_norm": 0.4536089599132538,
+      "learning_rate": 1.4995177088403865e-05,
+      "loss": 0.3276,
+      "step": 400
+    },
+    {
+      "epoch": 0.0568591031424131,
+      "grad_norm": 0.3354353904724121,
+      "learning_rate": 1.4991426649733503e-05,
+      "loss": 0.3191,
+      "step": 500
+    },
+    {
+      "epoch": 0.06823092377089572,
+      "grad_norm": 0.3974071741104126,
+      "learning_rate": 1.4986605576067824e-05,
+      "loss": 0.3138,
+      "step": 600
+    },
+    {
+      "epoch": 0.07960274439937834,
+      "grad_norm": 0.3995005786418915,
+      "learning_rate": 1.4980714556409132e-05,
+      "loss": 0.3027,
+      "step": 700
+    },
+    {
+      "epoch": 0.09097456502786096,
+      "grad_norm": 0.35224294662475586,
+      "learning_rate": 1.4973754432670731e-05,
+      "loss": 0.2784,
+      "step": 800
+    },
+    {
+      "epoch": 0.10234638565634357,
+      "grad_norm": 0.28719601035118103,
+      "learning_rate": 1.4965726199556621e-05,
+      "loss": 0.2737,
+      "step": 900
+    },
+    {
+      "epoch": 0.1137182062848262,
+      "grad_norm": 0.32742446660995483,
+      "learning_rate": 1.4956631004419335e-05,
+      "loss": 0.2722,
+      "step": 1000
+    },
+    {
+      "epoch": 0.1250900269133088,
+      "grad_norm": 0.44312119483947754,
+      "learning_rate": 1.4946470147095961e-05,
+      "loss": 0.2709,
+      "step": 1100
+    },
+    {
+      "epoch": 0.13646184754179144,
+      "grad_norm": 0.40551960468292236,
+      "learning_rate": 1.4935245079722374e-05,
+      "loss": 0.2683,
+      "step": 1200
+    },
+    {
+      "epoch": 0.14783366817027407,
+      "grad_norm": 0.38514164090156555,
+      "learning_rate": 1.4922957406525721e-05,
+      "loss": 0.2667,
+      "step": 1300
+    },
+    {
+      "epoch": 0.15920548879875668,
+      "grad_norm": 0.4830244481563568,
+      "learning_rate": 1.4909608883595135e-05,
+      "loss": 0.2644,
+      "step": 1400
+    },
+    {
+      "epoch": 0.1705773094272393,
+      "grad_norm": 0.5327072739601135,
+      "learning_rate": 1.489520141863077e-05,
+      "loss": 0.265,
+      "step": 1500
+    },
+    {
+      "epoch": 0.1819491300557219,
+      "grad_norm": 0.3729603588581085,
+      "learning_rate": 1.4879737070671164e-05,
+      "loss": 0.2625,
+      "step": 1600
+    },
+    {
+      "epoch": 0.19332095068420455,
+      "grad_norm": 0.3948141634464264,
+      "learning_rate": 1.4863218049798972e-05,
+      "loss": 0.2632,
+      "step": 1700
+    },
+    {
+      "epoch": 0.20469277131268715,
+      "grad_norm": 0.42538341879844666,
+      "learning_rate": 1.4845646716825118e-05,
+      "loss": 0.2628,
+      "step": 1800
+    },
+    {
+      "epoch": 0.21606459194116978,
+      "grad_norm": 0.39089226722717285,
+      "learning_rate": 1.4827025582951387e-05,
+      "loss": 0.2604,
+      "step": 1900
+    },
+    {
+      "epoch": 0.2274364125696524,
+      "grad_norm": 0.44530946016311646,
+      "learning_rate": 1.4807357309411546e-05,
+      "loss": 0.2595,
+      "step": 2000
+    },
+    {
+      "epoch": 0.23880823319813502,
+      "grad_norm": 0.5049150586128235,
+      "learning_rate": 1.4786644707091018e-05,
+      "loss": 0.2602,
+      "step": 2100
+    },
+    {
+      "epoch": 0.2501800538266176,
+      "grad_norm": 0.4439440369606018,
+      "learning_rate": 1.4764890736125158e-05,
+      "loss": 0.2591,
+      "step": 2200
+    },
+    {
+      "epoch": 0.26155187445510025,
+      "grad_norm": 0.3611554801464081,
+      "learning_rate": 1.4742098505476209e-05,
+      "loss": 0.2577,
+      "step": 2300
+    },
+    {
+      "epoch": 0.2729236950835829,
+      "grad_norm": 0.5148597359657288,
+      "learning_rate": 1.4718271272488986e-05,
+      "loss": 0.2572,
+      "step": 2400
+    },
+    {
+      "epoch": 0.2842955157120655,
+      "grad_norm": 0.4029059410095215,
+      "learning_rate": 1.4693412442425354e-05,
+      "loss": 0.2572,
+      "step": 2500
+    },
+    {
+      "epoch": 0.29566733634054815,
+      "grad_norm": 0.3884616196155548,
+      "learning_rate": 1.4667525567977561e-05,
+      "loss": 0.2555,
+      "step": 2600
+    },
+    {
+      "epoch": 0.3070391569690307,
+      "grad_norm": 0.32239723205566406,
+      "learning_rate": 1.4640614348760517e-05,
+      "loss": 0.2551,
+      "step": 2700
+    },
+    {
+      "epoch": 0.31841097759751336,
+      "grad_norm": 0.36961373686790466,
+      "learning_rate": 1.4612682630783053e-05,
+      "loss": 0.256,
+      "step": 2800
+    },
+    {
+      "epoch": 0.329782798225996,
+      "grad_norm": 0.44517841935157776,
+      "learning_rate": 1.4583734405898277e-05,
+      "loss": 0.2548,
+      "step": 2900
+    },
+    {
+      "epoch": 0.3411546188544786,
+      "grad_norm": 0.36591875553131104,
+      "learning_rate": 1.4553773811233073e-05,
+      "loss": 0.2544,
+      "step": 3000
+    },
+    {
+      "epoch": 0.3525264394829612,
+      "grad_norm": 0.39926204085350037,
+      "learning_rate": 1.4522805128596852e-05,
+      "loss": 0.2551,
+      "step": 3100
+    },
+    {
+      "epoch": 0.3638982601114438,
+      "grad_norm": 0.4806652367115021,
+      "learning_rate": 1.4490832783869617e-05,
+      "loss": 0.2539,
+      "step": 3200
+    },
+    {
+      "epoch": 0.37527008073992646,
+      "grad_norm": 0.37937217950820923,
+      "learning_rate": 1.4457861346369439e-05,
+      "loss": 0.2549,
+      "step": 3300
+    },
+    {
+      "epoch": 0.3866419013684091,
+      "grad_norm": 0.32821956276893616,
+      "learning_rate": 1.4423895528199423e-05,
+      "loss": 0.2531,
+      "step": 3400
+    },
+    {
+      "epoch": 0.3980137219968917,
+      "grad_norm": 0.4085944890975952,
+      "learning_rate": 1.4388940183574303e-05,
+      "loss": 0.2522,
+      "step": 3500
+    },
+    {
+      "epoch": 0.4093855426253743,
+      "grad_norm": 0.42145103216171265,
+      "learning_rate": 1.4353000308126683e-05,
+      "loss": 0.2525,
+      "step": 3600
+    },
+    {
+      "epoch": 0.42075736325385693,
+      "grad_norm": 0.39684563875198364,
+      "learning_rate": 1.4316081038193093e-05,
+      "loss": 0.2512,
+      "step": 3700
+    },
+    {
+      "epoch": 0.43212918388233956,
+      "grad_norm": 0.40946340560913086,
+      "learning_rate": 1.4278187650079938e-05,
+      "loss": 0.2513,
+      "step": 3800
+    },
+    {
+      "epoch": 0.4435010045108222,
+      "grad_norm": 0.36055612564086914,
+      "learning_rate": 1.4239325559309426e-05,
+      "loss": 0.2508,
+      "step": 3900
+    },
+    {
+      "epoch": 0.4548728251393048,
+      "grad_norm": 0.33412662148475647,
+      "learning_rate": 1.4199500319845618e-05,
+      "loss": 0.2521,
+      "step": 4000
+    },
+    {
+      "epoch": 0.4662446457677874,
+      "grad_norm": 0.435330867767334,
+      "learning_rate": 1.415871762330068e-05,
+      "loss": 0.2502,
+      "step": 4100
+    },
+    {
+      "epoch": 0.47761646639627003,
+      "grad_norm": 0.349936306476593,
+      "learning_rate": 1.4116983298121471e-05,
+      "loss": 0.25,
+      "step": 4200
+    },
+    {
+      "epoch": 0.48898828702475267,
+      "grad_norm": 0.4145605266094208,
+      "learning_rate": 1.407430330875657e-05,
+      "loss": 0.2486,
+      "step": 4300
+    },
+    {
+      "epoch": 0.5003601076532352,
+      "grad_norm": 0.37989580631256104,
+      "learning_rate": 1.4030683754803873e-05,
+      "loss": 0.2493,
+      "step": 4400
+    },
+    {
+      "epoch": 0.5117319282817179,
+      "grad_norm": 0.31419941782951355,
+      "learning_rate": 1.3986130870138861e-05,
+      "loss": 0.249,
+      "step": 4500
+    },
+    {
+      "epoch": 0.5231037489102005,
+      "grad_norm": 0.34506115317344666,
+      "learning_rate": 1.3940651022023705e-05,
+      "loss": 0.2505,
+      "step": 4600
+    },
+    {
+      "epoch": 0.5344755695386831,
+      "grad_norm": 0.3678961396217346,
+      "learning_rate": 1.3894250710197268e-05,
+      "loss": 0.2478,
+      "step": 4700
+    },
+    {
+      "epoch": 0.5458473901671658,
+      "grad_norm": 0.3394433557987213,
+      "learning_rate": 1.3846936565946217e-05,
+      "loss": 0.2486,
+      "step": 4800
+    },
+    {
+      "epoch": 0.5572192107956484,
+      "grad_norm": 0.3915737569332123,
+      "learning_rate": 1.3798715351157302e-05,
+      "loss": 0.2478,
+      "step": 4900
+    },
+    {
+      "epoch": 0.568591031424131,
+      "grad_norm": 0.34704986214637756,
+      "learning_rate": 1.3749593957350986e-05,
+      "loss": 0.2485,
+      "step": 5000
+    },
+    {
+      "epoch": 0.5799628520526137,
+      "grad_norm": 0.3951033353805542,
+      "learning_rate": 1.369957940469655e-05,
+      "loss": 0.2483,
+      "step": 5100
+    },
+    {
+      "epoch": 0.5913346726810963,
+      "grad_norm": 0.3708275556564331,
+      "learning_rate": 1.3648678841008805e-05,
+      "loss": 0.2455,
+      "step": 5200
+    },
+    {
+      "epoch": 0.6027064933095788,
+      "grad_norm": 0.3068831264972687,
+      "learning_rate": 1.3596899540726558e-05,
+      "loss": 0.2477,
+      "step": 5300
+    },
+    {
+      "epoch": 0.6140783139380614,
+      "grad_norm": 0.3610328137874603,
+      "learning_rate": 1.3544248903872996e-05,
+      "loss": 0.2475,
+      "step": 5400
+    },
+    {
+      "epoch": 0.6254501345665441,
+      "grad_norm": 0.4052928388118744,
+      "learning_rate": 1.3490734454998117e-05,
+      "loss": 0.2457,
+      "step": 5500
+    },
+    {
+      "epoch": 0.6368219551950267,
+      "grad_norm": 0.36340612173080444,
+      "learning_rate": 1.3436363842103345e-05,
+      "loss": 0.2469,
+      "step": 5600
+    },
+    {
+      "epoch": 0.6481937758235093,
+      "grad_norm": 0.33248743414878845,
+      "learning_rate": 1.3381144835548534e-05,
+      "loss": 0.2466,
+      "step": 5700
+    },
+    {
+      "epoch": 0.659565596451992,
+      "grad_norm": 0.32214176654815674,
+      "learning_rate": 1.3325085326941464e-05,
+      "loss": 0.2457,
+      "step": 5800
+    },
+    {
+      "epoch": 0.6709374170804746,
+      "grad_norm": 0.45882582664489746,
+      "learning_rate": 1.3268193328010013e-05,
+      "loss": 0.2436,
+      "step": 5900
+    },
+    {
+      "epoch": 0.6823092377089572,
+      "grad_norm": 0.3200742304325104,
+      "learning_rate": 1.321047696945716e-05,
+      "loss": 0.2459,
+      "step": 6000
+    },
+    {
+      "epoch": 0.6936810583374399,
+      "grad_norm": 0.3391638696193695,
+      "learning_rate": 1.3151944499799003e-05,
+      "loss": 0.2461,
+      "step": 6100
+    },
+    {
+      "epoch": 0.7050528789659224,
+      "grad_norm": 0.34882161021232605,
+      "learning_rate": 1.3092604284185901e-05,
+      "loss": 0.2455,
+      "step": 6200
+    },
+    {
+      "epoch": 0.716424699594405,
+      "grad_norm": 0.31918397545814514,
+      "learning_rate": 1.3032464803206998e-05,
+      "loss": 0.2438,
+      "step": 6300
+    },
+    {
+      "epoch": 0.7277965202228877,
+      "grad_norm": 0.308662474155426,
+      "learning_rate": 1.2971534651678194e-05,
+      "loss": 0.2451,
+      "step": 6400
+    },
+    {
+      "epoch": 0.7391683408513703,
+      "grad_norm": 0.36811548471450806,
+      "learning_rate": 1.2909822537413848e-05,
+      "loss": 0.2448,
+      "step": 6500
+    },
+    {
+      "epoch": 0.7505401614798529,
+      "grad_norm": 0.3129611611366272,
+      "learning_rate": 1.2847337279982274e-05,
+      "loss": 0.2441,
+      "step": 6600
+    },
+    {
+      "epoch": 0.7619119821083356,
+      "grad_norm": 0.31695765256881714,
+      "learning_rate": 1.2784087809445326e-05,
+      "loss": 0.2434,
+      "step": 6700
+    },
+    {
+      "epoch": 0.7732838027368182,
+      "grad_norm": 0.3517361879348755,
+      "learning_rate": 1.2720083165082133e-05,
+      "loss": 0.2444,
+      "step": 6800
+    },
+    {
+      "epoch": 0.7846556233653008,
+      "grad_norm": 0.3190363943576813,
+      "learning_rate": 1.2655332494097267e-05,
+      "loss": 0.2452,
+      "step": 6900
+    },
+    {
+      "epoch": 0.7960274439937834,
+      "grad_norm": 0.3391929864883423,
+      "learning_rate": 1.258984505031348e-05,
+      "loss": 0.243,
+      "step": 7000
+    },
+    {
+      "epoch": 0.8073992646222661,
+      "grad_norm": 0.3427666127681732,
+      "learning_rate": 1.2523630192849175e-05,
+      "loss": 0.2436,
+      "step": 7100
+    },
+    {
+      "epoch": 0.8187710852507486,
+      "grad_norm": 0.2746070921421051,
+      "learning_rate": 1.2456697384780872e-05,
+      "loss": 0.2428,
+      "step": 7200
+    },
+    {
+      "epoch": 0.8301429058792312,
+      "grad_norm": 0.3383966088294983,
+      "learning_rate": 1.2389056191790781e-05,
+      "loss": 0.2431,
+      "step": 7300
+    },
+    {
+      "epoch": 0.8415147265077139,
+      "grad_norm": 0.3772919774055481,
+      "learning_rate": 1.2320716280799739e-05,
+      "loss": 0.2445,
+      "step": 7400
+    },
+    {
+      "epoch": 0.8528865471361965,
+      "grad_norm": 0.3445940911769867,
+      "learning_rate": 1.2251687418585649e-05,
+      "loss": 0.2425,
+      "step": 7500
+    },
+    {
+      "epoch": 0.8642583677646791,
+      "grad_norm": 0.3363341689109802,
+      "learning_rate": 1.2181979470387674e-05,
+      "loss": 0.2424,
+      "step": 7600
+    },
+    {
+      "epoch": 0.8756301883931618,
+      "grad_norm": 0.3137916922569275,
+      "learning_rate": 1.2111602398496347e-05,
+      "loss": 0.2407,
+      "step": 7700
+    },
+    {
+      "epoch": 0.8870020090216444,
+      "grad_norm": 0.34211379289627075,
+      "learning_rate": 1.2040566260829813e-05,
+      "loss": 0.2425,
+      "step": 7800
+    },
+    {
+      "epoch": 0.898373829650127,
+      "grad_norm": 0.38583695888519287,
+      "learning_rate": 1.1968881209496406e-05,
+      "loss": 0.2422,
+      "step": 7900
+    },
+    {
+      "epoch": 0.9097456502786097,
+      "grad_norm": 0.33044618368148804,
+      "learning_rate": 1.189655748934376e-05,
+      "loss": 0.242,
+      "step": 8000
+    },
+    {
+      "epoch": 0.9211174709070922,
+      "grad_norm": 0.34866610169410706,
+      "learning_rate": 1.1823605436494677e-05,
+      "loss": 0.2408,
+      "step": 8100
+    },
+    {
+      "epoch": 0.9324892915355748,
+      "grad_norm": 0.3553544878959656,
+      "learning_rate": 1.175003547686993e-05,
+      "loss": 0.2419,
+      "step": 8200
+    },
+    {
+      "epoch": 0.9438611121640574,
+      "grad_norm": 0.32875463366508484,
+      "learning_rate": 1.1675858124698262e-05,
+      "loss": 0.2413,
+      "step": 8300
+    },
+    {
+      "epoch": 0.9552329327925401,
+      "grad_norm": 0.2906413972377777,
+      "learning_rate": 1.1601083981013732e-05,
+      "loss": 0.243,
+      "step": 8400
+    },
+    {
+      "epoch": 0.9666047534210227,
+      "grad_norm": 0.3169972598552704,
+      "learning_rate": 1.1525723732140687e-05,
+      "loss": 0.24,
+      "step": 8500
+    },
+    {
+      "epoch": 0.9779765740495053,
+      "grad_norm": 0.29645681381225586,
+      "learning_rate": 1.1449788148166514e-05,
+      "loss": 0.2416,
+      "step": 8600
+    },
+    {
+      "epoch": 0.989348394677988,
+      "grad_norm": 0.31245288252830505,
+      "learning_rate": 1.1373288081402454e-05,
+      "loss": 0.2416,
+      "step": 8700
+    },
+    {
+      "epoch": 0.9999241878624768,
+      "eval_loss": 0.240670308470726,
+      "eval_runtime": 8249.7109,
+      "eval_samples_per_second": 15.075,
+      "eval_steps_per_second": 1.371,
+      "step": 8793
+    },
+    {
+      "epoch": 1.0007202153064705,
+      "grad_norm": 0.3536455035209656,
+      "learning_rate": 1.1296234464832622e-05,
+      "loss": 0.2409,
+      "step": 8800
+    },
+    {
+      "epoch": 1.0120920359349532,
+      "grad_norm": 0.3137208819389343,
+      "learning_rate": 1.1218638310551549e-05,
+      "loss": 0.2403,
+      "step": 8900
+    },
+    {
+      "epoch": 1.0234638565634357,
+      "grad_norm": 0.3500869870185852,
+      "learning_rate": 1.1140510708190381e-05,
+      "loss": 0.2403,
+      "step": 9000
+    },
+    {
+      "epoch": 1.0348356771919185,
+      "grad_norm": 0.36828961968421936,
+      "learning_rate": 1.1061862823331999e-05,
+      "loss": 0.2407,
+      "step": 9100
+    },
+    {
+      "epoch": 1.046207497820401,
+      "grad_norm": 0.31094878911972046,
+      "learning_rate": 1.098270589591531e-05,
+      "loss": 0.2382,
+      "step": 9200
+    },
+    {
+      "epoch": 1.0575793184488838,
+      "grad_norm": 0.34195277094841003,
+      "learning_rate": 1.0903051238628875e-05,
+      "loss": 0.2401,
+      "step": 9300
+    },
+    {
+      "epoch": 1.0689511390773663,
+      "grad_norm": 0.3708897531032562,
+      "learning_rate": 1.0822910235294182e-05,
+      "loss": 0.24,
+      "step": 9400
+    },
+    {
+      "epoch": 1.0803229597058488,
+      "grad_norm": 0.3012748062610626,
+      "learning_rate": 1.0742294339238709e-05,
+      "loss": 0.2399,
+      "step": 9500
+    },
+    {
+      "epoch": 1.0916947803343315,
+      "grad_norm": 0.30334606766700745,
+      "learning_rate": 1.0661215071659094e-05,
+      "loss": 0.2393,
+      "step": 9600
+    },
+    {
+      "epoch": 1.103066600962814,
+      "grad_norm": 0.30188488960266113,
+      "learning_rate": 1.0579684019974573e-05,
+      "loss": 0.238,
+      "step": 9700
+    },
+    {
+      "epoch": 1.1144384215912968,
+      "grad_norm": 0.3512589931488037,
+      "learning_rate": 1.0497712836170965e-05,
+      "loss": 0.2387,
+      "step": 9800
+    },
+    {
+      "epoch": 1.1258102422197793,
+      "grad_norm": 0.30085045099258423,
+      "learning_rate": 1.0415313235135456e-05,
+      "loss": 0.2382,
+      "step": 9900
+    },
+    {
+      "epoch": 1.137182062848262,
+      "grad_norm": 0.3223400413990021,
+      "learning_rate": 1.0332496992982332e-05,
+      "loss": 0.2395,
+      "step": 10000
+    },
+    {
+      "epoch": 1.1485538834767446,
+      "grad_norm": 0.3234786093235016,
+      "learning_rate": 1.0249275945370035e-05,
+      "loss": 0.2378,
+      "step": 10100
+    },
+    {
+      "epoch": 1.1599257041052273,
+      "grad_norm": 0.3373458981513977,
+      "learning_rate": 1.0165661985809653e-05,
+      "loss": 0.2375,
+      "step": 10200
+    },
+    {
+      "epoch": 1.1712975247337098,
+      "grad_norm": 0.31131288409233093,
+      "learning_rate": 1.0081667063965164e-05,
+      "loss": 0.2384,
+      "step": 10300
+    },
+    {
+      "epoch": 1.1826693453621924,
+      "grad_norm": 0.28763946890830994,
+      "learning_rate": 9.997303183945664e-06,
+      "loss": 0.2368,
+      "step": 10400
+    },
+    {
+      "epoch": 1.1940411659906751,
+      "grad_norm": 0.34528255462646484,
+      "learning_rate": 9.912582402589786e-06,
+      "loss": 0.2385,
+      "step": 10500
+    },
+    {
+      "epoch": 1.2054129866191576,
+      "grad_norm": 0.36961841583251953,
+      "learning_rate": 9.827516827742623e-06,
+      "loss": 0.2392,
+      "step": 10600
+    },
+    {
+      "epoch": 1.2167848072476404,
+      "grad_norm": 0.3676556348800659,
+      "learning_rate": 9.742118616525315e-06,
+      "loss": 0.2388,
+      "step": 10700
+    },
+    {
+      "epoch": 1.228156627876123,
+      "grad_norm": 0.31305503845214844,
+      "learning_rate": 9.656399973597634e-06,
+      "loss": 0.2386,
+      "step": 10800
+    },
+    {
+      "epoch": 1.2395284485046056,
+      "grad_norm": 0.3261184096336365,
+      "learning_rate": 9.570373149413758e-06,
+      "loss": 0.2388,
+      "step": 10900
+    },
+    {
+      "epoch": 1.2509002691330882,
+      "grad_norm": 0.2724168002605438,
+      "learning_rate": 9.484050438471485e-06,
+      "loss": 0.2382,
+      "step": 11000
+    },
+    {
+      "epoch": 1.262272089761571,
+      "grad_norm": 0.34541308879852295,
+      "learning_rate": 9.397444177555197e-06,
+      "loss": 0.2386,
+      "step": 11100
+    },
+    {
+      "epoch": 1.2736439103900534,
+      "grad_norm": 0.32319512963294983,
+      "learning_rate": 9.31056674397272e-06,
+      "loss": 0.2385,
+      "step": 11200
+    },
+    {
+      "epoch": 1.285015731018536,
+      "grad_norm": 0.3712255656719208,
+      "learning_rate": 9.223430553786452e-06,
+      "loss": 0.2373,
+      "step": 11300
+    },
+    {
+      "epoch": 1.2963875516470187,
+      "grad_norm": 0.32168999314308167,
+      "learning_rate": 9.136048060038903e-06,
+      "loss": 0.2369,
+      "step": 11400
+    },
+    {
+      "epoch": 1.3077593722755014,
+      "grad_norm": 0.37207159399986267,
+      "learning_rate": 9.048431750972995e-06,
+      "loss": 0.2371,
+      "step": 11500
+    },
+    {
+      "epoch": 1.319131192903984,
+      "grad_norm": 0.29025575518608093,
+      "learning_rate": 8.960594148247285e-06,
+      "loss": 0.238,
+      "step": 11600
+    },
+    {
+      "epoch": 1.3305030135324665,
+      "grad_norm": 0.30225130915641785,
+      "learning_rate": 8.872547805146454e-06,
+      "loss": 0.2376,
+      "step": 11700
+    },
+    {
+      "epoch": 1.3418748341609492,
+      "grad_norm": 0.3039323389530182,
+      "learning_rate": 8.784305304787246e-06,
+      "loss": 0.2373,
+      "step": 11800
+    },
+    {
+      "epoch": 1.3532466547894317,
+      "grad_norm": 0.3181616961956024,
+      "learning_rate": 8.695879258320167e-06,
+      "loss": 0.2371,
+      "step": 11900
+    },
+    {
+      "epoch": 1.3646184754179145,
+      "grad_norm": 0.336479514837265,
+      "learning_rate": 8.607282303127153e-06,
+      "loss": 0.2377,
+      "step": 12000
+    }
+  ],
+  "logging_steps": 100,
+  "max_steps": 26379,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 3,
+  "save_steps": 1000,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 6.81271370392535e+18,
+  "train_batch_size": 11,
+  "trial_name": null,
+  "trial_params": null
+}
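
Reviewer note: the logged `learning_rate` values in `log_history` look consistent with a cosine decay to zero from the 1.5e-05 logged at step 100, running over `max_steps` = 26379 after 100 warmup steps. Neither the scheduler type nor the warmup length is stated in this commit; both are inferred from the numbers, so the sketch below is only a sanity check, and `cosine_lr` is a hypothetical helper rather than the training code.

```python
import math

PEAK_LR = 1.5e-05     # value logged at step 100
MAX_STEPS = 26379     # "max_steps" in trainer_state.json
WARMUP_STEPS = 100    # assumption: inferred from the log, not stated in the commit


def cosine_lr(step: int) -> float:
    """Linear warmup followed by cosine decay to zero (assumed schedule)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (MAX_STEPS - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))


# (step, learning_rate) pairs copied from log_history.
logged = {
    200: 1.499946406987345e-05,
    8800: 1.1296234464832622e-05,
    12000: 8.607282303127153e-06,
}

for step, lr in logged.items():
    # Print the reconstructed value next to the logged one for comparison.
    print(step, cosine_lr(step), lr)
```

If the reconstructed and logged values agree closely, the assumed schedule is a plausible reading of the log; it does not prove which scheduler was configured.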

full_final-tesis/checkpoint-12000/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:58a8dc2e9cd30fabd0bf191d8b86269c1d642655415287312de73d5ceda9dace
+size 5176

full_final-tesis/checkpoint-13000/README.md ADDED

	@@ -0,0 +1,202 @@

+---
+base_model: mistralai/Mistral-7B-Instruct-v0.3
+library_name: peft
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.12.1.dev0

full_final-tesis/checkpoint-13000/adapter_config.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-Instruct-v0.3",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_dropout": 0.1,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "q_proj",
+    "v_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "use_dora": false,
+  "use_rslora": false
+}

full_final-tesis/checkpoint-13000/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:dbb2b0eae249f0971442ebdf9aa22f6c566f094472bf0b1491499ad26f8c83c2
+size 218138576
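
Reviewer note: the `adapter_config.json` values (`r` = 64, `lora_alpha` = 16, four attention projections) can be cross-checked against the size of `adapter_model.safetensors`. The sketch below counts LoRA parameters under an assumption that is not part of this commit: the projection shapes are taken from the public Mistral-7B-Instruct-v0.3 configuration (hidden size 4096, 32 layers, 8 key/value heads of dimension 128). `lora_param_count` is a hypothetical helper.

```python
# Values copied from adapter_config.json in this commit.
R = 64
LORA_ALPHA = 16
TARGET_MODULES = ["k_proj", "q_proj", "v_proj", "o_proj"]

# Assumption: shapes from the public base-model config, not from this commit.
NUM_LAYERS = 32
SHAPES = {  # module name -> (in_features, out_features)
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}


def lora_param_count() -> int:
    """Each adapted module adds A (r x in) and B (out x r) matrices."""
    per_layer = sum(R * (SHAPES[m][0] + SHAPES[m][1]) for m in TARGET_MODULES)
    return per_layer * NUM_LAYERS


scaling = LORA_ALPHA / R          # LoRA update scale when use_rslora is false
params = lora_param_count()
fp32_bytes = params * 4           # 4 bytes per float32 weight
print(scaling, params, fp32_bytes)
```

Under these assumptions the tensors alone take 218103808 bytes in float32, about 35 KB less than the 218138576 bytes recorded in the LFS pointer. That gap is plausibly the safetensors header, which would suggest the adapter weights are stored in float32.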

full_final-tesis/checkpoint-13000/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6c725e720b70a35e376521d41fda1bb7aeda6f37c6a0c2179062ba74223c3f6b
+size 109570132

full_final-tesis/checkpoint-13000/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:516e63b476efa96b54e07db56778edf2ba2cd1411b0649b49dcd144a9d4c3657
+size 14244

full_final-tesis/checkpoint-13000/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:539027f21f841d593b49040df2c63e157b28746a8018e724e4e9cb9868608754
+size 1064

full_final-tesis/checkpoint-13000/trainer_state.json ADDED

	@@ -0,0 +1,951 @@

+{
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 1.4783366817027406,
+  "eval_steps": 500,
+  "global_step": 13000,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.01137182062848262,
+      "grad_norm": 0.28454989194869995,
+      "learning_rate": 1.5e-05,
+      "loss": 0.8172,
+      "step": 100
+    },
+    {
+      "epoch": 0.02274364125696524,
+      "grad_norm": 0.3702525198459625,
+      "learning_rate": 1.499946406987345e-05,
+      "loss": 0.3711,
+      "step": 200
+    },
+    {
+      "epoch": 0.03411546188544786,
+      "grad_norm": 0.42058178782463074,
+      "learning_rate": 1.4997856356086094e-05,
+      "loss": 0.3401,
+      "step": 300
+    },
+    {
+      "epoch": 0.04548728251393048,
+      "grad_norm": 0.4536089599132538,
+      "learning_rate": 1.4995177088403865e-05,
+      "loss": 0.3276,
+      "step": 400
+    },
+    {
+      "epoch": 0.0568591031424131,
+      "grad_norm": 0.3354353904724121,
+      "learning_rate": 1.4991426649733503e-05,
+      "loss": 0.3191,
+      "step": 500
+    },
+    {
+      "epoch": 0.06823092377089572,
+      "grad_norm": 0.3974071741104126,
+      "learning_rate": 1.4986605576067824e-05,
+      "loss": 0.3138,
+      "step": 600
+    },
+    {
+      "epoch": 0.07960274439937834,
+      "grad_norm": 0.3995005786418915,
+      "learning_rate": 1.4980714556409132e-05,
+      "loss": 0.3027,
+      "step": 700
+    },
+    {
+      "epoch": 0.09097456502786096,
+      "grad_norm": 0.35224294662475586,
+      "learning_rate": 1.4973754432670731e-05,
+      "loss": 0.2784,
+      "step": 800
+    },
+    {
+      "epoch": 0.10234638565634357,
+      "grad_norm": 0.28719601035118103,
+      "learning_rate": 1.4965726199556621e-05,
+      "loss": 0.2737,
+      "step": 900
+    },
+    {
+      "epoch": 0.1137182062848262,
+      "grad_norm": 0.32742446660995483,
+      "learning_rate": 1.4956631004419335e-05,
+      "loss": 0.2722,
+      "step": 1000
+    },
+    {
+      "epoch": 0.1250900269133088,
+      "grad_norm": 0.44312119483947754,
+      "learning_rate": 1.4946470147095961e-05,
+      "loss": 0.2709,
+      "step": 1100
+    },
+    {
+      "epoch": 0.13646184754179144,
+      "grad_norm": 0.40551960468292236,
+      "learning_rate": 1.4935245079722374e-05,
+      "loss": 0.2683,
+      "step": 1200
+    },
+    {
+      "epoch": 0.14783366817027407,
+      "grad_norm": 0.38514164090156555,
+      "learning_rate": 1.4922957406525721e-05,
+      "loss": 0.2667,
+      "step": 1300
+    },
+    {
+      "epoch": 0.15920548879875668,
+      "grad_norm": 0.4830244481563568,
+      "learning_rate": 1.4909608883595135e-05,
+      "loss": 0.2644,
+      "step": 1400
+    },
+    {
+      "epoch": 0.1705773094272393,
+      "grad_norm": 0.5327072739601135,
+      "learning_rate": 1.489520141863077e-05,
+      "loss": 0.265,
+      "step": 1500
+    },
+    {
+      "epoch": 0.1819491300557219,
+      "grad_norm": 0.3729603588581085,
+      "learning_rate": 1.4879737070671164e-05,
+      "loss": 0.2625,
+      "step": 1600
+    },
+    {
+      "epoch": 0.19332095068420455,
+      "grad_norm": 0.3948141634464264,
+      "learning_rate": 1.4863218049798972e-05,
+      "loss": 0.2632,
+      "step": 1700
+    },
+    {
+      "epoch": 0.20469277131268715,
+      "grad_norm": 0.42538341879844666,
+      "learning_rate": 1.4845646716825118e-05,
+      "loss": 0.2628,
+      "step": 1800
+    },
+    {
+      "epoch": 0.21606459194116978,
+      "grad_norm": 0.39089226722717285,
+      "learning_rate": 1.4827025582951387e-05,
+      "loss": 0.2604,
+      "step": 1900
+    },
+    {
+      "epoch": 0.2274364125696524,
+      "grad_norm": 0.44530946016311646,
+      "learning_rate": 1.4807357309411546e-05,
+      "loss": 0.2595,
+      "step": 2000
+    },
+    {
+      "epoch": 0.23880823319813502,
+      "grad_norm": 0.5049150586128235,
+      "learning_rate": 1.4786644707091018e-05,
+      "loss": 0.2602,
+      "step": 2100
+    },
+    {
+      "epoch": 0.2501800538266176,
+      "grad_norm": 0.4439440369606018,
+      "learning_rate": 1.4764890736125158e-05,
+      "loss": 0.2591,
+      "step": 2200
+    },
+    {
+      "epoch": 0.26155187445510025,
+      "grad_norm": 0.3611554801464081,
+      "learning_rate": 1.4742098505476209e-05,
+      "loss": 0.2577,
+      "step": 2300
+    },
+    {
+      "epoch": 0.2729236950835829,
+      "grad_norm": 0.5148597359657288,
+      "learning_rate": 1.4718271272488986e-05,
+      "loss": 0.2572,
+      "step": 2400
+    },
+    {
+      "epoch": 0.2842955157120655,
+      "grad_norm": 0.4029059410095215,
+      "learning_rate": 1.4693412442425354e-05,
+      "loss": 0.2572,
+      "step": 2500
+    },
+    {
+      "epoch": 0.29566733634054815,
+      "grad_norm": 0.3884616196155548,
+      "learning_rate": 1.4667525567977561e-05,
+      "loss": 0.2555,
+      "step": 2600
+    },
+    {
+      "epoch": 0.3070391569690307,
+      "grad_norm": 0.32239723205566406,
+      "learning_rate": 1.4640614348760517e-05,
+      "loss": 0.2551,
+      "step": 2700
+    },
+    {
+      "epoch": 0.31841097759751336,
+      "grad_norm": 0.36961373686790466,
+      "learning_rate": 1.4612682630783053e-05,
+      "loss": 0.256,
+      "step": 2800
+    },
+    {
+      "epoch": 0.329782798225996,
+      "grad_norm": 0.44517841935157776,
+      "learning_rate": 1.4583734405898277e-05,
+      "loss": 0.2548,
+      "step": 2900
+    },
+    {
+      "epoch": 0.3411546188544786,
+      "grad_norm": 0.36591875553131104,
+      "learning_rate": 1.4553773811233073e-05,
+      "loss": 0.2544,
+      "step": 3000
+    },
+    {
+      "epoch": 0.3525264394829612,
+      "grad_norm": 0.39926204085350037,
+      "learning_rate": 1.4522805128596852e-05,
+      "loss": 0.2551,
+      "step": 3100
+    },
+    {
+      "epoch": 0.3638982601114438,
+      "grad_norm": 0.4806652367115021,
+      "learning_rate": 1.4490832783869617e-05,
+      "loss": 0.2539,
+      "step": 3200
+    },
+    {
+      "epoch": 0.37527008073992646,
+      "grad_norm": 0.37937217950820923,
+      "learning_rate": 1.4457861346369439e-05,
+      "loss": 0.2549,
+      "step": 3300
+    },
+    {
+      "epoch": 0.3866419013684091,
+      "grad_norm": 0.32821956276893616,
+      "learning_rate": 1.4423895528199423e-05,
+      "loss": 0.2531,
+      "step": 3400
+    },
+    {
+      "epoch": 0.3980137219968917,
+      "grad_norm": 0.4085944890975952,
+      "learning_rate": 1.4388940183574303e-05,
+      "loss": 0.2522,
+      "step": 3500
+    },
+    {
+      "epoch": 0.4093855426253743,
+      "grad_norm": 0.42145103216171265,
+      "learning_rate": 1.4353000308126683e-05,
+      "loss": 0.2525,
+      "step": 3600
+    },
+    {
+      "epoch": 0.42075736325385693,
+      "grad_norm": 0.39684563875198364,
+      "learning_rate": 1.4316081038193093e-05,
+      "loss": 0.2512,
+      "step": 3700
+    },
+    {
+      "epoch": 0.43212918388233956,
+      "grad_norm": 0.40946340560913086,
+      "learning_rate": 1.4278187650079938e-05,
+      "loss": 0.2513,
+      "step": 3800
+    },
+    {
+      "epoch": 0.4435010045108222,
+      "grad_norm": 0.36055612564086914,
+      "learning_rate": 1.4239325559309426e-05,
+      "loss": 0.2508,
+      "step": 3900
+    },
+    {
+      "epoch": 0.4548728251393048,
+      "grad_norm": 0.33412662148475647,
+      "learning_rate": 1.4199500319845618e-05,
+      "loss": 0.2521,
+      "step": 4000
+    },
+    {
+      "epoch": 0.4662446457677874,
+      "grad_norm": 0.435330867767334,
+      "learning_rate": 1.415871762330068e-05,
+      "loss": 0.2502,
+      "step": 4100
+    },
+    {
+      "epoch": 0.47761646639627003,
+      "grad_norm": 0.349936306476593,
+      "learning_rate": 1.4116983298121471e-05,
+      "loss": 0.25,
+      "step": 4200
+    },
+    {
+      "epoch": 0.48898828702475267,
+      "grad_norm": 0.4145605266094208,
+      "learning_rate": 1.407430330875657e-05,
+      "loss": 0.2486,
+      "step": 4300
+    },
+    {
+      "epoch": 0.5003601076532352,
+      "grad_norm": 0.37989580631256104,
+      "learning_rate": 1.4030683754803873e-05,
+      "loss": 0.2493,
+      "step": 4400
+    },
+    {
+      "epoch": 0.5117319282817179,
+      "grad_norm": 0.31419941782951355,
+      "learning_rate": 1.3986130870138861e-05,
+      "loss": 0.249,
+      "step": 4500
+    },
+    {
+      "epoch": 0.5231037489102005,
+      "grad_norm": 0.34506115317344666,
+      "learning_rate": 1.3940651022023705e-05,
+      "loss": 0.2505,
+      "step": 4600
+    },
+    {
+      "epoch": 0.5344755695386831,
+      "grad_norm": 0.3678961396217346,
+      "learning_rate": 1.3894250710197268e-05,
+      "loss": 0.2478,
+      "step": 4700
+    },
+    {
+      "epoch": 0.5458473901671658,
+      "grad_norm": 0.3394433557987213,
+      "learning_rate": 1.3846936565946217e-05,
+      "loss": 0.2486,
+      "step": 4800
+    },
+    {
+      "epoch": 0.5572192107956484,
+      "grad_norm": 0.3915737569332123,
+      "learning_rate": 1.3798715351157302e-05,
+      "loss": 0.2478,
+      "step": 4900
+    },
+    {
+      "epoch": 0.568591031424131,
+      "grad_norm": 0.34704986214637756,
+      "learning_rate": 1.3749593957350986e-05,
+      "loss": 0.2485,
+      "step": 5000
+    },
+    {
+      "epoch": 0.5799628520526137,
+      "grad_norm": 0.3951033353805542,
+      "learning_rate": 1.369957940469655e-05,
+      "loss": 0.2483,
+      "step": 5100
+    },
+    {
+      "epoch": 0.5913346726810963,
+      "grad_norm": 0.3708275556564331,
+      "learning_rate": 1.3648678841008805e-05,
+      "loss": 0.2455,
+      "step": 5200
+    },
+    {
+      "epoch": 0.6027064933095788,
+      "grad_norm": 0.3068831264972687,
+      "learning_rate": 1.3596899540726558e-05,
+      "loss": 0.2477,
+      "step": 5300
+    },
+    {
+      "epoch": 0.6140783139380614,
+      "grad_norm": 0.3610328137874603,
+      "learning_rate": 1.3544248903872996e-05,
+      "loss": 0.2475,
+      "step": 5400
+    },
+    {
+      "epoch": 0.6254501345665441,
+      "grad_norm": 0.4052928388118744,
+      "learning_rate": 1.3490734454998117e-05,
+      "loss": 0.2457,
+      "step": 5500
+    },
+    {
+      "epoch": 0.6368219551950267,
+      "grad_norm": 0.36340612173080444,
+      "learning_rate": 1.3436363842103345e-05,
+      "loss": 0.2469,
+      "step": 5600
+    },
+    {
+      "epoch": 0.6481937758235093,
+      "grad_norm": 0.33248743414878845,
+      "learning_rate": 1.3381144835548534e-05,
+      "loss": 0.2466,
+      "step": 5700
+    },
+    {
+      "epoch": 0.659565596451992,
+      "grad_norm": 0.32214176654815674,
+      "learning_rate": 1.3325085326941464e-05,
+      "loss": 0.2457,
+      "step": 5800
+    },
+    {
+      "epoch": 0.6709374170804746,
+      "grad_norm": 0.45882582664489746,
+      "learning_rate": 1.3268193328010013e-05,
+      "loss": 0.2436,
+      "step": 5900
+    },
+    {
+      "epoch": 0.6823092377089572,
+      "grad_norm": 0.3200742304325104,
+      "learning_rate": 1.321047696945716e-05,
+      "loss": 0.2459,
+      "step": 6000
+    },
+    {
+      "epoch": 0.6936810583374399,
+      "grad_norm": 0.3391638696193695,
+      "learning_rate": 1.3151944499799003e-05,
+      "loss": 0.2461,
+      "step": 6100
+    },
+    {
+      "epoch": 0.7050528789659224,
+      "grad_norm": 0.34882161021232605,
+      "learning_rate": 1.3092604284185901e-05,
+      "loss": 0.2455,
+      "step": 6200
+    },
+    {
+      "epoch": 0.716424699594405,
+      "grad_norm": 0.31918397545814514,
+      "learning_rate": 1.3032464803206998e-05,
+      "loss": 0.2438,
+      "step": 6300
+    },
+    {
+      "epoch": 0.7277965202228877,
+      "grad_norm": 0.308662474155426,
+      "learning_rate": 1.2971534651678194e-05,
+      "loss": 0.2451,
+      "step": 6400
+    },
+    {
+      "epoch": 0.7391683408513703,
+      "grad_norm": 0.36811548471450806,
+      "learning_rate": 1.2909822537413848e-05,
+      "loss": 0.2448,
+      "step": 6500
+    },
+    {
+      "epoch": 0.7505401614798529,
+      "grad_norm": 0.3129611611366272,
+      "learning_rate": 1.2847337279982274e-05,
+      "loss": 0.2441,
+      "step": 6600
+    },
+    {
+      "epoch": 0.7619119821083356,
+      "grad_norm": 0.31695765256881714,
+      "learning_rate": 1.2784087809445326e-05,
+      "loss": 0.2434,
+      "step": 6700
+    },
+    {
+      "epoch": 0.7732838027368182,
+      "grad_norm": 0.3517361879348755,
+      "learning_rate": 1.2720083165082133e-05,
+      "loss": 0.2444,
+      "step": 6800
+    },
+    {
+      "epoch": 0.7846556233653008,
+      "grad_norm": 0.3190363943576813,
+      "learning_rate": 1.2655332494097267e-05,
+      "loss": 0.2452,
+      "step": 6900
+    },
+    {
+      "epoch": 0.7960274439937834,
+      "grad_norm": 0.3391929864883423,
+      "learning_rate": 1.258984505031348e-05,
+      "loss": 0.243,
+      "step": 7000
+    },
+    {
+      "epoch": 0.8073992646222661,
+      "grad_norm": 0.3427666127681732,
+      "learning_rate": 1.2523630192849175e-05,
+      "loss": 0.2436,
+      "step": 7100
+    },
+    {
+      "epoch": 0.8187710852507486,
+      "grad_norm": 0.2746070921421051,
+      "learning_rate": 1.2456697384780872e-05,
+      "loss": 0.2428,
+      "step": 7200
+    },
+    {
+      "epoch": 0.8301429058792312,
+      "grad_norm": 0.3383966088294983,
+      "learning_rate": 1.2389056191790781e-05,
+      "loss": 0.2431,
+      "step": 7300
+    },
+    {
+      "epoch": 0.8415147265077139,
+      "grad_norm": 0.3772919774055481,
+      "learning_rate": 1.2320716280799739e-05,
+      "loss": 0.2445,
+      "step": 7400
+    },
+    {
+      "epoch": 0.8528865471361965,
+      "grad_norm": 0.3445940911769867,
+      "learning_rate": 1.2251687418585649e-05,
+      "loss": 0.2425,
+      "step": 7500
+    },
+    {
+      "epoch": 0.8642583677646791,
+      "grad_norm": 0.3363341689109802,
+      "learning_rate": 1.2181979470387674e-05,
+      "loss": 0.2424,
+      "step": 7600
+    },
+    {
+      "epoch": 0.8756301883931618,
+      "grad_norm": 0.3137916922569275,
+      "learning_rate": 1.2111602398496347e-05,
+      "loss": 0.2407,
+      "step": 7700
+    },
+    {
+      "epoch": 0.8870020090216444,
+      "grad_norm": 0.34211379289627075,
+      "learning_rate": 1.2040566260829813e-05,
+      "loss": 0.2425,
+      "step": 7800
+    },
+    {
+      "epoch": 0.898373829650127,
+      "grad_norm": 0.38583695888519287,
+      "learning_rate": 1.1968881209496406e-05,
+      "loss": 0.2422,
+      "step": 7900
+    },
+    {
+      "epoch": 0.9097456502786097,
+      "grad_norm": 0.33044618368148804,
+      "learning_rate": 1.189655748934376e-05,
+      "loss": 0.242,
+      "step": 8000
+    },
+    {
+      "epoch": 0.9211174709070922,
+      "grad_norm": 0.34866610169410706,
+      "learning_rate": 1.1823605436494677e-05,
+      "loss": 0.2408,
+      "step": 8100
+    },
+    {
+      "epoch": 0.9324892915355748,
+      "grad_norm": 0.3553544878959656,
+      "learning_rate": 1.175003547686993e-05,
+      "loss": 0.2419,
+      "step": 8200
+    },
+    {
+      "epoch": 0.9438611121640574,
+      "grad_norm": 0.32875463366508484,
+      "learning_rate": 1.1675858124698262e-05,
+      "loss": 0.2413,
+      "step": 8300
+    },
+    {
+      "epoch": 0.9552329327925401,
+      "grad_norm": 0.2906413972377777,
+      "learning_rate": 1.1601083981013732e-05,
+      "loss": 0.243,
+      "step": 8400
+    },
+    {
+      "epoch": 0.9666047534210227,
+      "grad_norm": 0.3169972598552704,
+      "learning_rate": 1.1525723732140687e-05,
+      "loss": 0.24,
+      "step": 8500
+    },
+    {
+      "epoch": 0.9779765740495053,
+      "grad_norm": 0.29645681381225586,
+      "learning_rate": 1.1449788148166514e-05,
+      "loss": 0.2416,
+      "step": 8600
+    },
+    {
+      "epoch": 0.989348394677988,
+      "grad_norm": 0.31245288252830505,
+      "learning_rate": 1.1373288081402454e-05,
+      "loss": 0.2416,
+      "step": 8700
+    },
+    {
+      "epoch": 0.9999241878624768,
+      "eval_loss": 0.240670308470726,
+      "eval_runtime": 8249.7109,
+      "eval_samples_per_second": 15.075,
+      "eval_steps_per_second": 1.371,
+      "step": 8793
+    },
+    {
+      "epoch": 1.0007202153064705,
+      "grad_norm": 0.3536455035209656,
+      "learning_rate": 1.1296234464832622e-05,
+      "loss": 0.2409,
+      "step": 8800
+    },
+    {
+      "epoch": 1.0120920359349532,
+      "grad_norm": 0.3137208819389343,
+      "learning_rate": 1.1218638310551549e-05,
+      "loss": 0.2403,
+      "step": 8900
+    },
+    {
+      "epoch": 1.0234638565634357,
+      "grad_norm": 0.3500869870185852,
+      "learning_rate": 1.1140510708190381e-05,
+      "loss": 0.2403,
+      "step": 9000
+    },
+    {
+      "epoch": 1.0348356771919185,
+      "grad_norm": 0.36828961968421936,
+      "learning_rate": 1.1061862823331999e-05,
+      "loss": 0.2407,
+      "step": 9100
+    },
+    {
+      "epoch": 1.046207497820401,
+      "grad_norm": 0.31094878911972046,
+      "learning_rate": 1.098270589591531e-05,
+      "loss": 0.2382,
+      "step": 9200
+    },
+    {
+      "epoch": 1.0575793184488838,
+      "grad_norm": 0.34195277094841003,
+      "learning_rate": 1.0903051238628875e-05,
+      "loss": 0.2401,
+      "step": 9300
+    },
+    {
+      "epoch": 1.0689511390773663,
+      "grad_norm": 0.3708897531032562,
+      "learning_rate": 1.0822910235294182e-05,
+      "loss": 0.24,
+      "step": 9400
+    },
+    {
+      "epoch": 1.0803229597058488,
+      "grad_norm": 0.3012748062610626,
+      "learning_rate": 1.0742294339238709e-05,
+      "loss": 0.2399,
+      "step": 9500
+    },
+    {
+      "epoch": 1.0916947803343315,
+      "grad_norm": 0.30334606766700745,
+      "learning_rate": 1.0661215071659094e-05,
+      "loss": 0.2393,
+      "step": 9600
+    },
+    {
+      "epoch": 1.103066600962814,
+      "grad_norm": 0.30188488960266113,
+      "learning_rate": 1.0579684019974573e-05,
+      "loss": 0.238,
+      "step": 9700
+    },
+    {
+      "epoch": 1.1144384215912968,
+      "grad_norm": 0.3512589931488037,
+      "learning_rate": 1.0497712836170965e-05,
+      "loss": 0.2387,
+      "step": 9800
+    },
+    {
+      "epoch": 1.1258102422197793,
+      "grad_norm": 0.30085045099258423,
+      "learning_rate": 1.0415313235135456e-05,
+      "loss": 0.2382,
+      "step": 9900
+    },
+    {
+      "epoch": 1.137182062848262,
+      "grad_norm": 0.3223400413990021,
+      "learning_rate": 1.0332496992982332e-05,
+      "loss": 0.2395,
+      "step": 10000
+    },
+    {
+      "epoch": 1.1485538834767446,
+      "grad_norm": 0.3234786093235016,
+      "learning_rate": 1.0249275945370035e-05,
+      "loss": 0.2378,
+      "step": 10100
+    },
+    {
+      "epoch": 1.1599257041052273,
+      "grad_norm": 0.3373458981513977,
+      "learning_rate": 1.0165661985809653e-05,
+      "loss": 0.2375,
+      "step": 10200
+    },
+    {
+      "epoch": 1.1712975247337098,
+      "grad_norm": 0.31131288409233093,
+      "learning_rate": 1.0081667063965164e-05,
+      "loss": 0.2384,
+      "step": 10300
+    },
+    {
+      "epoch": 1.1826693453621924,
+      "grad_norm": 0.28763946890830994,
+      "learning_rate": 9.997303183945664e-06,
+      "loss": 0.2368,
+      "step": 10400
+    },
+    {
+      "epoch": 1.1940411659906751,
+      "grad_norm": 0.34528255462646484,
+      "learning_rate": 9.912582402589786e-06,
+      "loss": 0.2385,
+      "step": 10500
+    },
+    {
+      "epoch": 1.2054129866191576,
+      "grad_norm": 0.36961841583251953,
+      "learning_rate": 9.827516827742623e-06,
+      "loss": 0.2392,
+      "step": 10600
+    },
+    {
+      "epoch": 1.2167848072476404,
+      "grad_norm": 0.3676556348800659,
+      "learning_rate": 9.742118616525315e-06,
+      "loss": 0.2388,
+      "step": 10700
+    },
+    {
+      "epoch": 1.228156627876123,
+      "grad_norm": 0.31305503845214844,
+      "learning_rate": 9.656399973597634e-06,
+      "loss": 0.2386,
+      "step": 10800
+    },
+    {
+      "epoch": 1.2395284485046056,
+      "grad_norm": 0.3261184096336365,
+      "learning_rate": 9.570373149413758e-06,
+      "loss": 0.2388,
+      "step": 10900
+    },
+    {
+      "epoch": 1.2509002691330882,
+      "grad_norm": 0.2724168002605438,
+      "learning_rate": 9.484050438471485e-06,
+      "loss": 0.2382,
+      "step": 11000
+    },
+    {
+      "epoch": 1.262272089761571,
+      "grad_norm": 0.34541308879852295,
+      "learning_rate": 9.397444177555197e-06,
+      "loss": 0.2386,
+      "step": 11100
+    },
+    {
+      "epoch": 1.2736439103900534,
+      "grad_norm": 0.32319512963294983,
+      "learning_rate": 9.31056674397272e-06,
+      "loss": 0.2385,
+      "step": 11200
+    },
+    {
+      "epoch": 1.285015731018536,
+      "grad_norm": 0.3712255656719208,
+      "learning_rate": 9.223430553786452e-06,
+      "loss": 0.2373,
+      "step": 11300
+    },
+    {
+      "epoch": 1.2963875516470187,
+      "grad_norm": 0.32168999314308167,
+      "learning_rate": 9.136048060038903e-06,
+      "loss": 0.2369,
+      "step": 11400
+    },
+    {
+      "epoch": 1.3077593722755014,
+      "grad_norm": 0.37207159399986267,
+      "learning_rate": 9.048431750972995e-06,
+      "loss": 0.2371,
+      "step": 11500
+    },
+    {
+      "epoch": 1.319131192903984,
+      "grad_norm": 0.29025575518608093,
+      "learning_rate": 8.960594148247285e-06,
+      "loss": 0.238,
+      "step": 11600
+    },
+    {
+      "epoch": 1.3305030135324665,
+      "grad_norm": 0.30225130915641785,
+      "learning_rate": 8.872547805146454e-06,
+      "loss": 0.2376,
+      "step": 11700
+    },
+    {
+      "epoch": 1.3418748341609492,
+      "grad_norm": 0.3039323389530182,
+      "learning_rate": 8.784305304787246e-06,
+      "loss": 0.2373,
+      "step": 11800
+    },
+    {
+      "epoch": 1.3532466547894317,
+      "grad_norm": 0.3181616961956024,
+      "learning_rate": 8.695879258320167e-06,
+      "loss": 0.2371,
+      "step": 11900
+    },
+    {
+      "epoch": 1.3646184754179145,
+      "grad_norm": 0.336479514837265,
+      "learning_rate": 8.607282303127153e-06,
+      "loss": 0.2377,
+      "step": 12000
+    },
+    {
+      "epoch": 1.375990296046397,
+      "grad_norm": 0.3014337420463562,
+      "learning_rate": 8.518527101015515e-06,
+      "loss": 0.2368,
+      "step": 12100
+    },
+    {
+      "epoch": 1.3873621166748795,
+      "grad_norm": 0.3107702136039734,
+      "learning_rate": 8.429626336408372e-06,
+      "loss": 0.2372,
+      "step": 12200
+    },
+    {
+      "epoch": 1.3987339373033623,
+      "grad_norm": 0.3226536214351654,
+      "learning_rate": 8.340592714531864e-06,
+      "loss": 0.2381,
+      "step": 12300
+    },
+    {
+      "epoch": 1.410105757931845,
+      "grad_norm": 0.29249951243400574,
+      "learning_rate": 8.252331049984348e-06,
+      "loss": 0.2373,
+      "step": 12400
+    },
+    {
+      "epoch": 1.4214775785603275,
+      "grad_norm": 0.2941092252731323,
+      "learning_rate": 8.163070914173074e-06,
+      "loss": 0.2349,
+      "step": 12500
+    },
+    {
+      "epoch": 1.43284939918881,
+      "grad_norm": 0.2947429120540619,
+      "learning_rate": 8.073716015780748e-06,
+      "loss": 0.2358,
+      "step": 12600
+    },
+    {
+      "epoch": 1.4442212198172928,
+      "grad_norm": 0.31216520071029663,
+      "learning_rate": 7.984279124935906e-06,
+      "loss": 0.2361,
+      "step": 12700
+    },
+    {
+      "epoch": 1.4555930404457753,
+      "grad_norm": 0.31741130352020264,
+      "learning_rate": 7.894773023485009e-06,
+      "loss": 0.2366,
+      "step": 12800
+    },
+    {
+      "epoch": 1.466964861074258,
+      "grad_norm": 0.3087882995605469,
+      "learning_rate": 7.805210503165726e-06,
+      "loss": 0.2368,
+      "step": 12900
+    },
+    {
+      "epoch": 1.4783366817027406,
+      "grad_norm": 0.3126624524593353,
+      "learning_rate": 7.71560436377882e-06,
+      "loss": 0.2365,
+      "step": 13000
+    }
+  ],
+  "logging_steps": 100,
+  "max_steps": 26379,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 3,
+  "save_steps": 1000,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 7.38044558058455e+18,
+  "train_batch_size": 11,
+  "trial_name": null,
+  "trial_params": null
+}
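
The logged entries above alternate between training records (with `loss`) and an occasional evaluation record (with `eval_loss`). A minimal sketch of how such a state file could be summarized follows. It assumes the list of entries lives under the `log_history` key, as in the standard Transformers `trainer_state.json` layout. The `summarize_log_history` helper and the shortened `sample_state` excerpt are illustrative and not part of this commit.

```python
import json

def summarize_log_history(state):
    """Summarize a trainer_state-style dict (hypothetical helper)."""
    history = state["log_history"]
    # Training records carry "loss"; evaluation records carry "eval_loss".
    train = [entry for entry in history if "loss" in entry]
    evals = [entry for entry in history if "eval_loss" in entry]
    last = train[-1]
    return {
        "last_step": last["step"],
        "last_train_loss": last["loss"],
        "min_train_loss": min(entry["loss"] for entry in train),
        "eval_losses": [(entry["step"], entry["eval_loss"]) for entry in evals],
        # Fraction of the scheduled optimizer steps completed so far.
        "progress": last["step"] / state["max_steps"],
    }

# Shortened excerpt of the values shown in the diff above.
sample_state = {
    "log_history": [
        {"epoch": 0.989348394677988, "loss": 0.2416, "step": 8700},
        {"epoch": 0.9999241878624768, "eval_loss": 0.240670308470726, "step": 8793},
        {"epoch": 1.0007202153064705, "loss": 0.2409, "step": 8800},
        {"epoch": 1.4783366817027406, "loss": 0.2365, "step": 13000},
    ],
    "max_steps": 26379,
}

summary = summarize_log_history(sample_state)
print(json.dumps(summary, indent=2))
```

To run it on a real checkpoint, load the file with `json.load(open(path))` and pass the resulting dict to the helper in place of `sample_state`.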

full_final-tesis/checkpoint-13000/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:58a8dc2e9cd30fabd0bf191d8b86269c1d642655415287312de73d5ceda9dace
+size 5176

full_final-tesis/checkpoint-14000/README.md ADDED

	@@ -0,0 +1,202 @@

+---
+base_model: mistralai/Mistral-7B-Instruct-v0.3
+library_name: peft
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.12.1.dev0

full_final-tesis/checkpoint-14000/adapter_config.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "alpha_pattern": {},
+  "auto_mapping": null,
+  "base_model_name_or_path": "mistralai/Mistral-7B-Instruct-v0.3",
+  "bias": "none",
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_dropout": 0.1,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "k_proj",
+    "q_proj",
+    "v_proj",
+    "o_proj"
+  ],
+  "task_type": "CAUSAL_LM",
+  "use_dora": false,
+  "use_rslora": false
+}

full_final-tesis/checkpoint-14000/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3a1e359b20da76946ce4376edc0a208bc1fb1b243f6eb848933ee4ff5d6d16ca
+size 218138576
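
As a rough sanity check, the adapter size in the pointer file can be compared with the parameter count implied by `adapter_config.json` (`r` = 64 on `q_proj`, `k_proj`, `v_proj`, `o_proj`). The sketch below is an estimate under stated assumptions, not something recorded in this commit. The projection shapes and the layer count are taken from the published Mistral-7B architecture (hidden size 4096, 1024-dimensional key/value projections, 32 decoder layers), and the weights are assumed to be stored as 4-byte floats.

```python
# LoRA settings copied from adapter_config.json.
R = 64
LORA_ALPHA = 16
TARGET_MODULES = ["k_proj", "q_proj", "v_proj", "o_proj"]

# ASSUMPTION: Mistral-7B projection shapes as (in_features, out_features).
# These values are not stated anywhere in this commit.
ASSUMED_SHAPES = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}
ASSUMED_NUM_LAYERS = 32  # ASSUMPTION: number of decoder layers

def lora_param_count(r, shapes, modules, num_layers):
    """Count LoRA weights: each adapted layer adds A (r x in) and B (out x r)."""
    per_layer = sum(r * (shapes[m][0] + shapes[m][1]) for m in modules)
    return per_layer * num_layers

params = lora_param_count(R, ASSUMED_SHAPES, TARGET_MODULES, ASSUMED_NUM_LAYERS)
scaling = LORA_ALPHA / R      # standard LoRA scaling, since use_rslora is false
fp32_bytes = params * 4       # ASSUMPTION: float32 storage

REPORTED_SIZE = 218138576     # from the Git LFS pointer above
print(f"LoRA parameters: {params:,}")
print(f"scaling factor:  {scaling}")
print(f"fp32 estimate:   {fp32_bytes:,} bytes")
print(f"unaccounted:     {REPORTED_SIZE - fp32_bytes:,} bytes")
```

Under these assumptions the estimate falls within a few tens of kilobytes of the reported file size, which is plausibly the safetensors header and metadata.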