kinzakhan1/SRD_V7

@@ -1,21 +1,40 @@
 ---
-base_model: unsloth/meta-llama-3.1-8b-instruct-bnb-4bit
 tags:
-- text-generation-inference
-- transformers
 - unsloth
-- llama
-license: apache-2.0
-language:
-- en
 ---
-# Uploaded finetuned  model
-- **Developed by:** kinzakhan1
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/meta-llama-3.1-8b-instruct-bnb-4bit
-This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
+license: llama3.1
 tags:
+- srd
+- standard-reasoning
+- cot
+- fine-tuned
 - unsloth
+- llama-3.1
+base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
 ---
+# SRD_V7 - Standard Reasoning (SRD) Model (V7)
+## Dataset
+- **Source**: CoT_reasoning_unsloth.jsonl
+- **Examples**: 9,340
+- **Format**: messages[] chat format
+## Training Configuration
+| Parameter | Value |
+|---|---|
+| Learning Rate | 0.00015 |
+| LoRA Rank | 32 |
+| LoRA Alpha | 64 |
+| LoRA Dropout | 0.0 |
+| Target Modules | All (MLP + Attention) |
+| Epochs | 2 |
+| Batch Size (effective) | 16 |
+| Warmup | 3% |
+| RSLoRA | Disabled |
+## Training Results
+- **Training Time**: 1.38 hours
+- **Final Loss**: 1.2049
+## Part of Experiment
+- kinzakhan1/CRD_V7
+- kinzakhan1/SRD_V7 (this model)
+- kinzakhan1/MIXED_V7
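
The Training Configuration table in the diff above implies a few derived quantities that the card does not state. The sketch below is not part of the model card. It computes the LoRA scaling factor and an approximate step schedule from the listed values. It assumes that all 9,340 examples were used for training, that a partial final batch counts as a step, and that the warmup step count is rounded up. The function names `lora_scaling` and `schedule` are hypothetical helpers, and the actual trainer may count steps differently.

```python
import math

# Values copied from the Dataset and Training Configuration sections.
NUM_EXAMPLES = 9_340
EFFECTIVE_BATCH_SIZE = 16
EPOCHS = 2
WARMUP_FRACTION = 0.03  # "Warmup | 3%"
LORA_RANK = 32
LORA_ALPHA = 64
USE_RSLORA = False      # "RSLoRA | Disabled"


def lora_scaling(alpha: float, rank: int, use_rslora: bool) -> float:
    """Factor applied to the low-rank update B @ A (hypothetical helper)."""
    if use_rslora:
        # Rank-stabilized LoRA divides by sqrt(rank) instead of rank.
        return alpha / math.sqrt(rank)
    return alpha / rank


def schedule(num_examples: int, batch_size: int, epochs: int,
             warmup_fraction: float) -> tuple[int, int, int]:
    """Return (steps_per_epoch, total_steps, warmup_steps) (hypothetical helper)."""
    # Assumption: the last partial batch of each epoch still counts as a step.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    # Assumption: the warmup step count is rounded up to a whole step.
    warmup_steps = math.ceil(total_steps * warmup_fraction)
    return steps_per_epoch, total_steps, warmup_steps


if __name__ == "__main__":
    print("LoRA scaling:", lora_scaling(LORA_ALPHA, LORA_RANK, USE_RSLORA))
    print("Schedule:", schedule(NUM_EXAMPLES, EFFECTIVE_BATCH_SIZE,
                                EPOCHS, WARMUP_FRACTION))
```

Under these assumptions the scaling factor is 64 / 32 = 2.0, and the run covers about 584 steps per epoch, or 1,168 steps in total, of which roughly the first 36 are warmup. Enabling RSLoRA with the same rank and alpha would raise the scaling factor to about 11.3, so the two settings are not interchangeable without retuning alpha.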