kinzakhan1 committed on
Commit bd50514 · verified · 1 Parent(s): bb6e33e

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +32 -13
README.md CHANGED
@@ -1,21 +1,40 @@
  ---
- base_model: unsloth/meta-llama-3.1-8b-instruct-bnb-4bit
  tags:
- - text-generation-inference
- - transformers
  - unsloth
- - llama
- license: apache-2.0
- language:
- - en
  ---

- # Uploaded finetuned model

- - **Developed by:** kinzakhan1
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/meta-llama-3.1-8b-instruct-bnb-4bit

- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

  ---
+ license: llama3.1
  tags:
+ - srd
+ - standard-reasoning
+ - cot
+ - fine-tuned
  - unsloth
+ - llama-3.1
+ base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
  ---

+ # SRD_V7 - Standard Reasoning (SRD) Model (V7)

+ ## Dataset
+ - **Source**: CoT_reasoning_unsloth.jsonl
+ - **Examples**: 9,340
+ - **Format**: messages[] chat format

+ ## Training Configuration
+ | Parameter | Value |
+ |---|---|
+ | Learning Rate | 0.00015 |
+ | LoRA Rank | 32 |
+ | LoRA Alpha | 64 |
+ | LoRA Dropout | 0.0 |
+ | Target Modules | All (MLP + Attention) |
+ | Epochs | 2 |
+ | Batch Size (effective) | 16 |
+ | Warmup | 3% |
+ | RSLoRA | Disabled |

+ ## Training Results
+ - **Training Time**: 1.38 hours
+ - **Final Loss**: 1.2049
+
+ ## Part of Experiment
+ - kinzakhan1/CRD_V7
+ - kinzakhan1/SRD_V7 (this model)
+ - kinzakhan1/MIXED_V7
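The Training Configuration table added in this commit can be mirrored in code. The sketch below is a minimal, plain-Python illustration of how the listed hyperparameters relate: the effective batch size of 16 is the product of per-device batch size and gradient accumulation steps, and the 3% warmup is a fraction of total optimizer steps. The per-device batch size and accumulation split are assumptions (the card only states the effective value), not details from the commit.

```python
from dataclasses import dataclass

@dataclass
class SRDTrainingConfig:
    # Values taken from the card's Training Configuration table.
    learning_rate: float = 1.5e-4
    lora_rank: int = 32
    lora_alpha: int = 64
    lora_dropout: float = 0.0
    epochs: int = 2
    per_device_batch_size: int = 2   # assumption: not stated in the card
    gradient_accumulation: int = 8   # assumption: 2 * 8 gives the stated 16
    warmup_ratio: float = 0.03       # "Warmup | 3%"
    use_rslora: bool = False         # "RSLoRA | Disabled"

    @property
    def effective_batch_size(self) -> int:
        # The card reports only this product, not the two factors.
        return self.per_device_batch_size * self.gradient_accumulation

    def warmup_steps(self, num_examples: int) -> int:
        # 3% of total optimizer steps over all epochs.
        steps_per_epoch = num_examples // self.effective_batch_size
        total_steps = steps_per_epoch * self.epochs
        return int(total_steps * self.warmup_ratio)

cfg = SRDTrainingConfig()
print(cfg.effective_batch_size)   # → 16, matching the table
print(cfg.warmup_steps(9340))     # warmup steps for the 9,340-example dataset
```

In an actual Unsloth/TRL run these values would be passed to the LoRA adapter setup (rank, alpha, dropout, RSLoRA flag) and the trainer arguments (learning rate, epochs, warmup); the dataclass above only documents the relationships.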