kinzakhan1 committed
Commit 38bbd92 · verified · 1 parent: f803506

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +36 -13
README.md CHANGED
@@ -1,21 +1,44 @@
 ---
-base_model: unsloth/meta-llama-3.1-8b-instruct-bnb-4bit
+license: llama3.1
 tags:
-- text-generation-inference
-- transformers
+- reasoning
+- chain-of-thought
+- fine-tuned
 - unsloth
-- llama
-license: apache-2.0
-language:
-- en
+- llama-3.1
+base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
+datasets:
+- custom
 ---
 
-# Uploaded finetuned model
-
-- **Developed by:** kinzakhan1
-- **License:** apache-2.0
-- **Finetuned from model:** unsloth/meta-llama-3.1-8b-instruct-bnb-4bit
-
-This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
-
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+# SRD_V6 - Standard Reasoning Model (Chain-of-Thought)
+
+## Overview
+Fine-tuned Llama 3.1 8B on the Standard Reasoning Dataset (CoT) with adjusted hyperparameters.
+
+## Training Details
+- **Base Model**: meta-llama/Meta-Llama-3.1-8B-Instruct
+- **Training Framework**: Unsloth
+- **Dataset**: CoT Reasoning Data (CoT_reasoning_unsloth.jsonl)
+- **Examples**: 9,340
+- **Training Time**: 0.33 hours
+- **Final Loss**: 1.9127
+
+## Hyperparameters (Adjusted for SRD)
+- Learning Rate: 2e-05 (2x higher than CRD)
+- Max Steps: 500 (more than CRD)
+- LoRA Rank: 8
+- LoRA Alpha: 16
+- LoRA Dropout: 0.05
+- Warmup: 10%
+- Max Sequence Length: 2048
+- Effective Batch Size: 8
+
+## Notes
+The SRD dataset has longer, more complex reasoning chains, which result in a higher baseline loss.
+Hyperparameters were adjusted accordingly.
+
+## Part of Experiment
+- **kinzakhan1/CRD_V6** - Clinical reasoning only
+- **kinzakhan1/SRD_V6** - Standard reasoning only (this model)
+- **kinzakhan1/MIXED_V6** - Mixed dataset
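The hyperparameters in the new model card can be collected into a short reproducibility sketch. This is an illustrative assumption, not the actual training script: the per-device batch size and gradient-accumulation split are guessed (the card only states the effective batch size of 8), and the warmup step count is derived from the stated 10% ratio and 500 max steps.

```python
# Illustrative sketch of the SRD_V6 hyperparameters listed in the card.
# per_device_batch_size and grad_accum_steps are assumptions; only their
# product (the effective batch size of 8) is stated in the README.
max_steps = 500
learning_rate = 2e-5        # 2x higher than the CRD run, per the card
warmup_ratio = 0.10
max_seq_length = 2048
per_device_batch_size = 2   # assumed split
grad_accum_steps = 4        # assumed split

warmup_steps = int(max_steps * warmup_ratio)                # 10% of 500 = 50
effective_batch = per_device_batch_size * grad_accum_steps  # 2 * 4 = 8

# LoRA adapter settings as listed in the card
lora_config = {"r": 8, "lora_alpha": 16, "lora_dropout": 0.05}

print(warmup_steps, effective_batch)  # 50 8
```

Note that `lora_alpha` is twice the rank `r`, a common convention for LoRA fine-tuning that scales adapter updates by `alpha / r = 2`.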