kosmylo1992 committed on
Commit
3efb5e2
·
verified ·
1 Parent(s): d437243

Update README.md

Files changed (1)
  1. README.md +25 -11
README.md CHANGED
@@ -1,4 +1,4 @@
1
- # Command-R 35B — SFT (Supervised Fine-Tuning)
2
 
3
  **Model type:** Causal Language Model
4
  **Base model:** [CohereLabs/c4ai-command-r-v01](https://huggingface.co/CohereLabs/c4ai-command-r-v01)
@@ -9,9 +9,11 @@
9
 
10
  ## Overview
11
 
12
- `commandr-SFT` is a **supervised fine-tuned** variant of Cohere’s Command-R 35B model.
13
  Fine-tuning was performed on a high-quality instruction-following dataset using LoRA adapters, enabling improved conversational reasoning and question answering.
14
 
 
 
15
  ---
16
 
17
  ## Training Setup
@@ -20,8 +22,9 @@ Fine-tuning was performed on a high-quality instruction-following dataset using
20
  **Adapter type:** LoRA
21
  **Precision:** bfloat16
22
  **Hardware:** 8 nodes × 2 × NVIDIA A100 64GB GPUs
23
- **Training duration:** ~6 hours
24
- **Framework:** DeepSpeed ZeRO-1, Axolotl, PyTorch 2.5.1+cu121
 
25
 
26
  ---
27
 
@@ -29,7 +32,6 @@ Fine-tuning was performed on a high-quality instruction-following dataset using
29
 
30
  **Name:** `axolotl_deduplicated_synthetic_qa.jsonl`
31
  **Type:** Instruction-following synthetic QA dataset
32
- **Split:** 70% train / 30% validation
33
 
34
  Each sample follows a QA/chat format used in the `alpaca_chat.load_qa` schema.
35
 
@@ -40,15 +42,27 @@ Each sample follows a QA/chat format used in the `alpaca_chat.load_qa` schema.
40
  | Parameter | Value |
41
  |------------|-------|
42
  | Sequence length | 2048 |
43
- | Micro batch size | 2 |
44
  | Gradient accumulation | 2 |
45
- | Learning rate | 2e-4 |
 
46
  | LR scheduler | cosine |
47
  | Optimizer | AdamW (8-bit) |
 
 
48
  | LoRA rank (r) | 16 |
49
  | LoRA alpha | 32 |
50
  | LoRA dropout | 0.05 |
51
- | Target modules | q_proj, v_proj, k_proj, o_proj |
52
- | Epochs | 1 |
53
- | Warmup steps | 10 |
54
- | Weight decay | 0.0 |
 
 
 
 
 
 
 
 
 
 
1
+ # Command-R 35B — SFT (Supervised Fine-Tuning on Synthetic QA)
2
 
3
  **Model type:** Causal Language Model
4
  **Base model:** [CohereLabs/c4ai-command-r-v01](https://huggingface.co/CohereLabs/c4ai-command-r-v01)
 
9
 
10
  ## Overview
11
 
12
+ `commandr-35b-sft` is a **supervised fine-tuned** variant of Cohere’s Command-R 35B model.
13
  Fine-tuning was performed on a high-quality instruction-following dataset using LoRA adapters, enabling improved conversational reasoning and question answering.
14
 
15
+ Training was conducted on the **Leonardo EuroHPC** system.
16
+
17
  ---
18
 
19
  ## Training Setup
 
22
  **Adapter type:** LoRA
23
  **Precision:** bfloat16
24
  **Hardware:** 8 nodes × 2 × NVIDIA A100 64GB GPUs
25
+ **Framework:** DeepSpeed ZeRO-1, Axolotl, PyTorch 2.5.1+cu121
26
+ **Runtime:** ~6 hours
27
+ **Dataset split:** 70% train / 30% validation
28
 
29
  ---
30
 
 
32
 
33
  **Name:** `axolotl_deduplicated_synthetic_qa.jsonl`
34
  **Type:** Instruction-following synthetic QA dataset
 
35
 
36
  Each sample follows a QA/chat format used in the `alpaca_chat.load_qa` schema.
37
 
 
42
  | Parameter | Value |
43
  |------------|-------|
44
  | Sequence length | 2048 |
45
+ | Micro batch size | 1 |
46
  | Gradient accumulation | 2 |
47
+ | Epochs | 1 |
48
+ | Learning rate | 0.0001 |
49
  | LR scheduler | cosine |
50
  | Optimizer | AdamW (8-bit) |
51
+ | Warmup steps | 20 |
52
+ | Weight decay | 0.0 |
53
  | LoRA rank (r) | 16 |
54
  | LoRA alpha | 32 |
55
  | LoRA dropout | 0.05 |
56
+ | LoRA target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
57
+ | Gradient checkpointing | |
58
+ | Flash attention | |
59
+ | Auto resume | |
60
+ | Loss watchdog threshold | 8.0 |
61
+ | Loss watchdog patience | 20 |
62
+
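The updated hyperparameters imply a global batch size that the table does not state directly. The sketch below derives it, assuming plain data parallelism across all 16 GPUs (consistent with DeepSpeed ZeRO-1, but not confirmed by the README) and the standard LoRA scaling of `alpha / r`. The `RunShape` class and its field names are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunShape:
    """Hypothetical container for the values listed in the README tables."""
    nodes: int = 8
    gpus_per_node: int = 2
    micro_batch_size: int = 1
    gradient_accumulation_steps: int = 2
    sequence_length: int = 2048
    lora_r: int = 16
    lora_alpha: int = 32

    @property
    def world_size(self) -> int:
        # Assumption: every GPU is one data-parallel rank.
        return self.nodes * self.gpus_per_node

    @property
    def effective_batch_size(self) -> int:
        # Sequences contributing to one optimizer step.
        return (self.micro_batch_size
                * self.gradient_accumulation_steps
                * self.world_size)

    @property
    def max_tokens_per_step(self) -> int:
        # Upper bound: reached only if every sequence fills 2048 tokens.
        return self.effective_batch_size * self.sequence_length

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scaling applied to the low-rank update.
        return self.lora_alpha / self.lora_r


run = RunShape()
print(run.world_size, run.effective_batch_size,
      run.max_tokens_per_step, run.lora_scaling)
```

Under these assumptions the run uses 32 sequences per optimizer step. The previous micro batch size of 2 would have given `RunShape(micro_batch_size=2).effective_batch_size == 64`, so this commit halves the global batch alongside halving the learning rate.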
63
+ ---
64
+
65
+ ## Tokenizer
66
+
67
+ **Tokenizer type:** `AutoTokenizer`
68
+ **Special token:** `<|end_of_text|>` as `pad_token`
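
A minimal sketch of how the pad token above might be registered with the `transformers` tokenizer API. The helper name `set_pad_token` is hypothetical, and the README does not say whether `<|end_of_text|>` already exists in the base vocabulary, so the helper returns the number of newly created entries and leaves the decision about resizing embeddings to the caller.

```python
def set_pad_token(tokenizer, pad_token="<|end_of_text|>"):
    """Register `pad_token` on a tokenizer.

    Returns how many new vocabulary entries were created
    (0 if the token was already in the vocabulary).
    """
    return tokenizer.add_special_tokens({"pad_token": pad_token})


# Usage sketch (needs `transformers` and network access, so it is commented out):
# from transformers import AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("CohereLabs/c4ai-command-r-v01")
# if set_pad_token(tokenizer) > 0:
#     # `model` is assumed to be loaded elsewhere; new tokens need embedding rows.
#     model.resize_token_embeddings(len(tokenizer))
```

If the call reports a newly added token, the embedding matrix has to grow to match the tokenizer, which is why the usage sketch checks the return value before resizing.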