PumeTu committed 32e25aa (verified) · Parent: fe7a7df

Update README.md

Files changed (1): README.md (+57 -0)
| Epochs | 3 |
| Batch size | 32 |

```bash
git clone https://github.com/axolotl-ai-cloud/axolotl.git
cd axolotl

pip3 install -U packaging setuptools wheel ninja
pip3 install --no-build-isolation -e '.[flash-attn,deepspeed]'

axolotl train prescreen.yaml
```
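The `axolotl train` step reads `prescreen.yaml`, whose `datasets` entry points at a `prescreen.jsonl` file in axolotl's `alpaca` schema. As a rough sketch of writing one such record: the field values below are invented placeholders, and only the `instruction`/`input`/`output` shape comes from the alpaca format — the actual training data is not shown in this README.

```python
import json

# Hypothetical example record for prescreen.jsonl (alpaca schema:
# instruction / input / output). Values are illustrative placeholders.
record = {
    "instruction": "Pre-screen the patient based on the conversation below.",
    "input": "Patient: I've had a fever and a dry cough for three days.\nDoctor: Any shortness of breath?",
    "output": "Suggested triage: ...",
}

# axolotl expects one JSON object per line in the .jsonl file.
with open("prescreen.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```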
41
+
42
+ Prescreen.yaml
43
+ ```yaml
44
+ base_model: ThaiLLM/ThaiLLM-27B
45
+ plugins:
46
+ - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
47
+ strict: false
48
+
49
+ chat_template: gemma3
50
+ datasets:
51
+ - path: prescreen.jsonl
52
+ type: alpaca
53
+ output_dir: ./outputs/ThaiLLM-27B-Prescreen
54
+
55
+ sequence_len: 2048
56
+ sample_packing: true
57
+ ddp_find_unused_parameters: true
58
+
59
+ load_in_4bit: true
60
+ adapter: qlora
61
+ lora_r: 16
62
+ lora_alpha: 32
63
+ lora_target_modules:
64
+ - q_proj
65
+ - k_proj
66
+ - v_proj
67
+ - o_proj
68
+ - down_proj
69
+ - up_proj
70
+ lora_mlp_kernel: true
71
+ lora_qkv_kernel: true
72
+ lora_o_kernel: true
73
+
74
+ gradient_accumulation_steps: 8
75
+ micro_batch_size: 1
76
+ num_epochs: 3
77
+ optimizer: adamw_torch_4bit
78
+ lr_scheduler: cosine
79
+ learning_rate: 2e-4
80
+
81
+ bf16: auto
82
+ tf32: true
83
+
84
+ logging_steps: 1
85
+ flash_attention: true
86
+ warmup_ratio: 0.1
87
+ saves_per_epoch: 3
88
+ weight_decay: 0.01
89
+ ```
90
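The per-device settings in the config can be reconciled with the "Batch size 32" row in the table: the effective global batch is the micro batch size times the gradient accumulation steps times the number of GPUs. A quick sanity check, assuming a 4-GPU run — the GPU count is not stated in this README:

```python
# Effective global batch size implied by the config values.
# num_gpus = 4 is an assumption; the README does not state the hardware.
micro_batch_size = 1
gradient_accumulation_steps = 8
num_gpus = 4  # assumed

effective_batch = micro_batch_size * gradient_accumulation_steps * num_gpus
print(effective_batch)  # 32 — matches the "Batch size" row above
```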

## Usage
The model expects a conversation between a patient and a doctor as input.
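Since training used `chat_template: gemma3`, the patient–doctor conversation should be wrapped in Gemma-style turn delimiters at inference time. A hand-rolled sketch of the prompt shape follows; in practice, prefer `tokenizer.apply_chat_template` from `transformers` rather than building the string manually.

```python
# Approximation of a Gemma-style chat prompt using
# <start_of_turn> / <end_of_turn> delimiters; illustrative only.
def build_prompt(messages):
    parts = [
        f"<start_of_turn>{m['role']}\n{m['content']}<end_of_turn>\n"
        for m in messages
    ]
    parts.append("<start_of_turn>model\n")  # cue the model to respond
    return "".join(parts)

prompt = build_prompt([
    {
        "role": "user",
        "content": "Patient: I've had a headache for two days.\nDoctor: Any fever or nausea?",
    },
])
print(prompt)
```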