ThaiLLM/ThaiLLM-27B-Prescreen


PumeTu committed on Jan 19

Commit 32e25aa (verified) · 1 parent: fe7a7df

Update README.md

Files changed (1)

README.md +57 -0

@@ -30,6 +30,63 @@ The model was finetuned via [axolotl](https://github.com/axolotl-ai-cloud/axolot
 | Epochs          | 3     |
 | Batch size      | 32    |
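The batch size of 32 in the table can be related to the per-device settings in the training config (`micro_batch_size: 1`, `gradient_accumulation_steps: 8`). The sketch below shows the usual arithmetic. The process count of 4 is an assumption chosen only because it reproduces 32; the README does not state how many GPUs were used, and `effective_batch_size` is a hypothetical helper, not part of axolotl.

```python
def effective_batch_size(micro_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_processes: int) -> int:
    """Sequences contributing to one optimizer step across all processes."""
    return micro_batch_size * gradient_accumulation_steps * num_processes


# Values taken from the config in this README.
MICRO_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 8

# Per-process contribution to one optimizer step.
per_process = effective_batch_size(MICRO_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, 1)

# ASSUMPTION: 4 data-parallel processes (not stated in the README).
assumed_total = effective_batch_size(MICRO_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, 4)

print(per_process, assumed_total)
```

With `sample_packing: true`, each packed sequence may hold several training examples, so the number of examples per step can be higher than this count of sequences.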
+```bash
+git clone https://github.com/axolotl-ai-cloud/axolotl.git
+pip3 install -U packaging setuptools wheel ninja
+pip3 install --no-build-isolation 'axolotl[flash-attn,deepspeed]'
+axolotl train prescreen.yaml
+```
+`prescreen.yaml`:
+```yaml
+base_model: ThaiLLM/ThaiLLM-27B
+plugins:
+  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
+strict: false
+chat_template: gemma3
+datasets:
+  - path: prescreen.jsonl
+    type: alpaca
+output_dir: ./outputs/ThaiLLM-27B-Prescreen
+sequence_len: 2048
+sample_packing: true
+ddp_find_unused_parameters: true
+load_in_4bit: true
+adapter: qlora
+lora_r: 16
+lora_alpha: 32
+lora_target_modules:
+  - q_proj
+  - k_proj
+  - v_proj
+  - o_proj
+  - down_proj
+  - up_proj
+lora_mlp_kernel: true
+lora_qkv_kernel: true
+lora_o_kernel: true
+gradient_accumulation_steps: 8
+micro_batch_size: 1
+num_epochs: 3
+optimizer: adamw_torch_4bit
+lr_scheduler: cosine
+learning_rate: 2e-4
+bf16: auto
+tf32: true
+logging_steps: 1
+flash_attention: true
+warmup_ratio: 0.1
+saves_per_epoch: 3
+weight_decay: 0.01
+```
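The config reads `prescreen.jsonl` with `type: alpaca`, which in axolotl means one JSON object per line with `instruction`, `input`, and `output` fields. The following is a minimal sketch of building such a line. The record contents are placeholders, and the mapping of the patient-doctor conversation to the `input` field is an assumption; the README does not show the actual dataset. `make_record` and `to_jsonl` are hypothetical helpers.

```python
import json

# Field names used by the alpaca dataset format.
ALPACA_FIELDS = ("instruction", "input", "output")


def make_record(instruction: str, conversation: str, response: str) -> dict:
    """Build one alpaca-style record (ASSUMPTION: conversation goes in `input`)."""
    return {"instruction": instruction, "input": conversation, "output": response}


def to_jsonl(records: list) -> str:
    """Serialize records as JSON Lines; ensure_ascii=False keeps Thai text readable."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"


# Placeholder content only; not taken from the real training data.
example = make_record(
    "<task description placeholder>",
    "Patient: <message>\nDoctor: <message>",
    "<expected model output placeholder>",
)

text = to_jsonl([example])
print(text, end="")

# To create the file referenced by the config, write `text` to prescreen.jsonl:
# with open("prescreen.jsonl", "w", encoding="utf-8") as f:
#     f.write(text)
```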
 ## Usage
 The model expects a conversation between a patient and a doctor as the input.