LumiOpen/Poro-34B-chat-OpenAssistant

Elaine committed on Jan 15, 2025

Commit 93c7c03 · verified · 1 parent: c304d85

Update README.md

Files changed (1)

README.md +55 -2

@@ -14,7 +14,7 @@ This is an SFT-tuned model of [Poro-34B](https://huggingface.co/LumiOpen/Poro-34
 ## Datasets
-**SFT**
 We use a curated subset of Open Assistant 2 and translated the dataset into Finnish using Poro-34B.
@@ -24,14 +24,67 @@ We use a curated subset of Open Assistant 2 and translated the dataset into Finn
 **Finnish OASST2**
 - [instruction-collection-fin](https://huggingface.co/datasets/LumiOpen/instruction-collection-fin) (oasst2 subset)
-**DPO**
 ## Recipes
 **SFT**
 **DPO**
 ## Evaluation

 ## Datasets
+### SFT
 We use a curated subset of Open Assistant 2 and translated the dataset into Finnish using Poro-34B.
 **Finnish OASST2**
 - [instruction-collection-fin](https://huggingface.co/datasets/LumiOpen/instruction-collection-fin) (oasst2 subset)
+### DPO
 ## Recipes
 **SFT**
+```
+bf16: true
+do_eval: true
+evaluation_strategy: epoch
+gradient_accumulation_steps: 2
+gradient_checkpointing: true
+gradient_checkpointing_kwargs:
+  use_reentrant: False
+learning_rate: 2.0e-05
+log_level: info
+logging_steps: 50
+logging_strategy: steps
+lr_scheduler_type: cosine
+max_seq_length: 2048
+max_steps: -1
+num_train_epochs: 3
+output_dir: data/poro-sft-oasst2
+overwrite_output_dir: true
+per_device_eval_batch_size: 4
+per_device_train_batch_size: 2
+remove_unused_columns: true
+save_strategy: "epoch"
+save_total_limit: 1
+seed: 42
+warmup_ratio: 0.1
+```
 **DPO**
+```
+bf16: true
+beta: 0.05
+do_eval: true
+evaluation_strategy: epoch
+gradient_accumulation_steps: 1
+gradient_checkpointing: true
+gradient_checkpointing_kwargs:
+  use_reentrant: False
+learning_rate: 5.0e-7
+log_level: info
+logging_steps: 20
+lr_scheduler_type: cosine
+max_length: 1024
+max_prompt_length: 512
+num_train_epochs: 5
+optim: adamw_torch
+output_dir: data/poro-dpo-helpsteer2
+per_device_train_batch_size: 2
+per_device_eval_batch_size: 4
+save_strategy: "epoch"
+save_total_limit: 1
+seed: 42
+warmup_ratio: 0.1
+```
 ## Evaluation
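
Note on the recipes added in this commit: the configs give per-device batch sizes, accumulation steps, a cosine scheduler, and `warmup_ratio: 0.1`, but not the number of devices or the total step count. The sketch below shows how those values usually combine into an effective batch size and a learning-rate curve. It is an illustration only: the helper names, `num_devices=8`, and `total_steps=1000` are hypothetical and do not come from the README, and the schedule is a plain linear-warmup plus cosine-decay formula, not necessarily the exact one used in training.

```python
import math


def effective_batch_size(per_device_batch: int, grad_accum_steps: int, num_devices: int) -> int:
    """Samples consumed per optimizer step across all devices."""
    return per_device_batch * grad_accum_steps * num_devices


def cosine_lr_with_warmup(step: int, total_steps: int, peak_lr: float, warmup_ratio: float) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # num_devices and total_steps are assumed values, not taken from the README.
    num_devices = 8
    total_steps = 1000

    # SFT recipe: per_device_train_batch_size=2, gradient_accumulation_steps=2
    print("SFT effective batch:", effective_batch_size(2, 2, num_devices))
    # DPO recipe: per_device_train_batch_size=2, gradient_accumulation_steps=1
    print("DPO effective batch:", effective_batch_size(2, 1, num_devices))

    # SFT learning rate (2.0e-05) at a few points of the schedule.
    for step in (0, 50, 100, 550, 1000):
        print(step, cosine_lr_with_warmup(step, total_steps, 2.0e-05, 0.1))
```

With these formulas, halving `gradient_accumulation_steps` (as the DPO recipe does relative to SFT) halves the effective batch size for the same device count, so the two recipes are not directly comparable on batch size alone.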