Update README.md
Browse files

### README.md (CHANGED)
```diff
@@ -213,7 +213,8 @@ See [here](https://github.com/daniel-furman/sft-demos/blob/main/src/sft/mistral/

 The following `TrainingArguments` config was used:

-
+- output_dir = "./results"
+- num_train_epochs = 3
 - auto_find_batch_size = True
 - gradient_accumulation_steps = 1
 - optim = "paged_adamw_32bit"
@@ -223,6 +224,8 @@ The following `TrainingArguments` config was used:
 - warmup_ratio = 0.03
 - logging_strategy = "steps"
 - logging_steps = 25
+- evaluation_strategy = "epoch"
+- prediction_loss_only = True
 - bf16 = True

 The following `bitsandbytes` quantization config was used:
```