blip-solutions
/

SlovAlpaca-lora

ju-bezdek committed on Mar 21, 2023

Commit 07693ae · 1 Parent(s): 1a89142

Update README.md

Files changed (1)

README.md +3 -3

@@ -12,7 +12,7 @@ This repository contains the LORA weights finetuned on the translated version of
 ## Training procedure
-The training was done on the 7B LLaMA model (decapoda-research/llama-7b-hf) quantized to 8bits with following Hyperparameters:
 ```
 MICRO_BATCH_SIZE = 3
@@ -20,13 +20,13 @@ BATCH_SIZE = 128
 GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
 EPOCHS = 2  # paper uses 3
 LEARNING_RATE = 2e-5  # from the original paper
-CUTOFF_LEN = 256  # 256 accounts for about 96% of the data
 LORA_R = 4
 LORA_ALPHA = 16
 LORA_DROPOUT = 0.05
 ```
-The sole goal of this project is to explore the effects of single language finetuning using the same dataset and methods as the original paper did and comapre the results
 @misc{alpaca,
   author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto },

 ## Training procedure
+The training was done on the 7B LLaMA model (decapoda-research/llama-7b-hf) quantized to 8bits with the following Hyperparameters:
 ```
 MICRO_BATCH_SIZE = 3
 GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
 EPOCHS = 2  # paper uses 3
 LEARNING_RATE = 2e-5  # from the original paper
+CUTOFF_LEN = 256
 LORA_R = 4
 LORA_ALPHA = 16
 LORA_DROPOUT = 0.05
 ```
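
As a reading aid for the hyperparameter block above, here is a minimal sketch of the values derived from it. `BATCH_SIZE = 128` comes from the hunk header context. The names `effective_batch_size` and `lora_scaling` are hypothetical and do not appear in the README, and the `alpha / r` scaling is the standard LoRA convention, assumed here rather than confirmed by the repository's training script.

```python
# Values copied from the README hyperparameter block.
MICRO_BATCH_SIZE = 3
BATCH_SIZE = 128
LORA_R = 4
LORA_ALPHA = 16

# Integer division, exactly as written in the README.
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE

# Hypothetical helper name: number of samples per optimizer step.
# 128 is not divisible by 3, so this is slightly below BATCH_SIZE.
effective_batch_size = MICRO_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS

# Hypothetical helper name: scaling applied to the low-rank update
# under the standard LoRA convention (alpha / r). This is an assumption.
lora_scaling = LORA_ALPHA / LORA_R

print(GRADIENT_ACCUMULATION_STEPS, effective_batch_size, lora_scaling)
```

Because of the integer division, each optimizer step would accumulate 42 micro-batches of 3 samples, so the batch actually seen per step is 126 rather than the nominal 128.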
+The sole goal of this project is to explore the effects of single-language finetuning using the same dataset and methods as the original paper did and comapre the results
 @misc{alpaca,
   author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto },