Roihn committed on
Commit 41a4fe2 · verified · 1 Parent(s): 17d7503

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -16,7 +16,7 @@ language:
 
 ## Model Details
 
- For all the model fine-tuning, we employ LoRA \citep{hulora} with a rank of 32, training with a global batch size of 128 and a learning rate of 2e-4 using a cosine decay schedule for 1 epoch. Fine-tuning is conducted using [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF), while FlashAttention-2 is used to speed up training. The process takes approximately 30 minutes on 4 A40 GPUs with 48GB RAM each.
+ For all the model fine-tuning, we employ LoRA with a rank of 32, training with a global batch size of 128 and a learning rate of 2e-4 using a cosine decay schedule for 1 epoch. Fine-tuning is conducted using [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF), while FlashAttention-2 is used to speed up training. The process takes approximately 30 minutes on 4 A40 GPUs with 48GB RAM each.
 
 This repo provides the fine-tuned model with full capability of information providing and seeking and chain-of-thought reasoning.
 
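
For reference, below is a minimal sketch of the LoRA setup described in the updated README line. It uses the Hugging Face transformers and peft libraries directly rather than the OpenRLHF trainer that was actually used, and the base model name, lora_alpha, lora_dropout, and target_modules are assumptions not stated in the README; only the rank (32), learning rate (2e-4), cosine schedule, single epoch, and global batch size of 128 come from the text above.

```python
# Minimal sketch of the LoRA fine-tuning configuration described above.
# NOTE: the actual run used OpenRLHF, not plain transformers + peft; the base
# model name, lora_alpha, lora_dropout, and target_modules below are
# placeholders that the README does not specify.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from peft import LoraConfig, get_peft_model

BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"  # placeholder base model

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # FlashAttention-2, as in the README
)

# LoRA adapter with rank 32 (stated in the README).
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,                                             # assumption
    lora_dropout=0.0,                                          # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],   # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Global batch size of 128 split across 4 GPUs:
# 8 per-device * 4 gradient-accumulation steps * 4 GPUs = 128.
training_args = TrainingArguments(
    output_dir="lora-sft",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=1,
    bf16=True,
)
# Dataset loading and the trainer loop are omitted; the README's run used OpenRLHF.
```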