Update README.md
README.md CHANGED
@@ -16,7 +16,7 @@ language:
## Model Details
- For all the model fine-tuning, we employ LoRA
+ For all the model fine-tuning, we employ LoRA with a rank of 32, training with a global batch size of 128 and a learning rate of 2e-4 under a cosine decay schedule for 1 epoch. Fine-tuning is conducted with [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF), and FlashAttention-2 is used to speed up training. The process takes approximately 30 minutes on 4 A40 GPUs with 48 GB of memory each.
This repo provides the fine-tuned model with full capability for information providing and seeking, as well as chain-of-thought reasoning.
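
For readers who want a concrete picture of the hyperparameters in the added line, the following is a minimal sketch using Hugging Face PEFT and Transformers rather than OpenRLHF (which the commit actually names), purely for illustration. The base model name, LoRA alpha and dropout, and the per-device batch split are assumptions not stated in the README.

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# Placeholder base model; the commit does not name the backbone being tuned.
base_model = "meta-llama/Llama-2-7b-hf"

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # FlashAttention-2, as in the commit
)

# LoRA rank 32 matches the commit; alpha, dropout, and target modules are
# assumptions, since the README does not specify them.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Global batch size 128 on 4 GPUs: 4 GPUs x 4 per device x 8 accumulation steps.
training_args = TrainingArguments(
    output_dir="./lora-sft",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",  # cosine decay schedule over 1 epoch
    num_train_epochs=1,
    bf16=True,
)
# A Trainer (or TRL's SFTTrainer) would consume `model` and `training_args`
# together with an SFT dataset; that part is omitted here.
```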