Update README.md
README.md CHANGED
@@ -16,7 +16,7 @@ language:
## Model Details
- For all the model fine-tuning, we employ LoRA
+ For all the model fine-tuning, we employ LoRA with a rank of 32, training with a global batch size of 128 and a learning rate of 2e-4 under a cosine decay schedule for 1 epoch. Fine-tuning is conducted with [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF), and FlashAttention-2 is used to speed up training. The process takes approximately 30 minutes on 4 A40 GPUs with 48 GB of memory each.
This repo provides the fine-tuned model with full capability for information providing and seeking, as well as chain-of-thought reasoning.
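
For readers who want a concrete picture of the hyperparameters in the added line, the following is a minimal sketch using Hugging Face PEFT and Transformers rather than OpenRLHF (which the commit actually names), purely for illustration. The base model name, LoRA alpha and dropout, and the per-device batch split are assumptions not stated in the README.

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# Placeholder base model; the commit does not name the backbone being tuned.
base_model = "meta-llama/Llama-2-7b-hf"

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # FlashAttention-2, as in the commit
)

# LoRA rank 32 matches the commit; alpha, dropout, and target modules are
# assumptions, since the README does not specify them.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Global batch size 128 on 4 GPUs: 4 GPUs x 4 per device x 8 accumulation steps.
training_args = TrainingArguments(
    output_dir="./lora-sft",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",  # cosine decay schedule over 1 epoch
    num_train_epochs=1,
    bf16=True,
)
# A Trainer (or TRL's SFTTrainer) would consume `model` and `training_args`
# together with an SFT dataset; that part is omitted here.
```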