boatbomber/NabuOCR

Image-Text-to-Text

transliteration

boatbomber committed 25 days ago

Commit e4df893 · 1 Parent(s): 88a3806

Update readme

Files changed (2)

README.md +7 -3
assets/sft-grad-norm.png +0 -0

README.md CHANGED

@@ -62,15 +62,19 @@ The images are in color with dimensions between 100px and 2048px, inclusive.
 ### SFT
-TODO: details about fft & loss
 ### GRPO
-TODO: details about rslora & rewards
 ### Story
-For the more detailed story of how this model was trained, see [STORY.md](https://huggingface.co/boatbomber/NabuOCR/blob/main/STORY.md).
 ## Performance

 ### SFT
+For SFT pre-training, the model was trained using full parameter fine-tuning for 2 epochs with a batch size of 2.
+![sft-loss](./assets/sft-loss.png)
 ### GRPO
+For GRPO post-training, the model was trained using Rank-Stabilized LoRA (r=256) for 1 epoch with 5 completions per prompt and a batch size of 30; the adapter was then merged back into the base model at 16-bit precision.
+![grpo-reward](./assets/grpo-reward.png)
 ### Story
+For the more detailed story of how this model was trained, see [STORY.md](https://huggingface.co/boatbomber/NabuOCR/blob/main/STORY.md). To read the code used for training with the specific hyperparameters and reward functions, see [training/](https://huggingface.co/boatbomber/NabuOCR/blob/main/training).
 ## Performance

assets/sft-grad-norm.png DELETED

Binary file (67.2 kB)
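
The GRPO line added to the README mentions Rank Stabilized LoRA with r=256 and 5 completions per prompt. As a rough illustration of what those two settings imply, the sketch below compares the standard LoRA scaling factor (alpha / r) with the rank-stabilized one (alpha / sqrt(r)), and normalizes one group of rewards the way GRPO-style methods commonly do. It is not taken from the repository's `training/` code: the function names, the `alpha` value, the reward values, and the use of the population standard deviation are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def lora_scale(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / r


def rslora_scale(alpha: float, r: int) -> float:
    """Rank-stabilized LoRA scaling: alpha / sqrt(r)."""
    return alpha / math.sqrt(r)


def group_relative_advantages(rewards: list[float], eps: float = 1e-4) -> list[float]:
    """Normalize the rewards of one prompt's completions against their group.

    Each completion's advantage is its reward minus the group mean, divided
    by the group standard deviation. Population std and eps are assumptions.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(x - mu) / (sigma + eps) for x in rewards]


if __name__ == "__main__":
    r = 256        # rank stated in the README
    alpha = 16.0   # hypothetical value; the README does not state alpha

    print(lora_scale(alpha, r))    # 0.0625
    print(rslora_scale(alpha, r))  # 1.0

    # One prompt with 5 completions, as in the README; rewards are made up.
    rewards = [0.2, 0.5, 0.9, 0.4, 0.5]
    print(group_relative_advantages(rewards))
```

At a rank as high as 256, the plain alpha / r factor shrinks the adapter update by a further factor of 16 relative to alpha / sqrt(r), which is the motivation usually given for the rank-stabilized variant at large ranks.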