# Granite Wordle SFT (200 steps)

Weights derived from `ibm-granite/granite-3.0-1b-a400m-instruct`, fine-tuned with PRIME-RL's SFT trainer on the `willcb/V3-wordle` dataset.
- Steps: 200
- Global batch size: 64 (micro batch 1, single H200 GPU)
- Precision: bfloat16
- Peak GPU memory: ~13 GiB
- Final training loss: 0.053 (step 199)
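The batch-size figures above pin down the gradient-accumulation factor: with a micro batch of 1 on a single GPU and no data parallelism (an assumption for this sketch), reaching a global batch of 64 requires accumulating 64 micro-batches per optimizer step.

```python
# Assumed relation: global_batch = micro_batch * grad_accum * num_gpus.
# With the values from this card, solve for the accumulation factor.
global_batch, micro_batch, num_gpus = 64, 1, 1
grad_accum = global_batch // (micro_batch * num_gpus)
print(grad_accum)  # → 64 micro-batches accumulated per optimizer step
```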
The final checkpoint at `step_200/` includes tokenizer files and `pytorch_model.bin` for inference or for warm-starting downstream RL experiments.
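Because the checkpoint ships tokenizer files alongside the weights, it can be loaded directly with Hugging Face `transformers`. A minimal sketch, assuming the checkpoint sits at the `--output-dir` path from the training command and that the base model's chat template is preserved (both are assumptions; adjust `CKPT` to your local layout):

```python
# Sketch: load the step_200/ checkpoint and run one greedy generation.
# CKPT is an assumed path derived from --output-dir, not a published location.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CKPT = "outputs/granite_wordle_sft_200/step_200"  # assumed checkpoint path

def generate_reply(prompt: str, max_new_tokens: int = 64) -> str:
    """Format a single user turn with the chat template and decode the reply."""
    tokenizer = AutoTokenizer.from_pretrained(CKPT)
    model = AutoModelForCausalLM.from_pretrained(CKPT, torch_dtype=torch.bfloat16)
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the newly generated text is returned.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )

if __name__ == "__main__":
    print(generate_reply("Let's play Wordle. Give me your first guess."))
```

The same `from_pretrained(CKPT)` call is the natural starting point for warm-starting an RL run, since it restores both weights and tokenizer state.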
Training command:

```shell
CUDA_VISIBLE_DEVICES=0 uv run sft @ examples/wordle/sft/granite_train.toml \
  --output-dir outputs/granite_wordle_sft_200 \
  --max-steps 200 \
  --weights.interval 10 \
  --ckpt.interval 50 \
  --ckpt.keep 20
```
For more details see the PRIME-RL repository.