
Granite Wordle SFT (200 steps)

Weights derived from ibm-granite/granite-3.0-1b-a400m-instruct, fine-tuned with PRIME-RL's SFT trainer on the willcb/V3-wordle dataset.

  • Steps: 200
  • Global batch size: 64 (micro batch 1, single H200 GPU)
  • Precision: bfloat16
  • Peak GPU memory: ~13 GiB
  • Final training loss: 0.053 (step 199)
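The settings above imply the gradient-accumulation arithmetic, although the accumulation count itself is not stated in this card. A minimal sketch of how the numbers fit together:

```python
# Effective-batch arithmetic implied by the card's settings.
# grad_accum is an inference from global/micro batch, not a value stated above.
global_batch = 64   # samples per optimizer update
micro_batch = 1     # samples per forward/backward pass
num_gpus = 1        # single H200

grad_accum = global_batch // (micro_batch * num_gpus)
total_samples = 200 * global_batch  # samples seen over the full run

print(grad_accum)      # micro-batches accumulated per optimizer step
print(total_samples)   # total training samples over 200 steps
```

Under these assumptions, each optimizer step accumulates 64 micro-batches, and the run sees 12,800 samples in total.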

The final checkpoint at step_200/ includes the tokenizer files and pytorch_model.bin, ready for inference or for warm-starting downstream RL experiments.
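Because the checkpoint ships standard tokenizer and weight files, it can be loaded with the transformers library. A minimal sketch, assuming the checkpoint directory sits at the output path used in the training command below (adjust `ckpt_dir` to wherever you downloaded it, and note the prompt is illustrative, not the dataset's exact format):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed local path to the step_200 checkpoint; adjust as needed.
ckpt_dir = "outputs/granite_wordle_sft_200/step_200"

def load_checkpoint(path: str):
    """Load the tokenizer and bf16 model weights from a local checkpoint dir."""
    tokenizer = AutoTokenizer.from_pretrained(path)
    model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16)
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_checkpoint(ckpt_dir)
    prompt = "Let's play Wordle. Make your first guess."
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```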

Training command:

CUDA_VISIBLE_DEVICES=0 uv run sft @ examples/wordle/sft/granite_train.toml \
  --output-dir outputs/granite_wordle_sft_200 \
  --max-steps 200 \
  --weights.interval 10 \
  --ckpt.interval 50 \
  --ckpt.keep 20

For more details see the PRIME-RL repository.
