support-ticket-env / train_sft.ipynb

Commit History

Fix SFTConfig: move max_seq_length + dataset_text_field to SFTTrainer (trl API change)
2e81e98

AlgoCore committed on

Add train_sft.ipynb: SFT pre-training with 1000 gold-label examples before GRPO
cf0d796

AlgoCore committed on