Starting SFT training...
Starting SFT training...
Starting SFT training...
Namespace(model_name='Qwen/Qwen2.5-VL-7B-Instruct', pkl_path='qwen/train.pkl', output_dir='./ckpt_sft_okvqa_aha_int_triangle', batch_size=4, num_workers=0, sft_epochs=2, sft_lr=2e-05, sft_alpha=0.7, kl_beta=0.5, val_ratio=0.02, eval_every=500, eval_samples=5, max_items=0, full_finetune=True, no_lora=False, no_4bit=False, flash_attn=True, trigger_size=30, save_every=0, seed=42, grad_accum_steps=4, mixed_precision='bf16', trigger_shape='triangle')
Starting SFT training...
[data] total=3335 train=3269 val=66
Model loaded, trainable params: 8,292,166,656
Model loaded, trainable params: 8,292,166,656
Reference model loaded for KL (beta=0.5)
Model loaded, trainable params: 8,292,166,656
Model loaded, trainable params: 8,292,166,656
TensorBoard: tensorboard --logdir=./ckpt_sft_okvqa_aha_int_triangle/logs
💾 Saving final checkpoint: ./ckpt_sft_okvqa_aha_int_triangle/final_sft