Reshift / triangle.out
🚀 Starting SFT training...
🚀 Starting SFT training...
🚀 Starting SFT training...
Namespace(model_name='Qwen/Qwen2.5-VL-7B-Instruct', pkl_path='qwen/train.pkl', output_dir='./ckpt_sft_okvqa_aha_int_triangle', batch_size=4, num_workers=0, sft_epochs=2, sft_lr=2e-05, sft_alpha=0.7, kl_beta=0.5, val_ratio=0.02, eval_every=500, eval_samples=5, max_items=0, full_finetune=True, no_lora=False, no_4bit=False, flash_attn=True, trigger_size=30, save_every=0, seed=42, grad_accum_steps=4, mixed_precision='bf16', trigger_shape='triangle')
🚀 Starting SFT training...
[data] total=3335 train=3269 val=66
✓ Model loaded, trainable params: 8,292,166,656
✓ Model loaded, trainable params: 8,292,166,656
✓ Reference model loaded for KL (beta=0.5)
✓ Model loaded, trainable params: 8,292,166,656
✓ Model loaded, trainable params: 8,292,166,656
📊 TensorBoard: tensorboard --logdir=./ckpt_sft_okvqa_aha_int_triangle/logs
💾 Saving final checkpoint: ./ckpt_sft_okvqa_aha_int_triangle/final_sft