2026-01-29 00:32:14,409 - __main__ - INFO - Loading model: internlm/Intern-S1-mini
2026-01-29 00:32:14,409 - __main__ - INFO - Output directory: /vast/home/j/jojolee/therapeutic-tuning/results/sft/tdc_single_token/sft_tdc_single_token_Intern-S1-mini_lr2e-05/2026-01-29_00-32
2026-01-29 00:32:14,409 - __main__ - INFO - Datasets: ['tdc_single_token']
2026-01-29 00:32:14,945 - __main__ - INFO - Loading model 'internlm/Intern-S1-mini' with attn_implementation='flash_attention_2'
2026-01-29 00:32:15,766 - accelerate.utils.modeling - INFO - We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
2026-01-29 00:32:19,180 - __main__ - INFO - Loading dataset 'tdc_single_token' from LoaderRegistry...
2026-01-29 00:32:23,149 - data.loaders.sft.tdc_single_token - INFO - [tdc_single_token] Loaded summary:
  Tasks processed: 24
  Total examples: 151,647
2026-01-29 00:32:43,302 - __main__ - INFO - -> Loaded 151647 examples from 'tdc_single_token'
2026-01-29 00:32:43,945 - __main__ - INFO - Total dataset size: 151647 examples
2026-01-29 00:32:43,953 - __main__ - INFO - Dataset examples written to: /vast/home/j/jojolee/therapeutic-tuning/results/sft/tdc_single_token/sft_tdc_single_token_Intern-S1-mini_lr2e-05/2026-01-29_00-32/dataset_examples.txt
2026-01-29 00:32:43,953 - __main__ - INFO - Training mode: completion_only
2026-01-29 00:32:43,953 - __main__ - INFO - dataset_text_field=None, completion_only_loss=True
2026-01-29 00:40:13,795 - liger_kernel.transformers.monkey_patch - INFO - There are currently no Liger kernels supported for model type: interns1.
2026-01-29 00:40:13,803 - __main__ - INFO - Verifying dataloader integrity...
2026-01-29 00:40:13,805 - __main__ - INFO - # of Batches: 598
2026-01-29 00:40:36,782 - __main__ - INFO - Training batch stats - Avg samples per batch: 253.59, Min: 78, Max: 333
2026-01-29 00:40:36,948 - __main__ - INFO - Batch samples appended to: /vast/home/j/jojolee/therapeutic-tuning/results/sft/tdc_single_token/sft_tdc_single_token_Intern-S1-mini_lr2e-05/2026-01-29_00-32/dataset_examples.txt
2026-01-29 00:40:36,948 - __main__ - INFO - Starting training...
2026-01-29 01:32:05,255 - __main__ - INFO - Pushing model to HuggingFace Hub: jiosephlee/sft_tdc_single_token_Intern-S1-mini_lr2e-05