05/04/2026 09:03:47 - INFO - omnivoice.training.trainer - Loaded Config: TrainingConfig(output_dir='exp/omnivoice_vietnamese_4kh', data_config='/vast/tts/robert/OmniVoice/data/data_config_vietnamese.json', llm_name_or_path='Qwen/Qwen3-0.6B', audio_vocab_size=1025, audio_mask_id=1024, num_audio_codebook=8, audio_codebook_weights=[8, 8, 6, 6, 4, 4, 2, 2], drop_cond_ratio=0.1, prompt_ratio_range=[0.0, 0.3], mask_ratio_range=[0.0, 1.0], language_ratio=0.8, use_pinyin_ratio=0.0, instruct_ratio=0.0, only_instruct_ratio=0.0, resume_from_checkpoint=None, init_from_checkpoint='k2-fsa/OmniVoice', learning_rate=1e-05, weight_decay=0.01, max_grad_norm=1.0, steps=300000, epochs=2, seed=42, lr_scheduler_type='cosine', warmup_type='ratio', warmup_ratio=0.01, warmup_steps=0, batch_tokens=8192, gradient_accumulation_steps=1, num_workers=2, mixed_precision='bf16', allow_tf32=True, use_deepspeed=False, deepspeed_config=None, attn_implementation='flex_attention', max_sample_tokens=2000, min_sample_tokens=50, max_batch_size=64, logging_steps=50, eval_steps=10000, save_steps=10000, keep_last_n_checkpoints=-1)
05/04/2026 09:21:34 - INFO - omnivoice.training.trainer - Loaded Config: TrainingConfig(output_dir='exp/omnivoice_vietnamese_4kh', data_config='/vast/tts/robert/OmniVoice/data/data_config_vietnamese.json', llm_name_or_path='Qwen/Qwen3-0.6B', audio_vocab_size=1025, audio_mask_id=1024, num_audio_codebook=8, audio_codebook_weights=[8, 8, 6, 6, 4, 4, 2, 2], drop_cond_ratio=0.1, prompt_ratio_range=[0.0, 0.3], mask_ratio_range=[0.0, 1.0], language_ratio=0.8, use_pinyin_ratio=0.0, instruct_ratio=0.0, only_instruct_ratio=0.0, resume_from_checkpoint=None, init_from_checkpoint='k2-fsa/OmniVoice', learning_rate=1e-05, weight_decay=0.01, max_grad_norm=1.0, steps=100000, epochs=None, seed=42, lr_scheduler_type='cosine', warmup_type='ratio', warmup_ratio=0.01, warmup_steps=0, batch_tokens=8192, gradient_accumulation_steps=1, num_workers=2, mixed_precision='bf16', allow_tf32=True, use_deepspeed=False, deepspeed_config=None, attn_implementation='flex_attention', max_sample_tokens=2000, min_sample_tokens=50, max_batch_size=64, logging_steps=50, eval_steps=10000, save_steps=10000, keep_last_n_checkpoints=-1)
05/04/2026 09:21:40 - INFO - omnivoice.training.trainer - Starting Training Loop...
05/04/2026 10:07:11 - INFO - omnivoice.training.trainer - Epoch 1 starting. Resetting dataloader...
05/04/2026 10:18:44 - INFO - omnivoice.training.trainer - Running evaluation at step 10000...
05/04/2026 10:18:47 - INFO - omnivoice.training.trainer - Eval Loss: 4.1190
05/04/2026 10:18:47 - INFO - accelerate.accelerator - Saving current state to exp/omnivoice_vietnamese_4kh/checkpoint-10000
05/04/2026 10:18:57 - INFO - accelerate.checkpointing - Model weights saved in exp/omnivoice_vietnamese_4kh/checkpoint-10000/model.safetensors
05/04/2026 10:19:02 - INFO - accelerate.checkpointing - Optimizer state saved in exp/omnivoice_vietnamese_4kh/checkpoint-10000/optimizer.bin
05/04/2026 10:19:02 - INFO - accelerate.checkpointing - Scheduler state saved in exp/omnivoice_vietnamese_4kh/checkpoint-10000/scheduler.bin
05/04/2026 10:19:02 - INFO - accelerate.checkpointing - Random states saved in exp/omnivoice_vietnamese_4kh/checkpoint-10000/random_states_0.pkl
05/04/2026 10:19:05 - INFO - omnivoice.training.checkpoint - Saved checkpoint to exp/omnivoice_vietnamese_4kh/checkpoint-10000
05/04/2026 10:52:54 - INFO - omnivoice.training.trainer - Epoch 2 starting. Resetting dataloader...
05/04/2026 11:15:51 - INFO - omnivoice.training.trainer - Running evaluation at step 20000...
05/04/2026 11:15:53 - INFO - omnivoice.training.trainer - Eval Loss: 4.1431
05/04/2026 11:15:53 - INFO - accelerate.accelerator - Saving current state to exp/omnivoice_vietnamese_4kh/checkpoint-20000
05/04/2026 11:16:02 - INFO - accelerate.checkpointing - Model weights saved in exp/omnivoice_vietnamese_4kh/checkpoint-20000/model.safetensors
05/04/2026 11:16:07 - INFO - accelerate.checkpointing - Optimizer state saved in exp/omnivoice_vietnamese_4kh/checkpoint-20000/optimizer.bin
05/04/2026 11:16:07 - INFO - accelerate.checkpointing - Scheduler state saved in exp/omnivoice_vietnamese_4kh/checkpoint-20000/scheduler.bin
05/04/2026 11:16:07 - INFO - accelerate.checkpointing - Random states saved in exp/omnivoice_vietnamese_4kh/checkpoint-20000/random_states_0.pkl
05/04/2026 11:16:09 - INFO - omnivoice.training.checkpoint - Saved checkpoint to exp/omnivoice_vietnamese_4kh/checkpoint-20000
05/04/2026 11:38:34 - INFO - omnivoice.training.trainer - Epoch 3 starting. Resetting dataloader...
05/04/2026 12:13:10 - INFO - omnivoice.training.trainer - Running evaluation at step 30000...
05/04/2026 12:13:12 - INFO - omnivoice.training.trainer - Eval Loss: 4.0201
05/04/2026 12:13:12 - INFO - accelerate.accelerator - Saving current state to exp/omnivoice_vietnamese_4kh/checkpoint-30000
05/04/2026 12:13:21 - INFO - accelerate.checkpointing - Model weights saved in exp/omnivoice_vietnamese_4kh/checkpoint-30000/model.safetensors
05/04/2026 12:13:25 - INFO - accelerate.checkpointing - Optimizer state saved in exp/omnivoice_vietnamese_4kh/checkpoint-30000/optimizer.bin
05/04/2026 12:13:25 - INFO - accelerate.checkpointing - Scheduler state saved in exp/omnivoice_vietnamese_4kh/checkpoint-30000/scheduler.bin
05/04/2026 12:13:25 - INFO - accelerate.checkpointing - Random states saved in exp/omnivoice_vietnamese_4kh/checkpoint-30000/random_states_0.pkl
05/04/2026 12:13:27 - INFO - omnivoice.training.checkpoint - Saved checkpoint to exp/omnivoice_vietnamese_4kh/checkpoint-30000
05/04/2026 12:24:29 - INFO - omnivoice.training.trainer - Epoch 4 starting. Resetting dataloader...
05/04/2026 13:09:54 - INFO - omnivoice.training.trainer - Epoch 5 starting. Resetting dataloader...
05/04/2026 13:10:37 - INFO - omnivoice.training.trainer - Running evaluation at step 40000...
05/04/2026 13:10:38 - INFO - omnivoice.training.trainer - Eval Loss: 4.0533
05/04/2026 13:10:38 - INFO - accelerate.accelerator - Saving current state to exp/omnivoice_vietnamese_4kh/checkpoint-40000
05/04/2026 13:10:48 - INFO - accelerate.checkpointing - Model weights saved in exp/omnivoice_vietnamese_4kh/checkpoint-40000/model.safetensors
05/04/2026 13:10:52 - INFO - accelerate.checkpointing - Optimizer state saved in exp/omnivoice_vietnamese_4kh/checkpoint-40000/optimizer.bin
05/04/2026 13:10:52 - INFO - accelerate.checkpointing - Scheduler state saved in exp/omnivoice_vietnamese_4kh/checkpoint-40000/scheduler.bin
05/04/2026 13:10:52 - INFO - accelerate.checkpointing - Random states saved in exp/omnivoice_vietnamese_4kh/checkpoint-40000/random_states_0.pkl
05/04/2026 13:10:55 - INFO - omnivoice.training.checkpoint - Saved checkpoint to exp/omnivoice_vietnamese_4kh/checkpoint-40000
05/04/2026 13:55:42 - INFO - omnivoice.training.trainer - Epoch 6 starting. Resetting dataloader...
05/04/2026 14:08:00 - INFO - omnivoice.training.trainer - Running evaluation at step 50000...
05/04/2026 14:08:02 - INFO - omnivoice.training.trainer - Eval Loss: 4.0592
05/04/2026 14:08:02 - INFO - accelerate.accelerator - Saving current state to exp/omnivoice_vietnamese_4kh/checkpoint-50000
05/04/2026 14:08:12 - INFO - accelerate.checkpointing - Model weights saved in exp/omnivoice_vietnamese_4kh/checkpoint-50000/model.safetensors
05/04/2026 14:08:16 - INFO - accelerate.checkpointing - Optimizer state saved in exp/omnivoice_vietnamese_4kh/checkpoint-50000/optimizer.bin
05/04/2026 14:08:16 - INFO - accelerate.checkpointing - Scheduler state saved in exp/omnivoice_vietnamese_4kh/checkpoint-50000/scheduler.bin
05/04/2026 14:08:16 - INFO - accelerate.checkpointing - Random states saved in exp/omnivoice_vietnamese_4kh/checkpoint-50000/random_states_0.pkl
05/04/2026 14:08:18 - INFO - omnivoice.training.checkpoint - Saved checkpoint to exp/omnivoice_vietnamese_4kh/checkpoint-50000
05/04/2026 14:41:31 - INFO - omnivoice.training.trainer - Epoch 7 starting. Resetting dataloader...