tiny-dolma20M / logs /log_20250830_161840.log
2025-08-30 16:20:45 - pico-train - INFO - Step 0 -- 📊 Evaluation Results
2025-08-30 16:20:45 - pico-train - INFO - └── paloma: inf
2025-08-30 16:20:45 - pico-train - INFO - ==================================================
2025-08-30 16:20:45 - pico-train - INFO - ✨ Training Configuration
2025-08-30 16:20:45 - pico-train - INFO - ==================================================
2025-08-30 16:20:45 - pico-train - INFO - ╭─────────────────────────────────────────────────────╮
2025-08-30 16:20:45 - pico-train - INFO - │ checkpointing:                                      │
2025-08-30 16:20:45 - pico-train - INFO - │   checkpoints_dir: checkpoints                      │
2025-08-30 16:20:45 - pico-train - INFO - │   evaluation:                                       │
2025-08-30 16:20:45 - pico-train - INFO - │     eval_results_dir: eval_results                  │
2025-08-30 16:20:45 - pico-train - INFO - │   fabric_checkpoint_dir: fabric_state               │
2025-08-30 16:20:45 - pico-train - INFO - │   fabric_checkpoint_filename: checkpoint.pt         │
2025-08-30 16:20:45 - pico-train - INFO - │   hf_checkpoint:                                    │
2025-08-30 16:20:45 - pico-train - INFO - │     collection_slug: null                           │
2025-08-30 16:20:45 - pico-train - INFO - │     repo_id: ThomasTheMaker/pico-decoder-tiny       │
2025-08-30 16:20:45 - pico-train - INFO - │   learning_dynamics:                                │
2025-08-30 16:20:45 - pico-train - INFO - │     batch_size: 1                                   │
2025-08-30 16:20:45 - pico-train - INFO - │     eval_data: null                                 │
2025-08-30 16:20:45 - pico-train - INFO - │     layer_suffixes:                                 │
2025-08-30 16:20:45 - pico-train - INFO - │     - attention.v_proj                              │
2025-08-30 16:20:45 - pico-train - INFO - │     - attention.o_proj                              │
2025-08-30 16:20:45 - pico-train - INFO - │     - swiglu.w_2                                    │
2025-08-30 16:20:45 - pico-train - INFO - │     sequence_idx: -1                                │
2025-08-30 16:20:45 - pico-train - INFO - │   learning_dynamics_dir: learning_dynamics          │
2025-08-30 16:20:45 - pico-train - INFO - │   logs_dir: logs                                    │
2025-08-30 16:20:45 - pico-train - INFO - │   run_name: pico-decoder-tiny-dolma20M-v1           │
2025-08-30 16:20:45 - pico-train - INFO - │   runs_dir: runs                                    │
2025-08-30 16:20:45 - pico-train - INFO - │   save_every_n_steps: 1000                          │
2025-08-30 16:20:45 - pico-train - INFO - │   save_to_hf: false                                 │
2025-08-30 16:20:45 - pico-train - INFO - │   training:                                         │
2025-08-30 16:20:45 - pico-train - INFO - │     auto_resume: true                               │
2025-08-30 16:20:45 - pico-train - INFO - │ data:                                               │
2025-08-30 16:20:45 - pico-train - INFO - │   dataloader:                                       │
2025-08-30 16:20:45 - pico-train - INFO - │     batch_size: 16                                  │
2025-08-30 16:20:45 - pico-train - INFO - │   dataset:                                          │
2025-08-30 16:20:45 - pico-train - INFO - │     name: ThomasTheMaker/pretokenized-dolma-20M     │
2025-08-30 16:20:45 - pico-train - INFO - │   tokenizer:                                        │
2025-08-30 16:20:45 - pico-train - INFO - │     name: allenai/OLMo-7B-0724-hf                   │
2025-08-30 16:20:45 - pico-train - INFO - │     vocab_size: 50304                               │
2025-08-30 16:20:45 - pico-train - INFO - │ evaluation:                                         │
2025-08-30 16:20:45 - pico-train - INFO - │   metrics:                                          │
2025-08-30 16:20:45 - pico-train - INFO - │   - paloma                                          │
2025-08-30 16:20:45 - pico-train - INFO - │   paloma:                                           │
2025-08-30 16:20:45 - pico-train - INFO - │     batch_size: 1                                   │
2025-08-30 16:20:45 - pico-train - INFO - │     dataset_name: pico-lm/pretokenized-paloma-tinsy │
2025-08-30 16:20:45 - pico-train - INFO - │     dataset_split: val                              │
2025-08-30 16:20:45 - pico-train - INFO - │     max_length: 2048                                │
2025-08-30 16:20:45 - pico-train - INFO - │ model:                                              │
2025-08-30 16:20:45 - pico-train - INFO - │   activation_hidden_dim: 384                        │
2025-08-30 16:20:45 - pico-train - INFO - │   attention_n_heads: 12                             │
2025-08-30 16:20:45 - pico-train - INFO - │   attention_n_kv_heads: 4                           │
2025-08-30 16:20:45 - pico-train - INFO - │   batch_size: 1024                                  │
2025-08-30 16:20:45 - pico-train - INFO - │   d_model: 96                                       │
2025-08-30 16:20:45 - pico-train - INFO - │   max_seq_len: 2048                                 │
2025-08-30 16:20:45 - pico-train - INFO - │   model_type: pico_decoder                          │
2025-08-30 16:20:45 - pico-train - INFO - │   n_layers: 12                                      │
2025-08-30 16:20:45 - pico-train - INFO - │   norm_eps: 1.0e-06                                 │
2025-08-30 16:20:45 - pico-train - INFO - │   position_emb_theta: 10000.0                       │
2025-08-30 16:20:45 - pico-train - INFO - │   vocab_size: 50304                                 │
2025-08-30 16:20:45 - pico-train - INFO - │ monitoring:                                         │
2025-08-30 16:20:45 - pico-train - INFO - │   logging:                                          │
2025-08-30 16:20:45 - pico-train - INFO - │     log_every_n_steps: 100                          │
2025-08-30 16:20:45 - pico-train - INFO - │     log_level: INFO                                 │
2025-08-30 16:20:45 - pico-train - INFO - │   save_to_wandb: false                              │
2025-08-30 16:20:45 - pico-train - INFO - │   wandb:                                            │
2025-08-30 16:20:45 - pico-train - INFO - │     entity: boymyc                                  │
2025-08-30 16:20:45 - pico-train - INFO - │     project: pico-decoder-tiny                      │
2025-08-30 16:20:45 - pico-train - INFO - │ training:                                           │
2025-08-30 16:20:45 - pico-train - INFO - │   fabric:                                           │
2025-08-30 16:20:45 - pico-train - INFO - │     accelerator: cuda                               │
2025-08-30 16:20:45 - pico-train - INFO - │     num_devices: 1                                  │
2025-08-30 16:20:45 - pico-train - INFO - │     num_nodes: 1                                    │
2025-08-30 16:20:45 - pico-train - INFO - │     precision: bf16-mixed                           │
2025-08-30 16:20:45 - pico-train - INFO - │   max_steps: 100000                                 │
2025-08-30 16:20:45 - pico-train - INFO - │   optimization:                                     │
2025-08-30 16:20:45 - pico-train - INFO - │     gradient_accumulation_steps: 1                  │
2025-08-30 16:20:45 - pico-train - INFO - │     lr: 0.0002                                      │
2025-08-30 16:20:45 - pico-train - INFO - │     lr_scheduler: cosine                            │
2025-08-30 16:20:45 - pico-train - INFO - │     lr_warmup_steps: 2000                           │
2025-08-30 16:20:45 - pico-train - INFO - │     optimizer: adamw                                │
2025-08-30 16:20:45 - pico-train - INFO - │                                                     │
2025-08-30 16:20:45 - pico-train - INFO - ╰─────────────────────────────────────────────────────╯
2025-08-30 16:20:45 - pico-train - INFO - ==================================================
2025-08-30 16:20:45 - pico-train - INFO - ⛭ Runtime Summary:
2025-08-30 16:20:45 - pico-train - INFO - ==================================================
2025-08-30 16:20:45 - pico-train - INFO - Starting from step: 0
2025-08-30 16:20:45 - pico-train - INFO - Model Setup:
2025-08-30 16:20:45 - pico-train - INFO - └─ Total Parameters: 11,282,784
2025-08-30 16:20:45 - pico-train - INFO - └─ Trainable Parameters: 11,282,784
2025-08-30 16:20:45 - pico-train - INFO - Distributed Setup:
2025-08-30 16:20:45 - pico-train - INFO - └─ Number of Devices: 1
2025-08-30 16:20:45 - pico-train - INFO - └─ Device Type: NVIDIA H100 PCIe
2025-08-30 16:20:45 - pico-train - INFO - └─ Available Memory: 84.94 GB
2025-08-30 16:20:45 - pico-train - INFO - Software Setup:
2025-08-30 16:20:45 - pico-train - INFO - └─ Python Version: 3.10.12
2025-08-30 16:20:45 - pico-train - INFO - └─ PyTorch Version: 2.8.0+cu128
2025-08-30 16:20:45 - pico-train - INFO - └─ CUDA Version: 12.8
2025-08-30 16:20:45 - pico-train - INFO - └─ Operating System: Linux 6.8.0-40-generic
2025-08-30 16:20:45 - pico-train - INFO - Batch Size Configuration:
2025-08-30 16:20:45 - pico-train - INFO - └─ Global Batch Size: 16
2025-08-30 16:20:45 - pico-train - INFO - └─ Per Device Batch Size: 16
2025-08-30 16:20:45 - pico-train - INFO - └─ Gradient Accumulation Steps: 1
2025-08-30 16:20:45 - pico-train - INFO - ==================================================
2025-08-30 16:20:47 - pico-train - INFO - Step 0 -- 🔄 Training Metrics
2025-08-30 16:20:47 - pico-train - INFO - ├── Loss: 10.9884
2025-08-30 16:20:47 - pico-train - INFO - ├── Learning Rate: 0.00e+00
2025-08-30 16:20:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:20:47 - pico-train - INFO - Step 0 -- 📈 Saving Learning Dynamics
2025-08-30 16:22:00 - pico-train - INFO - Step 100 -- 🔄 Training Metrics
2025-08-30 16:22:00 - pico-train - INFO - ├── Loss: 10.9746
2025-08-30 16:22:00 - pico-train - INFO - ├── Learning Rate: 1.00e-05
2025-08-30 16:22:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:23:12 - pico-train - INFO - Step 200 -- 🔄 Training Metrics
2025-08-30 16:23:12 - pico-train - INFO - ├── Loss: 10.7653
2025-08-30 16:23:12 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-30 16:23:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:24:24 - pico-train - INFO - Step 300 -- 🔄 Training Metrics
2025-08-30 16:24:24 - pico-train - INFO - ├── Loss: 10.2902
2025-08-30 16:24:24 - pico-train - INFO - ├── Learning Rate: 3.00e-05
2025-08-30 16:24:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:25:35 - pico-train - INFO - Step 400 -- 🔄 Training Metrics
2025-08-30 16:25:35 - pico-train - INFO - ├── Loss: 9.8373
2025-08-30 16:25:35 - pico-train - INFO - ├── Learning Rate: 4.00e-05
2025-08-30 16:25:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:26:47 - pico-train - INFO - Step 500 -- 🔄 Training Metrics
2025-08-30 16:26:47 - pico-train - INFO - ├── Loss: 9.3629
2025-08-30 16:26:47 - pico-train - INFO - ├── Learning Rate: 5.00e-05
2025-08-30 16:26:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:27:57 - pico-train - INFO - Step 600 -- 🔄 Training Metrics
2025-08-30 16:27:57 - pico-train - INFO - ├── Loss: 8.8887
2025-08-30 16:27:57 - pico-train - INFO - ├── Learning Rate: 6.00e-05
2025-08-30 16:27:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:29:09 - pico-train - INFO - Step 700 -- 🔄 Training Metrics
2025-08-30 16:29:09 - pico-train - INFO - ├── Loss: 8.4407
2025-08-30 16:29:09 - pico-train - INFO - ├── Learning Rate: 7.00e-05
2025-08-30 16:29:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:30:21 - pico-train - INFO - Step 800 -- 🔄 Training Metrics
2025-08-30 16:30:21 - pico-train - INFO - ├── Loss: 8.0906
2025-08-30 16:30:21 - pico-train - INFO - ├── Learning Rate: 8.00e-05
2025-08-30 16:30:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:31:32 - pico-train - INFO - Step 900 -- 🔄 Training Metrics
2025-08-30 16:31:32 - pico-train - INFO - ├── Loss: 7.8459
2025-08-30 16:31:32 - pico-train - INFO - ├── Learning Rate: 9.00e-05
2025-08-30 16:31:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:32:42 - pico-train - INFO - Step 1000 -- 💾 Saving Checkpoint
2025-08-30 16:34:45 - pico-train - INFO - Step 1000 -- 📊 Evaluation Results
2025-08-30 16:34:45 - pico-train - INFO - └── paloma: 1.4389863020037474e+19
2025-08-30 16:34:47 - pico-train - INFO - Step 1000 -- 🔄 Training Metrics
2025-08-30 16:34:47 - pico-train - INFO - ├── Loss: 7.6972
2025-08-30 16:34:47 - pico-train - INFO - ├── Learning Rate: 1.00e-04
2025-08-30 16:34:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:34:47 - pico-train - INFO - Step 1000 -- 📈 Saving Learning Dynamics
2025-08-30 16:36:00 - pico-train - INFO - Step 1100 -- 🔄 Training Metrics
2025-08-30 16:36:00 - pico-train - INFO - ├── Loss: 7.5571
2025-08-30 16:36:00 - pico-train - INFO - ├── Learning Rate: 1.10e-04
2025-08-30 16:36:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:37:11 - pico-train - INFO - Step 1200 -- 🔄 Training Metrics
2025-08-30 16:37:11 - pico-train - INFO - ├── Loss: 7.4823
2025-08-30 16:37:11 - pico-train - INFO - ├── Learning Rate: 1.20e-04
2025-08-30 16:37:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:38:24 - pico-train - INFO - Step 1300 -- 🔄 Training Metrics
2025-08-30 16:38:24 - pico-train - INFO - ├── Loss: 7.3624
2025-08-30 16:38:24 - pico-train - INFO - ├── Learning Rate: 1.30e-04
2025-08-30 16:38:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:39:34 - pico-train - INFO - Step 1400 -- 🔄 Training Metrics
2025-08-30 16:39:34 - pico-train - INFO - ├── Loss: 7.2538
2025-08-30 16:39:34 - pico-train - INFO - ├── Learning Rate: 1.40e-04
2025-08-30 16:39:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:40:45 - pico-train - INFO - Step 1500 -- 🔄 Training Metrics
2025-08-30 16:40:45 - pico-train - INFO - ├── Loss: 7.1582
2025-08-30 16:40:45 - pico-train - INFO - ├── Learning Rate: 1.50e-04
2025-08-30 16:40:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:41:56 - pico-train - INFO - Step 1600 -- 🔄 Training Metrics
2025-08-30 16:41:56 - pico-train - INFO - ├── Loss: 7.0463
2025-08-30 16:41:56 - pico-train - INFO - ├── Learning Rate: 1.60e-04
2025-08-30 16:41:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:43:08 - pico-train - INFO - Step 1700 -- 🔄 Training Metrics
2025-08-30 16:43:08 - pico-train - INFO - ├── Loss: 6.9729
2025-08-30 16:43:08 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 16:43:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:44:19 - pico-train - INFO - Step 1800 -- 🔄 Training Metrics
2025-08-30 16:44:19 - pico-train - INFO - ├── Loss: 6.8826
2025-08-30 16:44:19 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 16:44:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:45:30 - pico-train - INFO - Step 1900 -- 🔄 Training Metrics
2025-08-30 16:45:30 - pico-train - INFO - ├── Loss: 6.8004
2025-08-30 16:45:30 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 16:45:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:46:40 - pico-train - INFO - Step 2000 -- 💾 Saving Checkpoint
2025-08-30 16:48:43 - pico-train - INFO - Step 2000 -- 📊 Evaluation Results
2025-08-30 16:48:43 - pico-train - INFO - └── paloma: 9.921210281541321e+20
2025-08-30 16:48:43 - pico-train - INFO - Step 2000 -- 🔄 Training Metrics
2025-08-30 16:48:43 - pico-train - INFO - ├── Loss: 6.7360
2025-08-30 16:48:43 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 16:48:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:48:43 - pico-train - INFO - Step 2000 -- 📈 Saving Learning Dynamics
2025-08-30 16:49:57 - pico-train - INFO - Step 2100 -- 🔄 Training Metrics
2025-08-30 16:49:57 - pico-train - INFO - ├── Loss: 6.6659
2025-08-30 16:49:57 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 16:49:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:51:08 - pico-train - INFO - Step 2200 -- 🔄 Training Metrics
2025-08-30 16:51:08 - pico-train - INFO - ├── Loss: 6.6041
2025-08-30 16:51:08 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 16:51:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:52:20 - pico-train - INFO - Step 2300 -- 🔄 Training Metrics
2025-08-30 16:52:20 - pico-train - INFO - ├── Loss: 6.5361
2025-08-30 16:52:20 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 16:52:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:53:30 - pico-train - INFO - Step 2400 -- 🔄 Training Metrics
2025-08-30 16:53:30 - pico-train - INFO - ├── Loss: 6.5012
2025-08-30 16:53:30 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 16:53:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:54:41 - pico-train - INFO - Step 2500 -- 🔄 Training Metrics
2025-08-30 16:54:41 - pico-train - INFO - ├── Loss: 6.4542
2025-08-30 16:54:41 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 16:54:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:55:53 - pico-train - INFO - Step 2600 -- 🔄 Training Metrics
2025-08-30 16:55:53 - pico-train - INFO - ├── Loss: 6.4300
2025-08-30 16:55:53 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 16:55:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:57:04 - pico-train - INFO - Step 2700 -- 🔄 Training Metrics
2025-08-30 16:57:04 - pico-train - INFO - ├── Loss: 6.3678
2025-08-30 16:57:04 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 16:57:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:58:15 - pico-train - INFO - Step 2800 -- 🔄 Training Metrics
2025-08-30 16:58:15 - pico-train - INFO - ├── Loss: 6.3538
2025-08-30 16:58:15 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 16:58:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 16:59:26 - pico-train - INFO - Step 2900 -- 🔄 Training Metrics
2025-08-30 16:59:26 - pico-train - INFO - ├── Loss: 6.3226
2025-08-30 16:59:26 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 16:59:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:00:37 - pico-train - INFO - Step 3000 -- 💾 Saving Checkpoint
2025-08-30 17:02:39 - pico-train - INFO - Step 3000 -- 📊 Evaluation Results
2025-08-30 17:02:39 - pico-train - INFO - └── paloma: 1.410569515309224e+22
2025-08-30 17:02:39 - pico-train - INFO - Step 3000 -- 🔄 Training Metrics
2025-08-30 17:02:39 - pico-train - INFO - ├── Loss: 6.2808
2025-08-30 17:02:39 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:02:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:02:39 - pico-train - INFO - Step 3000 -- 📈 Saving Learning Dynamics
2025-08-30 17:03:54 - pico-train - INFO - Step 3100 -- 🔄 Training Metrics
2025-08-30 17:03:54 - pico-train - INFO - ├── Loss: 6.2628
2025-08-30 17:03:54 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:03:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:05:05 - pico-train - INFO - Step 3200 -- 🔄 Training Metrics
2025-08-30 17:05:05 - pico-train - INFO - ├── Loss: 6.2207
2025-08-30 17:05:05 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:05:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:06:15 - pico-train - INFO - Step 3300 -- 🔄 Training Metrics
2025-08-30 17:06:15 - pico-train - INFO - ├── Loss: 6.2142
2025-08-30 17:06:15 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:06:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:07:26 - pico-train - INFO - Step 3400 -- 🔄 Training Metrics
2025-08-30 17:07:26 - pico-train - INFO - ├── Loss: 6.1526
2025-08-30 17:07:26 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:07:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:08:37 - pico-train - INFO - Step 3500 -- 🔄 Training Metrics
2025-08-30 17:08:37 - pico-train - INFO - ├── Loss: 6.1105
2025-08-30 17:08:37 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:08:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:09:49 - pico-train - INFO - Step 3600 -- 🔄 Training Metrics
2025-08-30 17:09:49 - pico-train - INFO - ├── Loss: 6.1327
2025-08-30 17:09:49 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:09:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:11:00 - pico-train - INFO - Step 3700 -- 🔄 Training Metrics
2025-08-30 17:11:00 - pico-train - INFO - ├── Loss: 6.1047
2025-08-30 17:11:00 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:11:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:12:11 - pico-train - INFO - Step 3800 -- 🔄 Training Metrics
2025-08-30 17:12:11 - pico-train - INFO - ├── Loss: 6.0910
2025-08-30 17:12:11 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:12:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:13:23 - pico-train - INFO - Step 3900 -- 🔄 Training Metrics
2025-08-30 17:13:23 - pico-train - INFO - ├── Loss: 6.0370
2025-08-30 17:13:23 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:13:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:14:33 - pico-train - INFO - Step 4000 -- 💾 Saving Checkpoint
2025-08-30 17:16:35 - pico-train - INFO - Step 4000 -- 📊 Evaluation Results
2025-08-30 17:16:35 - pico-train - INFO - └── paloma: 3.516532942334589e+23
2025-08-30 17:16:35 - pico-train - INFO - Step 4000 -- 🔄 Training Metrics
2025-08-30 17:16:35 - pico-train - INFO - ├── Loss: 6.0450
2025-08-30 17:16:35 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:16:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:16:35 - pico-train - INFO - Step 4000 -- 📈 Saving Learning Dynamics
2025-08-30 17:17:49 - pico-train - INFO - Step 4100 -- 🔄 Training Metrics
2025-08-30 17:17:49 - pico-train - INFO - ├── Loss: 6.0121
2025-08-30 17:17:49 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:17:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:19:00 - pico-train - INFO - Step 4200 -- 🔄 Training Metrics
2025-08-30 17:19:00 - pico-train - INFO - ├── Loss: 5.9897
2025-08-30 17:19:00 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:19:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:20:11 - pico-train - INFO - Step 4300 -- 🔄 Training Metrics
2025-08-30 17:20:11 - pico-train - INFO - ├── Loss: 5.9636
2025-08-30 17:20:11 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:20:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:21:22 - pico-train - INFO - Step 4400 -- 🔄 Training Metrics
2025-08-30 17:21:22 - pico-train - INFO - ├── Loss: 5.9760
2025-08-30 17:21:22 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:21:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:22:33 - pico-train - INFO - Step 4500 -- 🔄 Training Metrics
2025-08-30 17:22:33 - pico-train - INFO - ├── Loss: 5.9551
2025-08-30 17:22:33 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:22:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:23:45 - pico-train - INFO - Step 4600 -- 🔄 Training Metrics
2025-08-30 17:23:45 - pico-train - INFO - ├── Loss: 5.9166
2025-08-30 17:23:45 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:23:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:24:55 - pico-train - INFO - Step 4700 -- 🔄 Training Metrics
2025-08-30 17:24:55 - pico-train - INFO - ├── Loss: 5.8940
2025-08-30 17:24:55 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:24:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:26:07 - pico-train - INFO - Step 4800 -- 🔄 Training Metrics
2025-08-30 17:26:07 - pico-train - INFO - ├── Loss: 5.8770
2025-08-30 17:26:07 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:26:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:27:19 - pico-train - INFO - Step 4900 -- 🔄 Training Metrics
2025-08-30 17:27:19 - pico-train - INFO - ├── Loss: 5.8605
2025-08-30 17:27:19 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:27:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:28:29 - pico-train - INFO - Step 5000 -- 💾 Saving Checkpoint
2025-08-30 17:30:32 - pico-train - INFO - Step 5000 -- 📊 Evaluation Results
2025-08-30 17:30:32 - pico-train - INFO - └── paloma: 1.1285856883470545e+25
2025-08-30 17:30:32 - pico-train - INFO - Step 5000 -- 🔄 Training Metrics
2025-08-30 17:30:32 - pico-train - INFO - ├── Loss: 5.8727
2025-08-30 17:30:32 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:30:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:30:32 - pico-train - INFO - Step 5000 -- 📈 Saving Learning Dynamics
2025-08-30 17:31:46 - pico-train - INFO - Step 5100 -- 🔄 Training Metrics
2025-08-30 17:31:46 - pico-train - INFO - ├── Loss: 5.8447
2025-08-30 17:31:46 - pico-train - INFO - ├── Learning Rate: 2.00e-04
2025-08-30 17:31:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:32:58 - pico-train - INFO - Step 5200 -- 🔄 Training Metrics
2025-08-30 17:32:58 - pico-train - INFO - ├── Loss: 5.8290
2025-08-30 17:32:58 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:32:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:34:09 - pico-train - INFO - Step 5300 -- 🔄 Training Metrics
2025-08-30 17:34:09 - pico-train - INFO - ├── Loss: 5.8219
2025-08-30 17:34:09 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:34:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:35:20 - pico-train - INFO - Step 5400 -- 🔄 Training Metrics
2025-08-30 17:35:20 - pico-train - INFO - ├── Loss: 5.8133
2025-08-30 17:35:20 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:35:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:36:31 - pico-train - INFO - Step 5500 -- 🔄 Training Metrics
2025-08-30 17:36:31 - pico-train - INFO - ├── Loss: 5.7878
2025-08-30 17:36:31 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:36:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:37:42 - pico-train - INFO - Step 5600 -- 🔄 Training Metrics
2025-08-30 17:37:42 - pico-train - INFO - ├── Loss: 5.7639
2025-08-30 17:37:42 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:37:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:38:53 - pico-train - INFO - Step 5700 -- 🔄 Training Metrics
2025-08-30 17:38:53 - pico-train - INFO - ├── Loss: 5.7699
2025-08-30 17:38:53 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:38:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:40:04 - pico-train - INFO - Step 5800 -- 🔄 Training Metrics
2025-08-30 17:40:04 - pico-train - INFO - ├── Loss: 5.7458
2025-08-30 17:40:04 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:40:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:41:15 - pico-train - INFO - Step 5900 -- 🔄 Training Metrics
2025-08-30 17:41:15 - pico-train - INFO - ├── Loss: 5.7482
2025-08-30 17:41:15 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:41:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:42:26 - pico-train - INFO - Step 6000 -- 💾 Saving Checkpoint
2025-08-30 17:44:27 - pico-train - INFO - Step 6000 -- 📊 Evaluation Results
2025-08-30 17:44:27 - pico-train - INFO - └── paloma: 5.5635981050246924e+26
2025-08-30 17:44:28 - pico-train - INFO - Step 6000 -- 🔄 Training Metrics
2025-08-30 17:44:28 - pico-train - INFO - ├── Loss: 5.7402
2025-08-30 17:44:28 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:44:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:44:28 - pico-train - INFO - Step 6000 -- 📈 Saving Learning Dynamics
2025-08-30 17:45:41 - pico-train - INFO - Step 6100 -- 🔄 Training Metrics
2025-08-30 17:45:41 - pico-train - INFO - ├── Loss: 5.7377
2025-08-30 17:45:41 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:45:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:46:53 - pico-train - INFO - Step 6200 -- 🔄 Training Metrics
2025-08-30 17:46:53 - pico-train - INFO - ├── Loss: 5.6951
2025-08-30 17:46:53 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:46:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:48:04 - pico-train - INFO - Step 6300 -- 🔄 Training Metrics
2025-08-30 17:48:04 - pico-train - INFO - ├── Loss: 5.6844
2025-08-30 17:48:04 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:48:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:49:17 - pico-train - INFO - Step 6400 -- 🔄 Training Metrics
2025-08-30 17:49:17 - pico-train - INFO - ├── Loss: 5.6903
2025-08-30 17:49:17 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:49:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:50:27 - pico-train - INFO - Step 6500 -- 🔄 Training Metrics
2025-08-30 17:50:27 - pico-train - INFO - ├── Loss: 5.6878
2025-08-30 17:50:27 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:50:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:51:38 - pico-train - INFO - Step 6600 -- 🔄 Training Metrics
2025-08-30 17:51:38 - pico-train - INFO - ├── Loss: 5.6539
2025-08-30 17:51:38 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:51:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:52:49 - pico-train - INFO - Step 6700 -- 🔄 Training Metrics
2025-08-30 17:52:49 - pico-train - INFO - ├── Loss: 5.6436
2025-08-30 17:52:49 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:52:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:54:01 - pico-train - INFO - Step 6800 -- 🔄 Training Metrics
2025-08-30 17:54:01 - pico-train - INFO - ├── Loss: 5.6443
2025-08-30 17:54:01 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:54:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:55:12 - pico-train - INFO - Step 6900 -- 🔄 Training Metrics
2025-08-30 17:55:12 - pico-train - INFO - ├── Loss: 5.6237
2025-08-30 17:55:12 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:55:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:56:22 - pico-train - INFO - Step 7000 -- 💾 Saving Checkpoint
2025-08-30 17:58:23 - pico-train - INFO - Step 7000 -- 📊 Evaluation Results
2025-08-30 17:58:23 - pico-train - INFO - └── paloma: 5.625059129813264e+28
2025-08-30 17:58:24 - pico-train - INFO - Step 7000 -- 🔄 Training Metrics
2025-08-30 17:58:24 - pico-train - INFO - ├── Loss: 5.6189
2025-08-30 17:58:24 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:58:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 17:58:24 - pico-train - INFO - Step 7000 -- 📈 Saving Learning Dynamics
2025-08-30 17:59:37 - pico-train - INFO - Step 7100 -- 🔄 Training Metrics
2025-08-30 17:59:37 - pico-train - INFO - ├── Loss: 5.5781
2025-08-30 17:59:37 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 17:59:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:00:48 - pico-train - INFO - Step 7200 -- 🔄 Training Metrics
2025-08-30 18:00:48 - pico-train - INFO - ├── Loss: 5.6113
2025-08-30 18:00:48 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 18:00:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:01:59 - pico-train - INFO - Step 7300 -- 🔄 Training Metrics
2025-08-30 18:01:59 - pico-train - INFO - ├── Loss: 5.5985
2025-08-30 18:01:59 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 18:01:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:03:11 - pico-train - INFO - Step 7400 -- 🔄 Training Metrics
2025-08-30 18:03:11 - pico-train - INFO - ├── Loss: 5.6009
2025-08-30 18:03:11 - pico-train - INFO - ├── Learning Rate: 1.99e-04
2025-08-30 18:03:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:04:21 - pico-train - INFO - Step 7500 -- 🔄 Training Metrics
2025-08-30 18:04:21 - pico-train - INFO - ├── Loss: 5.5714
2025-08-30 18:04:21 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:04:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:05:32 - pico-train - INFO - Step 7600 -- 🔄 Training Metrics
2025-08-30 18:05:32 - pico-train - INFO - ├── Loss: 5.5714
2025-08-30 18:05:32 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:05:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:06:44 - pico-train - INFO - Step 7700 -- 🔄 Training Metrics
2025-08-30 18:06:44 - pico-train - INFO - ├── Loss: 5.5653
2025-08-30 18:06:44 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:06:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:07:55 - pico-train - INFO - Step 7800 -- 🔄 Training Metrics
2025-08-30 18:07:55 - pico-train - INFO - ├── Loss: 5.5558
2025-08-30 18:07:55 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:07:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:09:06 - pico-train - INFO - Step 7900 -- 🔄 Training Metrics
2025-08-30 18:09:06 - pico-train - INFO - ├── Loss: 5.5568
2025-08-30 18:09:06 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:09:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:10:16 - pico-train - INFO - Step 8000 -- 💾 Saving Checkpoint
2025-08-30 18:12:18 - pico-train - INFO - Step 8000 -- 📊 Evaluation Results
2025-08-30 18:12:18 - pico-train - INFO - └── paloma: 2.8189708405471985e+30
2025-08-30 18:12:19 - pico-train - INFO - Step 8000 -- 🔄 Training Metrics
2025-08-30 18:12:19 - pico-train - INFO - ├── Loss: 5.5268
2025-08-30 18:12:19 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:12:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:12:19 - pico-train - INFO - Step 8000 -- 📈 Saving Learning Dynamics
2025-08-30 18:13:32 - pico-train - INFO - Step 8100 -- 🔄 Training Metrics
2025-08-30 18:13:32 - pico-train - INFO - ├── Loss: 5.5294
2025-08-30 18:13:32 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:13:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:14:43 - pico-train - INFO - Step 8200 -- 🔄 Training Metrics
2025-08-30 18:14:43 - pico-train - INFO - ├── Loss: 5.5350
2025-08-30 18:14:43 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:14:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:15:54 - pico-train - INFO - Step 8300 -- 🔄 Training Metrics
2025-08-30 18:15:54 - pico-train - INFO - ├── Loss: 5.5279
2025-08-30 18:15:54 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:15:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:17:05 - pico-train - INFO - Step 8400 -- 🔄 Training Metrics
2025-08-30 18:17:05 - pico-train - INFO - ├── Loss: 5.4997
2025-08-30 18:17:05 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:17:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:18:16 - pico-train - INFO - Step 8500 -- 🔄 Training Metrics
2025-08-30 18:18:16 - pico-train - INFO - ├── Loss: 5.4873
2025-08-30 18:18:16 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:18:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:19:27 - pico-train - INFO - Step 8600 -- 🔄 Training Metrics
2025-08-30 18:19:27 - pico-train - INFO - ├── Loss: 5.5047
2025-08-30 18:19:27 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:19:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:20:38 - pico-train - INFO - Step 8700 -- 🔄 Training Metrics
2025-08-30 18:20:38 - pico-train - INFO - ├── Loss: 5.4809
2025-08-30 18:20:38 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:20:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:21:49 - pico-train - INFO - Step 8800 -- 🔄 Training Metrics
2025-08-30 18:21:49 - pico-train - INFO - ├── Loss: 5.4858
2025-08-30 18:21:49 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:21:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:23:00 - pico-train - INFO - Step 8900 -- 🔄 Training Metrics
2025-08-30 18:23:00 - pico-train - INFO - ├── Loss: 5.4689
2025-08-30 18:23:00 - pico-train - INFO - ├── Learning Rate: 1.98e-04
2025-08-30 18:23:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:24:12 - pico-train - INFO - Step 9000 -- 💾 Saving Checkpoint
2025-08-30 18:26:13 - pico-train - INFO - Step 9000 -- 📊 Evaluation Results
2025-08-30 18:26:13 - pico-train - INFO - └── paloma: 3.5931189083569604e+33
2025-08-30 18:26:14 - pico-train - INFO - Step 9000 -- 🔄 Training Metrics
2025-08-30 18:26:14 - pico-train - INFO - ├── Loss: 5.4476
2025-08-30 18:26:14 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:26:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:26:14 - pico-train - INFO - Step 9000 -- 📈 Saving Learning Dynamics
2025-08-30 18:27:28 - pico-train - INFO - Step 9100 -- 🔄 Training Metrics
2025-08-30 18:27:28 - pico-train - INFO - ├── Loss: 5.4383
2025-08-30 18:27:28 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:27:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:28:38 - pico-train - INFO - Step 9200 -- 🔄 Training Metrics
2025-08-30 18:28:38 - pico-train - INFO - ├── Loss: 5.4568
2025-08-30 18:28:38 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:28:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:29:50 - pico-train - INFO - Step 9300 -- 🔄 Training Metrics
2025-08-30 18:29:50 - pico-train - INFO - ├── Loss: 5.4368
2025-08-30 18:29:50 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:29:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:31:01 - pico-train - INFO - Step 9400 -- 🔄 Training Metrics
2025-08-30 18:31:01 - pico-train - INFO - ├── Loss: 5.4623
2025-08-30 18:31:01 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:31:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:32:12 - pico-train - INFO - Step 9500 -- 🔄 Training Metrics
2025-08-30 18:32:12 - pico-train - INFO - ├── Loss: 5.4268
2025-08-30 18:32:12 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:32:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:33:24 - pico-train - INFO - Step 9600 -- 🔄 Training Metrics
2025-08-30 18:33:24 - pico-train - INFO - ├── Loss: 5.4640
2025-08-30 18:33:24 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:33:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:34:34 - pico-train - INFO - Step 9700 -- 🔄 Training Metrics
2025-08-30 18:34:34 - pico-train - INFO - ├── Loss: 5.4520
2025-08-30 18:34:34 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:34:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:35:45 - pico-train - INFO - Step 9800 -- 🔄 Training Metrics
2025-08-30 18:35:45 - pico-train - INFO - ├── Loss: 5.4139
2025-08-30 18:35:45 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:35:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:36:56 - pico-train - INFO - Step 9900 -- 🔄 Training Metrics
2025-08-30 18:36:56 - pico-train - INFO - ├── Loss: 5.4025
2025-08-30 18:36:56 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:36:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:38:07 - pico-train - INFO - Step 10000 -- 💾 Saving Checkpoint
2025-08-30 18:40:09 - pico-train - INFO - Step 10000 -- 📊 Evaluation Results
2025-08-30 18:40:09 - pico-train - INFO - └── paloma: 9.887912407124033e+34
2025-08-30 18:40:10 - pico-train - INFO - Step 10000 -- 🔄 Training Metrics
2025-08-30 18:40:10 - pico-train - INFO - ├── Loss: 5.4191
2025-08-30 18:40:10 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:40:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:40:10 - pico-train - INFO - Step 10000 -- 📈 Saving Learning Dynamics
2025-08-30 18:41:23 - pico-train - INFO - Step 10100 -- 🔄 Training Metrics
2025-08-30 18:41:23 - pico-train - INFO - ├── Loss: 5.3756
2025-08-30 18:41:23 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:41:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:42:34 - pico-train - INFO - Step 10200 -- 🔄 Training Metrics
2025-08-30 18:42:34 - pico-train - INFO - ├── Loss: 5.3976
2025-08-30 18:42:34 - pico-train - INFO - ├── Learning Rate: 1.97e-04
2025-08-30 18:42:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:43:46 - pico-train - INFO - Step 10300 -- 🔄 Training Metrics
2025-08-30 18:43:46 - pico-train - INFO - ├── Loss: 5.4048
2025-08-30 18:43:46 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:43:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:44:57 - pico-train - INFO - Step 10400 -- 🔄 Training Metrics
2025-08-30 18:44:57 - pico-train - INFO - ├── Loss: 5.3990
2025-08-30 18:44:57 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:44:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:46:08 - pico-train - INFO - Step 10500 -- 🔄 Training Metrics
2025-08-30 18:46:08 - pico-train - INFO - ├── Loss: 5.4016
2025-08-30 18:46:08 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:46:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:47:19 - pico-train - INFO - Step 10600 -- 🔄 Training Metrics
2025-08-30 18:47:19 - pico-train - INFO - ├── Loss: 5.3924
2025-08-30 18:47:19 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:47:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:48:30 - pico-train - INFO - Step 10700 -- 🔄 Training Metrics
2025-08-30 18:48:30 - pico-train - INFO - ├── Loss: 5.3780
2025-08-30 18:48:30 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:48:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:49:41 - pico-train - INFO - Step 10800 -- 🔄 Training Metrics
2025-08-30 18:49:41 - pico-train - INFO - ├── Loss: 5.3432
2025-08-30 18:49:41 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:49:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:50:52 - pico-train - INFO - Step 10900 -- 🔄 Training Metrics
2025-08-30 18:50:52 - pico-train - INFO - ├── Loss: 5.3610
2025-08-30 18:50:52 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:50:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:52:03 - pico-train - INFO - Step 11000 -- 💾 Saving Checkpoint
2025-08-30 18:54:05 - pico-train - INFO - Step 11000 -- 📊 Evaluation Results
2025-08-30 18:54:05 - pico-train - INFO - └── paloma: inf
2025-08-30 18:54:06 - pico-train - INFO - Step 11000 -- 🔄 Training Metrics
2025-08-30 18:54:06 - pico-train - INFO - ├── Loss: 5.3561
2025-08-30 18:54:06 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:54:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:54:06 - pico-train - INFO - Step 11000 -- 📈 Saving Learning Dynamics
2025-08-30 18:55:19 - pico-train - INFO - Step 11100 -- 🔄 Training Metrics
2025-08-30 18:55:19 - pico-train - INFO - ├── Loss: 5.3818
2025-08-30 18:55:19 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:55:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:56:30 - pico-train - INFO - Step 11200 -- 🔄 Training Metrics
2025-08-30 18:56:30 - pico-train - INFO - ├── Loss: 5.3595
2025-08-30 18:56:30 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:56:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:57:42 - pico-train - INFO - Step 11300 -- 🔄 Training Metrics
2025-08-30 18:57:42 - pico-train - INFO - ├── Loss: 5.3557
2025-08-30 18:57:42 - pico-train - INFO - ├── Learning Rate: 1.96e-04
2025-08-30 18:57:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 18:58:53 - pico-train - INFO - Step 11400 -- 🔄 Training Metrics
2025-08-30 18:58:53 - pico-train - INFO - ├── Loss: 5.3483
2025-08-30 18:58:53 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 18:58:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:00:04 - pico-train - INFO - Step 11500 -- 🔄 Training Metrics
2025-08-30 19:00:04 - pico-train - INFO - ├── Loss: 5.3398
2025-08-30 19:00:04 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 19:00:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:01:15 - pico-train - INFO - Step 11600 -- 🔄 Training Metrics
2025-08-30 19:01:15 - pico-train - INFO - ├── Loss: 5.3252
2025-08-30 19:01:15 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 19:01:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:02:27 - pico-train - INFO - Step 11700 -- 🔄 Training Metrics
2025-08-30 19:02:27 - pico-train - INFO - ├── Loss: 5.3194
2025-08-30 19:02:27 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 19:02:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:03:38 - pico-train - INFO - Step 11800 -- 🔄 Training Metrics
2025-08-30 19:03:38 - pico-train - INFO - ├── Loss: 5.3386
2025-08-30 19:03:38 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 19:03:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:04:49 - pico-train - INFO - Step 11900 -- 🔄 Training Metrics
2025-08-30 19:04:49 - pico-train - INFO - ├── Loss: 5.3458
2025-08-30 19:04:49 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 19:04:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:05:59 - pico-train - INFO - Step 12000 -- 💾 Saving Checkpoint
2025-08-30 19:08:01 - pico-train - INFO - Step 12000 -- 📊 Evaluation Results
2025-08-30 19:08:01 - pico-train - INFO - └── paloma: inf
2025-08-30 19:08:01 - pico-train - INFO - Step 12000 -- 🔄 Training Metrics
2025-08-30 19:08:01 - pico-train - INFO - ├── Loss: 5.3216
2025-08-30 19:08:01 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 19:08:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:08:01 - pico-train - INFO - Step 12000 -- 📈 Saving Learning Dynamics
2025-08-30 19:09:15 - pico-train - INFO - Step 12100 -- 🔄 Training Metrics
2025-08-30 19:09:15 - pico-train - INFO - ├── Loss: 5.3164
2025-08-30 19:09:15 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 19:09:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:10:26 - pico-train - INFO - Step 12200 -- 🔄 Training Metrics
2025-08-30 19:10:26 - pico-train - INFO - ├── Loss: 5.3075
2025-08-30 19:10:26 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 19:10:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:11:38 - pico-train - INFO - Step 12300 -- 🔄 Training Metrics
2025-08-30 19:11:38 - pico-train - INFO - ├── Loss: 5.2907
2025-08-30 19:11:38 - pico-train - INFO - ├── Learning Rate: 1.95e-04
2025-08-30 19:11:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:12:49 - pico-train - INFO - Step 12400 -- 🔄 Training Metrics
2025-08-30 19:12:49 - pico-train - INFO - ├── Loss: 5.2970
2025-08-30 19:12:49 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 19:12:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:14:00 - pico-train - INFO - Step 12500 -- 🔄 Training Metrics
2025-08-30 19:14:00 - pico-train - INFO - ├── Loss: 5.2849
2025-08-30 19:14:00 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 19:14:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:15:12 - pico-train - INFO - Step 12600 -- 🔄 Training Metrics
2025-08-30 19:15:12 - pico-train - INFO - ├── Loss: 5.3026
2025-08-30 19:15:12 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 19:15:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:16:23 - pico-train - INFO - Step 12700 -- 🔄 Training Metrics
2025-08-30 19:16:23 - pico-train - INFO - ├── Loss: 5.2792
2025-08-30 19:16:23 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 19:16:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:17:35 - pico-train - INFO - Step 12800 -- 🔄 Training Metrics
2025-08-30 19:17:35 - pico-train - INFO - ├── Loss: 5.3026
2025-08-30 19:17:35 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 19:17:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:18:46 - pico-train - INFO - Step 12900 -- 🔄 Training Metrics
2025-08-30 19:18:46 - pico-train - INFO - ├── Loss: 5.2918
2025-08-30 19:18:46 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 19:18:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:19:56 - pico-train - INFO - Step 13000 -- 💾 Saving Checkpoint
2025-08-30 19:21:58 - pico-train - INFO - Step 13000 -- 📊 Evaluation Results
2025-08-30 19:21:58 - pico-train - INFO - └── paloma: inf
2025-08-30 19:21:59 - pico-train - INFO - Step 13000 -- 🔄 Training Metrics
2025-08-30 19:21:59 - pico-train - INFO - ├── Loss: 5.3032
2025-08-30 19:21:59 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 19:21:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:21:59 - pico-train - INFO - Step 13000 -- 📈 Saving Learning Dynamics
2025-08-30 19:23:13 - pico-train - INFO - Step 13100 -- 🔄 Training Metrics
2025-08-30 19:23:13 - pico-train - INFO - ├── Loss: 5.2888
2025-08-30 19:23:13 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 19:23:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:24:24 - pico-train - INFO - Step 13200 -- 🔄 Training Metrics
2025-08-30 19:24:24 - pico-train - INFO - ├── Loss: 5.2853
2025-08-30 19:24:24 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 19:24:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:25:35 - pico-train - INFO - Step 13300 -- 🔄 Training Metrics
2025-08-30 19:25:35 - pico-train - INFO - ├── Loss: 5.2828
2025-08-30 19:25:35 - pico-train - INFO - ├── Learning Rate: 1.94e-04
2025-08-30 19:25:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:26:46 - pico-train - INFO - Step 13400 -- 🔄 Training Metrics
2025-08-30 19:26:46 - pico-train - INFO - ├── Loss: 5.2770
2025-08-30 19:26:46 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 19:26:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:27:57 - pico-train - INFO - Step 13500 -- 🔄 Training Metrics
2025-08-30 19:27:57 - pico-train - INFO - ├── Loss: 5.2645
2025-08-30 19:27:57 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 19:27:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:29:08 - pico-train - INFO - Step 13600 -- 🔄 Training Metrics
2025-08-30 19:29:08 - pico-train - INFO - ├── Loss: 5.2692
2025-08-30 19:29:08 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 19:29:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:30:19 - pico-train - INFO - Step 13700 -- 🔄 Training Metrics
2025-08-30 19:30:19 - pico-train - INFO - ├── Loss: 5.2500
2025-08-30 19:30:19 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 19:30:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:31:31 - pico-train - INFO - Step 13800 -- 🔄 Training Metrics
2025-08-30 19:31:31 - pico-train - INFO - ├── Loss: 5.2655
2025-08-30 19:31:31 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 19:31:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:32:41 - pico-train - INFO - Step 13900 -- 🔄 Training Metrics
2025-08-30 19:32:41 - pico-train - INFO - ├── Loss: 5.2590
2025-08-30 19:32:41 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 19:32:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:33:52 - pico-train - INFO - Step 14000 -- 💾 Saving Checkpoint
2025-08-30 19:35:59 - pico-train - INFO - Step 14000 -- 📊 Evaluation Results
2025-08-30 19:35:59 - pico-train - INFO - └── paloma: inf
2025-08-30 19:36:00 - pico-train - INFO - Step 14000 -- 🔄 Training Metrics
2025-08-30 19:36:00 - pico-train - INFO - ├── Loss: 5.2630
2025-08-30 19:36:00 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 19:36:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:36:00 - pico-train - INFO - Step 14000 -- 📈 Saving Learning Dynamics
2025-08-30 19:37:15 - pico-train - INFO - Step 14100 -- 🔄 Training Metrics
2025-08-30 19:37:15 - pico-train - INFO - ├── Loss: 5.2382
2025-08-30 19:37:15 - pico-train - INFO - ├── Learning Rate: 1.93e-04
2025-08-30 19:37:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:38:25 - pico-train - INFO - Step 14200 -- 🔄 Training Metrics
2025-08-30 19:38:25 - pico-train - INFO - ├── Loss: 5.2504
2025-08-30 19:38:25 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 19:38:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:39:37 - pico-train - INFO - Step 14300 -- 🔄 Training Metrics
2025-08-30 19:39:37 - pico-train - INFO - ├── Loss: 5.2443
2025-08-30 19:39:37 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 19:39:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:40:48 - pico-train - INFO - Step 14400 -- 🔄 Training Metrics
2025-08-30 19:40:48 - pico-train - INFO - ├── Loss: 5.2409
2025-08-30 19:40:48 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 19:40:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:41:59 - pico-train - INFO - Step 14500 -- 🔄 Training Metrics
2025-08-30 19:41:59 - pico-train - INFO - ├── Loss: 5.2553
2025-08-30 19:41:59 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 19:41:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:43:10 - pico-train - INFO - Step 14600 -- 🔄 Training Metrics
2025-08-30 19:43:10 - pico-train - INFO - ├── Loss: 5.2326
2025-08-30 19:43:10 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 19:43:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:44:21 - pico-train - INFO - Step 14700 -- 🔄 Training Metrics
2025-08-30 19:44:21 - pico-train - INFO - ├── Loss: 5.2397
2025-08-30 19:44:21 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 19:44:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:45:33 - pico-train - INFO - Step 14800 -- 🔄 Training Metrics
2025-08-30 19:45:33 - pico-train - INFO - ├── Loss: 5.2355
2025-08-30 19:45:33 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 19:45:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:46:44 - pico-train - INFO - Step 14900 -- 🔄 Training Metrics
2025-08-30 19:46:44 - pico-train - INFO - ├── Loss: 5.2328
2025-08-30 19:46:44 - pico-train - INFO - ├── Learning Rate: 1.92e-04
2025-08-30 19:46:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:47:54 - pico-train - INFO - Step 15000 -- 💾 Saving Checkpoint
2025-08-30 19:49:56 - pico-train - INFO - Step 15000 -- 📊 Evaluation Results
2025-08-30 19:49:56 - pico-train - INFO - └── paloma: inf
2025-08-30 19:49:56 - pico-train - INFO - Step 15000 -- 🔄 Training Metrics
2025-08-30 19:49:56 - pico-train - INFO - ├── Loss: 5.2308
2025-08-30 19:49:56 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 19:49:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:49:56 - pico-train - INFO - Step 15000 -- 📈 Saving Learning Dynamics
2025-08-30 19:51:11 - pico-train - INFO - Step 15100 -- 🔄 Training Metrics
2025-08-30 19:51:11 - pico-train - INFO - ├── Loss: 5.2283
2025-08-30 19:51:11 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 19:51:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:52:21 - pico-train - INFO - Step 15200 -- 🔄 Training Metrics
2025-08-30 19:52:21 - pico-train - INFO - ├── Loss: 5.2300
2025-08-30 19:52:21 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 19:52:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:53:32 - pico-train - INFO - Step 15300 -- 🔄 Training Metrics
2025-08-30 19:53:32 - pico-train - INFO - ├── Loss: 5.2523
2025-08-30 19:53:32 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 19:53:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:54:44 - pico-train - INFO - Step 15400 -- 🔄 Training Metrics
2025-08-30 19:54:44 - pico-train - INFO - ├── Loss: 5.2163
2025-08-30 19:54:44 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 19:54:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:55:55 - pico-train - INFO - Step 15500 -- 🔄 Training Metrics
2025-08-30 19:55:55 - pico-train - INFO - ├── Loss: 5.2331
2025-08-30 19:55:55 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 19:55:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:57:06 - pico-train - INFO - Step 15600 -- 🔄 Training Metrics
2025-08-30 19:57:06 - pico-train - INFO - ├── Loss: 5.2262
2025-08-30 19:57:06 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 19:57:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:58:17 - pico-train - INFO - Step 15700 -- 🔄 Training Metrics
2025-08-30 19:58:17 - pico-train - INFO - ├── Loss: 5.2087
2025-08-30 19:58:17 - pico-train - INFO - ├── Learning Rate: 1.91e-04
2025-08-30 19:58:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 19:59:28 - pico-train - INFO - Step 15800 -- 🔄 Training Metrics
2025-08-30 19:59:28 - pico-train - INFO - ├── Loss: 5.2198
2025-08-30 19:59:28 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 19:59:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:00:40 - pico-train - INFO - Step 15900 -- 🔄 Training Metrics
2025-08-30 20:00:40 - pico-train - INFO - ├── Loss: 5.2056
2025-08-30 20:00:40 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 20:00:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:01:50 - pico-train - INFO - Step 16000 -- 💾 Saving Checkpoint
2025-08-30 20:03:52 - pico-train - INFO - Step 16000 -- 📊 Evaluation Results
2025-08-30 20:03:52 - pico-train - INFO - └── paloma: inf
2025-08-30 20:03:53 - pico-train - INFO - Step 16000 -- 🔄 Training Metrics
2025-08-30 20:03:53 - pico-train - INFO - ├── Loss: 5.2087
2025-08-30 20:03:53 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 20:03:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:03:53 - pico-train - INFO - Step 16000 -- 📈 Saving Learning Dynamics
2025-08-30 20:05:06 - pico-train - INFO - Step 16100 -- 🔄 Training Metrics
2025-08-30 20:05:06 - pico-train - INFO - ├── Loss: 5.1931
2025-08-30 20:05:06 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 20:05:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:06:18 - pico-train - INFO - Step 16200 -- 🔄 Training Metrics
2025-08-30 20:06:18 - pico-train - INFO - ├── Loss: 5.1773
2025-08-30 20:06:18 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 20:06:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:07:29 - pico-train - INFO - Step 16300 -- 🔄 Training Metrics
2025-08-30 20:07:29 - pico-train - INFO - ├── Loss: 5.2031
2025-08-30 20:07:29 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 20:07:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:08:40 - pico-train - INFO - Step 16400 -- 🔄 Training Metrics
2025-08-30 20:08:40 - pico-train - INFO - ├── Loss: 5.1868
2025-08-30 20:08:40 - pico-train - INFO - ├── Learning Rate: 1.90e-04
2025-08-30 20:08:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:09:52 - pico-train - INFO - Step 16500 -- 🔄 Training Metrics
2025-08-30 20:09:52 - pico-train - INFO - ├── Loss: 5.1763
2025-08-30 20:09:52 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 20:09:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:11:02 - pico-train - INFO - Step 16600 -- 🔄 Training Metrics
2025-08-30 20:11:02 - pico-train - INFO - ├── Loss: 5.2021
2025-08-30 20:11:02 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 20:11:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:12:15 - pico-train - INFO - Step 16700 -- 🔄 Training Metrics
2025-08-30 20:12:15 - pico-train - INFO - ├── Loss: 5.1796
2025-08-30 20:12:15 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 20:12:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:13:26 - pico-train - INFO - Step 16800 -- 🔄 Training Metrics
2025-08-30 20:13:26 - pico-train - INFO - ├── Loss: 5.1488
2025-08-30 20:13:26 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 20:13:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:14:37 - pico-train - INFO - Step 16900 -- 🔄 Training Metrics
2025-08-30 20:14:37 - pico-train - INFO - ├── Loss: 5.1805
2025-08-30 20:14:37 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 20:14:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:15:47 - pico-train - INFO - Step 17000 -- 💾 Saving Checkpoint
2025-08-30 20:17:49 - pico-train - INFO - Step 17000 -- 📊 Evaluation Results
2025-08-30 20:17:49 - pico-train - INFO - └── paloma: inf
2025-08-30 20:17:50 - pico-train - INFO - Step 17000 -- 🔄 Training Metrics
2025-08-30 20:17:50 - pico-train - INFO - ├── Loss: 5.1846
2025-08-30 20:17:50 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 20:17:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:17:50 - pico-train - INFO - Step 17000 -- 📈 Saving Learning Dynamics
2025-08-30 20:19:04 - pico-train - INFO - Step 17100 -- 🔄 Training Metrics
2025-08-30 20:19:04 - pico-train - INFO - ├── Loss: 5.1825
2025-08-30 20:19:04 - pico-train - INFO - ├── Learning Rate: 1.89e-04
2025-08-30 20:19:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:20:16 - pico-train - INFO - Step 17200 -- 🔄 Training Metrics
2025-08-30 20:20:16 - pico-train - INFO - ├── Loss: 5.1567
2025-08-30 20:20:16 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 20:20:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:21:27 - pico-train - INFO - Step 17300 -- 🔄 Training Metrics
2025-08-30 20:21:27 - pico-train - INFO - ├── Loss: 5.2075
2025-08-30 20:21:27 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 20:21:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:22:39 - pico-train - INFO - Step 17400 -- 🔄 Training Metrics
2025-08-30 20:22:39 - pico-train - INFO - ├── Loss: 5.1649
2025-08-30 20:22:39 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 20:22:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:23:49 - pico-train - INFO - Step 17500 -- 🔄 Training Metrics
2025-08-30 20:23:49 - pico-train - INFO - ├── Loss: 5.1506
2025-08-30 20:23:49 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 20:23:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:25:01 - pico-train - INFO - Step 17600 -- 🔄 Training Metrics
2025-08-30 20:25:01 - pico-train - INFO - ├── Loss: 5.1757
2025-08-30 20:25:01 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 20:25:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:26:12 - pico-train - INFO - Step 17700 -- 🔄 Training Metrics
2025-08-30 20:26:12 - pico-train - INFO - ├── Loss: 5.1580
2025-08-30 20:26:12 - pico-train - INFO - ├── Learning Rate: 1.88e-04
2025-08-30 20:26:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:27:24 - pico-train - INFO - Step 17800 -- 🔄 Training Metrics
2025-08-30 20:27:24 - pico-train - INFO - ├── Loss: 5.1308
2025-08-30 20:27:24 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 20:27:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:28:35 - pico-train - INFO - Step 17900 -- 🔄 Training Metrics
2025-08-30 20:28:35 - pico-train - INFO - ├── Loss: 5.1600
2025-08-30 20:28:35 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 20:28:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:29:45 - pico-train - INFO - Step 18000 -- 💾 Saving Checkpoint
2025-08-30 20:31:47 - pico-train - INFO - Step 18000 -- 📊 Evaluation Results
2025-08-30 20:31:47 - pico-train - INFO - └── paloma: inf
2025-08-30 20:31:48 - pico-train - INFO - Step 18000 -- 🔄 Training Metrics
2025-08-30 20:31:48 - pico-train - INFO - ├── Loss: 5.1611
2025-08-30 20:31:48 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 20:31:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:31:48 - pico-train - INFO - Step 18000 -- 📈 Saving Learning Dynamics
2025-08-30 20:33:02 - pico-train - INFO - Step 18100 -- 🔄 Training Metrics
2025-08-30 20:33:02 - pico-train - INFO - ├── Loss: 5.1555
2025-08-30 20:33:02 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 20:33:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:34:13 - pico-train - INFO - Step 18200 -- 🔄 Training Metrics
2025-08-30 20:34:13 - pico-train - INFO - ├── Loss: 5.1405
2025-08-30 20:34:13 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 20:34:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:35:25 - pico-train - INFO - Step 18300 -- 🔄 Training Metrics
2025-08-30 20:35:25 - pico-train - INFO - ├── Loss: 5.1411
2025-08-30 20:35:25 - pico-train - INFO - ├── Learning Rate: 1.87e-04
2025-08-30 20:35:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:36:35 - pico-train - INFO - Step 18400 -- 🔄 Training Metrics
2025-08-30 20:36:35 - pico-train - INFO - ├── Loss: 5.1467
2025-08-30 20:36:35 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 20:36:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:37:46 - pico-train - INFO - Step 18500 -- 🔄 Training Metrics
2025-08-30 20:37:46 - pico-train - INFO - ├── Loss: 5.1310
2025-08-30 20:37:46 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 20:37:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:38:58 - pico-train - INFO - Step 18600 -- 🔄 Training Metrics
2025-08-30 20:38:58 - pico-train - INFO - ├── Loss: 5.1406
2025-08-30 20:38:58 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 20:38:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:40:09 - pico-train - INFO - Step 18700 -- 🔄 Training Metrics
2025-08-30 20:40:09 - pico-train - INFO - ├── Loss: 5.1456
2025-08-30 20:40:09 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 20:40:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:41:21 - pico-train - INFO - Step 18800 -- 🔄 Training Metrics
2025-08-30 20:41:21 - pico-train - INFO - ├── Loss: 5.1176
2025-08-30 20:41:21 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 20:41:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:42:31 - pico-train - INFO - Step 18900 -- 🔄 Training Metrics
2025-08-30 20:42:31 - pico-train - INFO - ├── Loss: 5.1282
2025-08-30 20:42:31 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 20:42:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:43:42 - pico-train - INFO - Step 19000 -- 💾 Saving Checkpoint
2025-08-30 20:45:44 - pico-train - INFO - Step 19000 -- 📊 Evaluation Results
2025-08-30 20:45:44 - pico-train - INFO - └── paloma: inf
2025-08-30 20:45:45 - pico-train - INFO - Step 19000 -- 🔄 Training Metrics
2025-08-30 20:45:45 - pico-train - INFO - ├── Loss: 5.1459
2025-08-30 20:45:45 - pico-train - INFO - ├── Learning Rate: 1.86e-04
2025-08-30 20:45:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:45:45 - pico-train - INFO - Step 19000 -- 📈 Saving Learning Dynamics
2025-08-30 20:46:59 - pico-train - INFO - Step 19100 -- 🔄 Training Metrics
2025-08-30 20:46:59 - pico-train - INFO - ├── Loss: 5.1400
2025-08-30 20:46:59 - pico-train - INFO - ├── Learning Rate: 1.85e-04
2025-08-30 20:46:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:48:10 - pico-train - INFO - Step 19200 -- 🔄 Training Metrics
2025-08-30 20:48:10 - pico-train - INFO - ├── Loss: 5.1267
2025-08-30 20:48:10 - pico-train - INFO - ├── Learning Rate: 1.85e-04
2025-08-30 20:48:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:49:21 - pico-train - INFO - Step 19300 -- 🔄 Training Metrics
2025-08-30 20:49:21 - pico-train - INFO - ├── Loss: 5.1289
2025-08-30 20:49:21 - pico-train - INFO - ├── Learning Rate: 1.85e-04
2025-08-30 20:49:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:50:32 - pico-train - INFO - Step 19400 -- 🔄 Training Metrics
2025-08-30 20:50:32 - pico-train - INFO - ├── Loss: 5.1263
2025-08-30 20:50:32 - pico-train - INFO - ├── Learning Rate: 1.85e-04
2025-08-30 20:50:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:51:44 - pico-train - INFO - Step 19500 -- 🔄 Training Metrics
2025-08-30 20:51:44 - pico-train - INFO - ├── Loss: 5.1383
2025-08-30 20:51:44 - pico-train - INFO - ├── Learning Rate: 1.85e-04
2025-08-30 20:51:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:52:55 - pico-train - INFO - Step 19600 -- 🔄 Training Metrics
2025-08-30 20:52:55 - pico-train - INFO - ├── Loss: 5.1403
2025-08-30 20:52:55 - pico-train - INFO - ├── Learning Rate: 1.85e-04
2025-08-30 20:52:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:54:06 - pico-train - INFO - Step 19700 -- 🔄 Training Metrics
2025-08-30 20:54:06 - pico-train - INFO - ├── Loss: 5.1260
2025-08-30 20:54:06 - pico-train - INFO - ├── Learning Rate: 1.84e-04
2025-08-30 20:54:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:55:17 - pico-train - INFO - Step 19800 -- 🔄 Training Metrics
2025-08-30 20:55:17 - pico-train - INFO - ├── Loss: 5.1311
2025-08-30 20:55:17 - pico-train - INFO - ├── Learning Rate: 1.84e-04
2025-08-30 20:55:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:56:28 - pico-train - INFO - Step 19900 -- 🔄 Training Metrics
2025-08-30 20:56:28 - pico-train - INFO - ├── Loss: 5.1058
2025-08-30 20:56:28 - pico-train - INFO - ├── Learning Rate: 1.84e-04
2025-08-30 20:56:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:57:38 - pico-train - INFO - Step 20000 -- 💾 Saving Checkpoint
2025-08-30 20:59:40 - pico-train - INFO - Step 20000 -- 📊 Evaluation Results
2025-08-30 20:59:40 - pico-train - INFO - └── paloma: inf
2025-08-30 20:59:41 - pico-train - INFO - Step 20000 -- 🔄 Training Metrics
2025-08-30 20:59:41 - pico-train - INFO - ├── Loss: 5.1196
2025-08-30 20:59:41 - pico-train - INFO - ├── Learning Rate: 1.84e-04
2025-08-30 20:59:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 20:59:41 - pico-train - INFO - Step 20000 -- 📈 Saving Learning Dynamics
2025-08-30 21:00:55 - pico-train - INFO - Step 20100 -- 🔄 Training Metrics
2025-08-30 21:00:55 - pico-train - INFO - ├── Loss: 5.1019
2025-08-30 21:00:55 - pico-train - INFO - ├── Learning Rate: 1.84e-04
2025-08-30 21:00:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:02:06 - pico-train - INFO - Step 20200 -- 🔄 Training Metrics
2025-08-30 21:02:06 - pico-train - INFO - ├── Loss: 5.1164
2025-08-30 21:02:06 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 21:02:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:03:17 - pico-train - INFO - Step 20300 -- 🔄 Training Metrics
2025-08-30 21:03:17 - pico-train - INFO - ├── Loss: 5.1355
2025-08-30 21:03:17 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 21:03:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:04:28 - pico-train - INFO - Step 20400 -- 🔄 Training Metrics
2025-08-30 21:04:28 - pico-train - INFO - ├── Loss: 5.1157
2025-08-30 21:04:28 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 21:04:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:05:40 - pico-train - INFO - Step 20500 -- 🔄 Training Metrics
2025-08-30 21:05:40 - pico-train - INFO - ├── Loss: 5.1164
2025-08-30 21:05:40 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 21:05:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:06:51 - pico-train - INFO - Step 20600 -- 🔄 Training Metrics
2025-08-30 21:06:51 - pico-train - INFO - ├── Loss: 5.1290
2025-08-30 21:06:51 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 21:06:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:08:02 - pico-train - INFO - Step 20700 -- 🔄 Training Metrics
2025-08-30 21:08:02 - pico-train - INFO - ├── Loss: 5.1013
2025-08-30 21:08:02 - pico-train - INFO - ├── Learning Rate: 1.83e-04
2025-08-30 21:08:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:09:13 - pico-train - INFO - Step 20800 -- 🔄 Training Metrics
2025-08-30 21:09:13 - pico-train - INFO - ├── Loss: 5.1103
2025-08-30 21:09:13 - pico-train - INFO - ├── Learning Rate: 1.82e-04
2025-08-30 21:09:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:10:24 - pico-train - INFO - Step 20900 -- 🔄 Training Metrics
2025-08-30 21:10:24 - pico-train - INFO - ├── Loss: 5.1041
2025-08-30 21:10:24 - pico-train - INFO - ├── Learning Rate: 1.82e-04
2025-08-30 21:10:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:11:34 - pico-train - INFO - Step 21000 -- 💾 Saving Checkpoint
2025-08-30 21:13:36 - pico-train - INFO - Step 21000 -- 📊 Evaluation Results
2025-08-30 21:13:36 - pico-train - INFO - └── paloma: inf
2025-08-30 21:13:37 - pico-train - INFO - Step 21000 -- 🔄 Training Metrics
2025-08-30 21:13:37 - pico-train - INFO - ├── Loss: 5.1056
2025-08-30 21:13:37 - pico-train - INFO - ├── Learning Rate: 1.82e-04
2025-08-30 21:13:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:13:37 - pico-train - INFO - Step 21000 -- 📈 Saving Learning Dynamics
2025-08-30 21:14:50 - pico-train - INFO - Step 21100 -- 🔄 Training Metrics
2025-08-30 21:14:50 - pico-train - INFO - ├── Loss: 5.0963
2025-08-30 21:14:50 - pico-train - INFO - ├── Learning Rate: 1.82e-04
2025-08-30 21:14:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:16:01 - pico-train - INFO - Step 21200 -- 🔄 Training Metrics
2025-08-30 21:16:01 - pico-train - INFO - ├── Loss: 5.1159
2025-08-30 21:16:01 - pico-train - INFO - ├── Learning Rate: 1.82e-04
2025-08-30 21:16:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:17:12 - pico-train - INFO - Step 21300 -- 🔄 Training Metrics
2025-08-30 21:17:12 - pico-train - INFO - ├── Loss: 5.1047
2025-08-30 21:17:12 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 21:17:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:18:23 - pico-train - INFO - Step 21400 -- 🔄 Training Metrics
2025-08-30 21:18:23 - pico-train - INFO - ├── Loss: 5.0805
2025-08-30 21:18:23 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 21:18:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:19:35 - pico-train - INFO - Step 21500 -- 🔄 Training Metrics
2025-08-30 21:19:35 - pico-train - INFO - ├── Loss: 5.1073
2025-08-30 21:19:35 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 21:19:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:20:45 - pico-train - INFO - Step 21600 -- 🔄 Training Metrics
2025-08-30 21:20:45 - pico-train - INFO - ├── Loss: 5.0786
2025-08-30 21:20:45 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 21:20:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:21:56 - pico-train - INFO - Step 21700 -- 🔄 Training Metrics
2025-08-30 21:21:56 - pico-train - INFO - ├── Loss: 5.0965
2025-08-30 21:21:56 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 21:21:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:23:09 - pico-train - INFO - Step 21800 -- 🔄 Training Metrics
2025-08-30 21:23:09 - pico-train - INFO - ├── Loss: 5.1086
2025-08-30 21:23:09 - pico-train - INFO - ├── Learning Rate: 1.81e-04
2025-08-30 21:23:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:24:20 - pico-train - INFO - Step 21900 -- 🔄 Training Metrics
2025-08-30 21:24:20 - pico-train - INFO - ├── Loss: 5.1030
2025-08-30 21:24:20 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 21:24:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:25:30 - pico-train - INFO - Step 22000 -- 💾 Saving Checkpoint
2025-08-30 21:27:31 - pico-train - INFO - Step 22000 -- 📊 Evaluation Results
2025-08-30 21:27:31 - pico-train - INFO - └── paloma: inf
2025-08-30 21:27:32 - pico-train - INFO - Step 22000 -- 🔄 Training Metrics
2025-08-30 21:27:32 - pico-train - INFO - ├── Loss: 5.0792
2025-08-30 21:27:32 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 21:27:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:27:32 - pico-train - INFO - Step 22000 -- 📈 Saving Learning Dynamics
2025-08-30 21:28:46 - pico-train - INFO - Step 22100 -- 🔄 Training Metrics
2025-08-30 21:28:46 - pico-train - INFO - ├── Loss: 5.0833
2025-08-30 21:28:46 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 21:28:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:29:57 - pico-train - INFO - Step 22200 -- 🔄 Training Metrics
2025-08-30 21:29:57 - pico-train - INFO - ├── Loss: 5.0819
2025-08-30 21:29:57 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 21:29:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:31:08 - pico-train - INFO - Step 22300 -- 🔄 Training Metrics
2025-08-30 21:31:08 - pico-train - INFO - ├── Loss: 5.0940
2025-08-30 21:31:08 - pico-train - INFO - ├── Learning Rate: 1.80e-04
2025-08-30 21:31:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:32:20 - pico-train - INFO - Step 22400 -- 🔄 Training Metrics
2025-08-30 21:32:20 - pico-train - INFO - ├── Loss: 5.0780
2025-08-30 21:32:20 - pico-train - INFO - ├── Learning Rate: 1.79e-04
2025-08-30 21:32:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:33:30 - pico-train - INFO - Step 22500 -- 🔄 Training Metrics
2025-08-30 21:33:30 - pico-train - INFO - ├── Loss: 5.0809
2025-08-30 21:33:30 - pico-train - INFO - ├── Learning Rate: 1.79e-04
2025-08-30 21:33:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:34:42 - pico-train - INFO - Step 22600 -- 🔄 Training Metrics
2025-08-30 21:34:42 - pico-train - INFO - ├── Loss: 5.0869
2025-08-30 21:34:42 - pico-train - INFO - ├── Learning Rate: 1.79e-04
2025-08-30 21:34:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:35:53 - pico-train - INFO - Step 22700 -- 🔄 Training Metrics
2025-08-30 21:35:53 - pico-train - INFO - ├── Loss: 5.0751
2025-08-30 21:35:53 - pico-train - INFO - ├── Learning Rate: 1.79e-04
2025-08-30 21:35:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:37:04 - pico-train - INFO - Step 22800 -- 🔄 Training Metrics
2025-08-30 21:37:04 - pico-train - INFO - ├── Loss: 5.0663
2025-08-30 21:37:04 - pico-train - INFO - ├── Learning Rate: 1.79e-04
2025-08-30 21:37:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:38:15 - pico-train - INFO - Step 22900 -- 🔄 Training Metrics
2025-08-30 21:38:15 - pico-train - INFO - ├── Loss: 5.0716
2025-08-30 21:38:15 - pico-train - INFO - ├── Learning Rate: 1.78e-04
2025-08-30 21:38:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:39:25 - pico-train - INFO - Step 23000 -- 💾 Saving Checkpoint
2025-08-30 21:41:27 - pico-train - INFO - Step 23000 -- 📊 Evaluation Results
2025-08-30 21:41:27 - pico-train - INFO - └── paloma: inf
2025-08-30 21:41:27 - pico-train - INFO - Step 23000 -- 🔄 Training Metrics
2025-08-30 21:41:27 - pico-train - INFO - ├── Loss: 5.0719
2025-08-30 21:41:27 - pico-train - INFO - ├── Learning Rate: 1.78e-04
2025-08-30 21:41:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:41:27 - pico-train - INFO - Step 23000 -- 📈 Saving Learning Dynamics
2025-08-30 21:42:42 - pico-train - INFO - Step 23100 -- 🔄 Training Metrics
2025-08-30 21:42:42 - pico-train - INFO - ├── Loss: 5.0658
2025-08-30 21:42:42 - pico-train - INFO - ├── Learning Rate: 1.78e-04
2025-08-30 21:42:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:43:53 - pico-train - INFO - Step 23200 -- 🔄 Training Metrics
2025-08-30 21:43:53 - pico-train - INFO - ├── Loss: 5.0631
2025-08-30 21:43:53 - pico-train - INFO - ├── Learning Rate: 1.78e-04
2025-08-30 21:43:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:45:05 - pico-train - INFO - Step 23300 -- 🔄 Training Metrics
2025-08-30 21:45:05 - pico-train - INFO - ├── Loss: 5.0873
2025-08-30 21:45:05 - pico-train - INFO - ├── Learning Rate: 1.78e-04
2025-08-30 21:45:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:46:16 - pico-train - INFO - Step 23400 -- 🔄 Training Metrics
2025-08-30 21:46:16 - pico-train - INFO - ├── Loss: 5.0612
2025-08-30 21:46:16 - pico-train - INFO - ├── Learning Rate: 1.77e-04
2025-08-30 21:46:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:47:27 - pico-train - INFO - Step 23500 -- 🔄 Training Metrics
2025-08-30 21:47:27 - pico-train - INFO - ├── Loss: 5.0464
2025-08-30 21:47:27 - pico-train - INFO - ├── Learning Rate: 1.77e-04
2025-08-30 21:47:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:48:38 - pico-train - INFO - Step 23600 -- 🔄 Training Metrics
2025-08-30 21:48:38 - pico-train - INFO - ├── Loss: 5.0612
2025-08-30 21:48:38 - pico-train - INFO - ├── Learning Rate: 1.77e-04
2025-08-30 21:48:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:49:50 - pico-train - INFO - Step 23700 -- 🔄 Training Metrics
2025-08-30 21:49:50 - pico-train - INFO - ├── Loss: 5.0501
2025-08-30 21:49:50 - pico-train - INFO - ├── Learning Rate: 1.77e-04
2025-08-30 21:49:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:51:01 - pico-train - INFO - Step 23800 -- 🔄 Training Metrics
2025-08-30 21:51:01 - pico-train - INFO - ├── Loss: 5.0517
2025-08-30 21:51:01 - pico-train - INFO - ├── Learning Rate: 1.77e-04
2025-08-30 21:51:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:52:12 - pico-train - INFO - Step 23900 -- 🔄 Training Metrics
2025-08-30 21:52:12 - pico-train - INFO - ├── Loss: 5.0439
2025-08-30 21:52:12 - pico-train - INFO - ├── Learning Rate: 1.76e-04
2025-08-30 21:52:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:53:22 - pico-train - INFO - Step 24000 -- 💾 Saving Checkpoint
2025-08-30 21:55:24 - pico-train - INFO - Step 24000 -- 📊 Evaluation Results
2025-08-30 21:55:24 - pico-train - INFO - └── paloma: inf
2025-08-30 21:55:25 - pico-train - INFO - Step 24000 -- 🔄 Training Metrics
2025-08-30 21:55:25 - pico-train - INFO - ├── Loss: 5.0522
2025-08-30 21:55:25 - pico-train - INFO - ├── Learning Rate: 1.76e-04
2025-08-30 21:55:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:55:25 - pico-train - INFO - Step 24000 -- 📈 Saving Learning Dynamics
2025-08-30 21:56:39 - pico-train - INFO - Step 24100 -- 🔄 Training Metrics
2025-08-30 21:56:39 - pico-train - INFO - ├── Loss: 5.0476
2025-08-30 21:56:39 - pico-train - INFO - ├── Learning Rate: 1.76e-04
2025-08-30 21:56:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:57:50 - pico-train - INFO - Step 24200 -- 🔄 Training Metrics
2025-08-30 21:57:50 - pico-train - INFO - ├── Loss: 5.0433
2025-08-30 21:57:50 - pico-train - INFO - ├── Learning Rate: 1.76e-04
2025-08-30 21:57:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 21:59:01 - pico-train - INFO - Step 24300 -- 🔄 Training Metrics
2025-08-30 21:59:01 - pico-train - INFO - ├── Loss: 5.0522
2025-08-30 21:59:01 - pico-train - INFO - ├── Learning Rate: 1.76e-04
2025-08-30 21:59:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:00:12 - pico-train - INFO - Step 24400 -- 🔄 Training Metrics
2025-08-30 22:00:12 - pico-train - INFO - ├── Loss: 5.0452
2025-08-30 22:00:12 - pico-train - INFO - ├── Learning Rate: 1.75e-04
2025-08-30 22:00:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:01:24 - pico-train - INFO - Step 24500 -- 🔄 Training Metrics
2025-08-30 22:01:24 - pico-train - INFO - ├── Loss: 5.0628
2025-08-30 22:01:24 - pico-train - INFO - ├── Learning Rate: 1.75e-04
2025-08-30 22:01:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:02:35 - pico-train - INFO - Step 24600 -- 🔄 Training Metrics
2025-08-30 22:02:35 - pico-train - INFO - ├── Loss: 5.0439
2025-08-30 22:02:35 - pico-train - INFO - ├── Learning Rate: 1.75e-04
2025-08-30 22:02:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:03:46 - pico-train - INFO - Step 24700 -- 🔄 Training Metrics
2025-08-30 22:03:46 - pico-train - INFO - ├── Loss: 5.0321
2025-08-30 22:03:46 - pico-train - INFO - ├── Learning Rate: 1.75e-04
2025-08-30 22:03:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:04:57 - pico-train - INFO - Step 24800 -- 🔄 Training Metrics
2025-08-30 22:04:57 - pico-train - INFO - ├── Loss: 5.0503
2025-08-30 22:04:57 - pico-train - INFO - ├── Learning Rate: 1.74e-04
2025-08-30 22:04:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:06:08 - pico-train - INFO - Step 24900 -- 🔄 Training Metrics
2025-08-30 22:06:08 - pico-train - INFO - ├── Loss: 5.0360
2025-08-30 22:06:08 - pico-train - INFO - ├── Learning Rate: 1.74e-04
2025-08-30 22:06:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:07:18 - pico-train - INFO - Step 25000 -- 💾 Saving Checkpoint
2025-08-30 22:09:20 - pico-train - INFO - Step 25000 -- 📊 Evaluation Results
2025-08-30 22:09:20 - pico-train - INFO - └── paloma: inf
2025-08-30 22:09:21 - pico-train - INFO - Step 25000 -- 🔄 Training Metrics
2025-08-30 22:09:21 - pico-train - INFO - ├── Loss: 5.0212
2025-08-30 22:09:21 - pico-train - INFO - ├── Learning Rate: 1.74e-04
2025-08-30 22:09:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:09:21 - pico-train - INFO - Step 25000 -- 📈 Saving Learning Dynamics
2025-08-30 22:10:35 - pico-train - INFO - Step 25100 -- 🔄 Training Metrics
2025-08-30 22:10:35 - pico-train - INFO - ├── Loss: 5.0438
2025-08-30 22:10:35 - pico-train - INFO - ├── Learning Rate: 1.74e-04
2025-08-30 22:10:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:11:46 - pico-train - INFO - Step 25200 -- 🔄 Training Metrics
2025-08-30 22:11:46 - pico-train - INFO - ├── Loss: 5.0433
2025-08-30 22:11:46 - pico-train - INFO - ├── Learning Rate: 1.74e-04
2025-08-30 22:11:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:12:56 - pico-train - INFO - Step 25300 -- 🔄 Training Metrics
2025-08-30 22:12:56 - pico-train - INFO - ├── Loss: 5.0521
2025-08-30 22:12:56 - pico-train - INFO - ├── Learning Rate: 1.73e-04
2025-08-30 22:12:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:14:08 - pico-train - INFO - Step 25400 -- 🔄 Training Metrics
2025-08-30 22:14:08 - pico-train - INFO - ├── Loss: 5.0397
2025-08-30 22:14:08 - pico-train - INFO - ├── Learning Rate: 1.73e-04
2025-08-30 22:14:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:15:19 - pico-train - INFO - Step 25500 -- 🔄 Training Metrics
2025-08-30 22:15:19 - pico-train - INFO - ├── Loss: 5.0390
2025-08-30 22:15:19 - pico-train - INFO - ├── Learning Rate: 1.73e-04
2025-08-30 22:15:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:16:31 - pico-train - INFO - Step 25600 -- 🔄 Training Metrics
2025-08-30 22:16:31 - pico-train - INFO - ├── Loss: 5.0488
2025-08-30 22:16:31 - pico-train - INFO - ├── Learning Rate: 1.73e-04
2025-08-30 22:16:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:17:42 - pico-train - INFO - Step 25700 -- 🔄 Training Metrics
2025-08-30 22:17:42 - pico-train - INFO - ├── Loss: 5.0378
2025-08-30 22:17:42 - pico-train - INFO - ├── Learning Rate: 1.73e-04
2025-08-30 22:17:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:18:53 - pico-train - INFO - Step 25800 -- 🔄 Training Metrics
2025-08-30 22:18:53 - pico-train - INFO - ├── Loss: 5.0415
2025-08-30 22:18:53 - pico-train - INFO - ├── Learning Rate: 1.72e-04
2025-08-30 22:18:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:20:04 - pico-train - INFO - Step 25900 -- 🔄 Training Metrics
2025-08-30 22:20:04 - pico-train - INFO - ├── Loss: 5.0105
2025-08-30 22:20:04 - pico-train - INFO - ├── Learning Rate: 1.72e-04
2025-08-30 22:20:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:21:15 - pico-train - INFO - Step 26000 -- 💾 Saving Checkpoint
2025-08-30 22:23:16 - pico-train - INFO - Step 26000 -- 📊 Evaluation Results
2025-08-30 22:23:16 - pico-train - INFO - └── paloma: inf
2025-08-30 22:23:17 - pico-train - INFO - Step 26000 -- 🔄 Training Metrics
2025-08-30 22:23:17 - pico-train - INFO - ├── Loss: 5.0128
2025-08-30 22:23:17 - pico-train - INFO - ├── Learning Rate: 1.72e-04
2025-08-30 22:23:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:23:17 - pico-train - INFO - Step 26000 -- 📈 Saving Learning Dynamics
2025-08-30 22:24:31 - pico-train - INFO - Step 26100 -- 🔄 Training Metrics
2025-08-30 22:24:31 - pico-train - INFO - ├── Loss: 5.0559
2025-08-30 22:24:31 - pico-train - INFO - ├── Learning Rate: 1.72e-04
2025-08-30 22:24:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:25:42 - pico-train - INFO - Step 26200 -- 🔄 Training Metrics
2025-08-30 22:25:42 - pico-train - INFO - ├── Loss: 5.0232
2025-08-30 22:25:42 - pico-train - INFO - ├── Learning Rate: 1.71e-04
2025-08-30 22:25:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:26:53 - pico-train - INFO - Step 26300 -- 🔄 Training Metrics
2025-08-30 22:26:53 - pico-train - INFO - ├── Loss: 5.0204
2025-08-30 22:26:53 - pico-train - INFO - ├── Learning Rate: 1.71e-04
2025-08-30 22:26:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:28:04 - pico-train - INFO - Step 26400 -- 🔄 Training Metrics
2025-08-30 22:28:04 - pico-train - INFO - ├── Loss: 5.0435
2025-08-30 22:28:04 - pico-train - INFO - ├── Learning Rate: 1.71e-04
2025-08-30 22:28:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:29:15 - pico-train - INFO - Step 26500 -- 🔄 Training Metrics
2025-08-30 22:29:15 - pico-train - INFO - ├── Loss: 5.0079
2025-08-30 22:29:15 - pico-train - INFO - ├── Learning Rate: 1.71e-04
2025-08-30 22:29:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:30:26 - pico-train - INFO - Step 26600 -- 🔄 Training Metrics
2025-08-30 22:30:26 - pico-train - INFO - ├── Loss: 5.0246
2025-08-30 22:30:26 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 22:30:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:31:37 - pico-train - INFO - Step 26700 -- 🔄 Training Metrics
2025-08-30 22:31:37 - pico-train - INFO - ├── Loss: 5.0400
2025-08-30 22:31:37 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 22:31:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:32:49 - pico-train - INFO - Step 26800 -- 🔄 Training Metrics
2025-08-30 22:32:49 - pico-train - INFO - ├── Loss: 5.0357
2025-08-30 22:32:49 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 22:32:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:34:01 - pico-train - INFO - Step 26900 -- 🔄 Training Metrics
2025-08-30 22:34:01 - pico-train - INFO - ├── Loss: 4.9973
2025-08-30 22:34:01 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 22:34:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:35:11 - pico-train - INFO - Step 27000 -- 💾 Saving Checkpoint
2025-08-30 22:37:12 - pico-train - INFO - Step 27000 -- 📊 Evaluation Results
2025-08-30 22:37:12 - pico-train - INFO - └── paloma: inf
2025-08-30 22:37:13 - pico-train - INFO - Step 27000 -- 🔄 Training Metrics
2025-08-30 22:37:13 - pico-train - INFO - ├── Loss: 5.0186
2025-08-30 22:37:13 - pico-train - INFO - ├── Learning Rate: 1.70e-04
2025-08-30 22:37:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:37:13 - pico-train - INFO - Step 27000 -- 📈 Saving Learning Dynamics
2025-08-30 22:38:27 - pico-train - INFO - Step 27100 -- 🔄 Training Metrics
2025-08-30 22:38:27 - pico-train - INFO - ├── Loss: 5.0100
2025-08-30 22:38:27 - pico-train - INFO - ├── Learning Rate: 1.69e-04
2025-08-30 22:38:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:39:38 - pico-train - INFO - Step 27200 -- 🔄 Training Metrics
2025-08-30 22:39:38 - pico-train - INFO - ├── Loss: 5.0227
2025-08-30 22:39:38 - pico-train - INFO - ├── Learning Rate: 1.69e-04
2025-08-30 22:39:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:40:49 - pico-train - INFO - Step 27300 -- 🔄 Training Metrics
2025-08-30 22:40:49 - pico-train - INFO - ├── Loss: 5.0293
2025-08-30 22:40:49 - pico-train - INFO - ├── Learning Rate: 1.69e-04
2025-08-30 22:40:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:42:00 - pico-train - INFO - Step 27400 -- 🔄 Training Metrics
2025-08-30 22:42:00 - pico-train - INFO - ├── Loss: 5.0011
2025-08-30 22:42:00 - pico-train - INFO - ├── Learning Rate: 1.69e-04
2025-08-30 22:42:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:43:11 - pico-train - INFO - Step 27500 -- 🔄 Training Metrics
2025-08-30 22:43:11 - pico-train - INFO - ├── Loss: 5.0173
2025-08-30 22:43:11 - pico-train - INFO - ├── Learning Rate: 1.68e-04
2025-08-30 22:43:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:44:22 - pico-train - INFO - Step 27600 -- 🔄 Training Metrics
2025-08-30 22:44:22 - pico-train - INFO - ├── Loss: 4.9923
2025-08-30 22:44:22 - pico-train - INFO - ├── Learning Rate: 1.68e-04
2025-08-30 22:44:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:45:34 - pico-train - INFO - Step 27700 -- 🔄 Training Metrics
2025-08-30 22:45:34 - pico-train - INFO - ├── Loss: 5.0067
2025-08-30 22:45:34 - pico-train - INFO - ├── Learning Rate: 1.68e-04
2025-08-30 22:45:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:46:45 - pico-train - INFO - Step 27800 -- 🔄 Training Metrics
2025-08-30 22:46:45 - pico-train - INFO - ├── Loss: 5.0232
2025-08-30 22:46:45 - pico-train - INFO - ├── Learning Rate: 1.68e-04
2025-08-30 22:46:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:47:56 - pico-train - INFO - Step 27900 -- 🔄 Training Metrics
2025-08-30 22:47:56 - pico-train - INFO - ├── Loss: 5.0189
2025-08-30 22:47:56 - pico-train - INFO - ├── Learning Rate: 1.67e-04
2025-08-30 22:47:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:49:06 - pico-train - INFO - Step 28000 -- ๐Ÿ’พ Saving Checkpoint
2025-08-30 22:51:07 - pico-train - INFO - Step 28000 -- ๐Ÿ“Š Evaluation Results
2025-08-30 22:51:07 - pico-train - INFO - โ””โ”€โ”€ paloma: inf
2025-08-30 22:51:08 - pico-train - INFO - Step 28000 -- ๐Ÿ”„ Training Metrics
2025-08-30 22:51:08 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.0086
2025-08-30 22:51:08 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.67e-04
2025-08-30 22:51:08 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 22:51:08 - pico-train - INFO - Step 28000 -- 📈 Saving Learning Dynamics
2025-08-30 22:52:22 - pico-train - INFO - Step 28100 -- 🔄 Training Metrics
2025-08-30 22:52:22 - pico-train - INFO - ├── Loss: 4.9984
2025-08-30 22:52:22 - pico-train - INFO - ├── Learning Rate: 1.67e-04
2025-08-30 22:52:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:53:34 - pico-train - INFO - Step 28200 -- 🔄 Training Metrics
2025-08-30 22:53:34 - pico-train - INFO - ├── Loss: 5.0119
2025-08-30 22:53:34 - pico-train - INFO - ├── Learning Rate: 1.67e-04
2025-08-30 22:53:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:54:45 - pico-train - INFO - Step 28300 -- 🔄 Training Metrics
2025-08-30 22:54:45 - pico-train - INFO - ├── Loss: 4.9886
2025-08-30 22:54:45 - pico-train - INFO - ├── Learning Rate: 1.67e-04
2025-08-30 22:54:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:55:56 - pico-train - INFO - Step 28400 -- 🔄 Training Metrics
2025-08-30 22:55:56 - pico-train - INFO - ├── Loss: 5.0118
2025-08-30 22:55:56 - pico-train - INFO - ├── Learning Rate: 1.66e-04
2025-08-30 22:55:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:57:07 - pico-train - INFO - Step 28500 -- 🔄 Training Metrics
2025-08-30 22:57:07 - pico-train - INFO - ├── Loss: 5.0120
2025-08-30 22:57:07 - pico-train - INFO - ├── Learning Rate: 1.66e-04
2025-08-30 22:57:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:58:18 - pico-train - INFO - Step 28600 -- 🔄 Training Metrics
2025-08-30 22:58:18 - pico-train - INFO - ├── Loss: 5.0077
2025-08-30 22:58:18 - pico-train - INFO - ├── Learning Rate: 1.66e-04
2025-08-30 22:58:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 22:59:30 - pico-train - INFO - Step 28700 -- 🔄 Training Metrics
2025-08-30 22:59:30 - pico-train - INFO - ├── Loss: 5.0100
2025-08-30 22:59:30 - pico-train - INFO - ├── Learning Rate: 1.66e-04
2025-08-30 22:59:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:00:41 - pico-train - INFO - Step 28800 -- 🔄 Training Metrics
2025-08-30 23:00:41 - pico-train - INFO - ├── Loss: 5.0086
2025-08-30 23:00:41 - pico-train - INFO - ├── Learning Rate: 1.65e-04
2025-08-30 23:00:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:01:52 - pico-train - INFO - Step 28900 -- 🔄 Training Metrics
2025-08-30 23:01:52 - pico-train - INFO - ├── Loss: 4.9909
2025-08-30 23:01:52 - pico-train - INFO - ├── Learning Rate: 1.65e-04
2025-08-30 23:01:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:03:02 - pico-train - INFO - Step 29000 -- 💾 Saving Checkpoint
2025-08-30 23:05:04 - pico-train - INFO - Step 29000 -- 📊 Evaluation Results
2025-08-30 23:05:04 - pico-train - INFO - └── paloma: inf
2025-08-30 23:05:05 - pico-train - INFO - Step 29000 -- 🔄 Training Metrics
2025-08-30 23:05:05 - pico-train - INFO - ├── Loss: 4.9947
2025-08-30 23:05:05 - pico-train - INFO - ├── Learning Rate: 1.65e-04
2025-08-30 23:05:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:05:05 - pico-train - INFO - Step 29000 -- 📈 Saving Learning Dynamics
2025-08-30 23:06:20 - pico-train - INFO - Step 29100 -- 🔄 Training Metrics
2025-08-30 23:06:20 - pico-train - INFO - ├── Loss: 5.0001
2025-08-30 23:06:20 - pico-train - INFO - ├── Learning Rate: 1.65e-04
2025-08-30 23:06:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:07:31 - pico-train - INFO - Step 29200 -- 🔄 Training Metrics
2025-08-30 23:07:31 - pico-train - INFO - ├── Loss: 4.9992
2025-08-30 23:07:31 - pico-train - INFO - ├── Learning Rate: 1.64e-04
2025-08-30 23:07:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:08:43 - pico-train - INFO - Step 29300 -- 🔄 Training Metrics
2025-08-30 23:08:43 - pico-train - INFO - ├── Loss: 4.9885
2025-08-30 23:08:43 - pico-train - INFO - ├── Learning Rate: 1.64e-04
2025-08-30 23:08:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:09:53 - pico-train - INFO - Step 29400 -- 🔄 Training Metrics
2025-08-30 23:09:53 - pico-train - INFO - ├── Loss: 5.0013
2025-08-30 23:09:53 - pico-train - INFO - ├── Learning Rate: 1.64e-04
2025-08-30 23:09:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:11:06 - pico-train - INFO - Step 29500 -- 🔄 Training Metrics
2025-08-30 23:11:06 - pico-train - INFO - ├── Loss: 4.9933
2025-08-30 23:11:06 - pico-train - INFO - ├── Learning Rate: 1.64e-04
2025-08-30 23:11:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:12:17 - pico-train - INFO - Step 29600 -- 🔄 Training Metrics
2025-08-30 23:12:17 - pico-train - INFO - ├── Loss: 5.0078
2025-08-30 23:12:17 - pico-train - INFO - ├── Learning Rate: 1.63e-04
2025-08-30 23:12:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:13:28 - pico-train - INFO - Step 29700 -- 🔄 Training Metrics
2025-08-30 23:13:28 - pico-train - INFO - ├── Loss: 4.9921
2025-08-30 23:13:28 - pico-train - INFO - ├── Learning Rate: 1.63e-04
2025-08-30 23:13:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:14:39 - pico-train - INFO - Step 29800 -- 🔄 Training Metrics
2025-08-30 23:14:39 - pico-train - INFO - ├── Loss: 5.0129
2025-08-30 23:14:39 - pico-train - INFO - ├── Learning Rate: 1.63e-04
2025-08-30 23:14:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:15:50 - pico-train - INFO - Step 29900 -- 🔄 Training Metrics
2025-08-30 23:15:50 - pico-train - INFO - ├── Loss: 4.9857
2025-08-30 23:15:50 - pico-train - INFO - ├── Learning Rate: 1.63e-04
2025-08-30 23:15:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:17:01 - pico-train - INFO - Step 30000 -- 💾 Saving Checkpoint
2025-08-30 23:19:03 - pico-train - INFO - Step 30000 -- 📊 Evaluation Results
2025-08-30 23:19:03 - pico-train - INFO - └── paloma: inf
2025-08-30 23:19:03 - pico-train - INFO - Step 30000 -- 🔄 Training Metrics
2025-08-30 23:19:03 - pico-train - INFO - ├── Loss: 4.9900
2025-08-30 23:19:03 - pico-train - INFO - ├── Learning Rate: 1.62e-04
2025-08-30 23:19:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:19:03 - pico-train - INFO - Step 30000 -- 📈 Saving Learning Dynamics
2025-08-30 23:20:21 - pico-train - INFO - Step 30100 -- 🔄 Training Metrics
2025-08-30 23:20:21 - pico-train - INFO - ├── Loss: 4.9847
2025-08-30 23:20:21 - pico-train - INFO - ├── Learning Rate: 1.62e-04
2025-08-30 23:20:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:21:32 - pico-train - INFO - Step 30200 -- 🔄 Training Metrics
2025-08-30 23:21:32 - pico-train - INFO - ├── Loss: 4.9702
2025-08-30 23:21:32 - pico-train - INFO - ├── Learning Rate: 1.62e-04
2025-08-30 23:21:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:22:43 - pico-train - INFO - Step 30300 -- 🔄 Training Metrics
2025-08-30 23:22:43 - pico-train - INFO - ├── Loss: 4.9776
2025-08-30 23:22:43 - pico-train - INFO - ├── Learning Rate: 1.62e-04
2025-08-30 23:22:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:23:54 - pico-train - INFO - Step 30400 -- 🔄 Training Metrics
2025-08-30 23:23:54 - pico-train - INFO - ├── Loss: 4.9789
2025-08-30 23:23:54 - pico-train - INFO - ├── Learning Rate: 1.61e-04
2025-08-30 23:23:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:25:05 - pico-train - INFO - Step 30500 -- 🔄 Training Metrics
2025-08-30 23:25:05 - pico-train - INFO - ├── Loss: 4.9794
2025-08-30 23:25:05 - pico-train - INFO - ├── Learning Rate: 1.61e-04
2025-08-30 23:25:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:26:17 - pico-train - INFO - Step 30600 -- 🔄 Training Metrics
2025-08-30 23:26:17 - pico-train - INFO - ├── Loss: 4.9955
2025-08-30 23:26:17 - pico-train - INFO - ├── Learning Rate: 1.61e-04
2025-08-30 23:26:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:27:28 - pico-train - INFO - Step 30700 -- 🔄 Training Metrics
2025-08-30 23:27:28 - pico-train - INFO - ├── Loss: 4.9794
2025-08-30 23:27:28 - pico-train - INFO - ├── Learning Rate: 1.61e-04
2025-08-30 23:27:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:28:39 - pico-train - INFO - Step 30800 -- 🔄 Training Metrics
2025-08-30 23:28:39 - pico-train - INFO - ├── Loss: 4.9636
2025-08-30 23:28:39 - pico-train - INFO - ├── Learning Rate: 1.60e-04
2025-08-30 23:28:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:29:51 - pico-train - INFO - Step 30900 -- 🔄 Training Metrics
2025-08-30 23:29:51 - pico-train - INFO - ├── Loss: 4.9775
2025-08-30 23:29:51 - pico-train - INFO - ├── Learning Rate: 1.60e-04
2025-08-30 23:29:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:31:01 - pico-train - INFO - Step 31000 -- 💾 Saving Checkpoint
2025-08-30 23:33:03 - pico-train - INFO - Step 31000 -- 📊 Evaluation Results
2025-08-30 23:33:03 - pico-train - INFO - └── paloma: inf
2025-08-30 23:33:04 - pico-train - INFO - Step 31000 -- 🔄 Training Metrics
2025-08-30 23:33:04 - pico-train - INFO - ├── Loss: 4.9882
2025-08-30 23:33:04 - pico-train - INFO - ├── Learning Rate: 1.60e-04
2025-08-30 23:33:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:33:04 - pico-train - INFO - Step 31000 -- 📈 Saving Learning Dynamics
2025-08-30 23:34:18 - pico-train - INFO - Step 31100 -- 🔄 Training Metrics
2025-08-30 23:34:18 - pico-train - INFO - ├── Loss: 4.9492
2025-08-30 23:34:18 - pico-train - INFO - ├── Learning Rate: 1.60e-04
2025-08-30 23:34:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:35:29 - pico-train - INFO - Step 31200 -- 🔄 Training Metrics
2025-08-30 23:35:29 - pico-train - INFO - ├── Loss: 4.9782
2025-08-30 23:35:29 - pico-train - INFO - ├── Learning Rate: 1.59e-04
2025-08-30 23:35:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:36:40 - pico-train - INFO - Step 31300 -- 🔄 Training Metrics
2025-08-30 23:36:40 - pico-train - INFO - ├── Loss: 4.9733
2025-08-30 23:36:40 - pico-train - INFO - ├── Learning Rate: 1.59e-04
2025-08-30 23:36:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:37:51 - pico-train - INFO - Step 31400 -- 🔄 Training Metrics
2025-08-30 23:37:51 - pico-train - INFO - ├── Loss: 4.9639
2025-08-30 23:37:51 - pico-train - INFO - ├── Learning Rate: 1.59e-04
2025-08-30 23:37:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:39:03 - pico-train - INFO - Step 31500 -- 🔄 Training Metrics
2025-08-30 23:39:03 - pico-train - INFO - ├── Loss: 4.9706
2025-08-30 23:39:03 - pico-train - INFO - ├── Learning Rate: 1.59e-04
2025-08-30 23:39:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:40:14 - pico-train - INFO - Step 31600 -- 🔄 Training Metrics
2025-08-30 23:40:14 - pico-train - INFO - ├── Loss: 4.9917
2025-08-30 23:40:14 - pico-train - INFO - ├── Learning Rate: 1.58e-04
2025-08-30 23:40:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:41:25 - pico-train - INFO - Step 31700 -- 🔄 Training Metrics
2025-08-30 23:41:25 - pico-train - INFO - ├── Loss: 4.9708
2025-08-30 23:41:25 - pico-train - INFO - ├── Learning Rate: 1.58e-04
2025-08-30 23:41:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:42:36 - pico-train - INFO - Step 31800 -- 🔄 Training Metrics
2025-08-30 23:42:36 - pico-train - INFO - ├── Loss: 4.9446
2025-08-30 23:42:36 - pico-train - INFO - ├── Learning Rate: 1.58e-04
2025-08-30 23:42:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:43:48 - pico-train - INFO - Step 31900 -- 🔄 Training Metrics
2025-08-30 23:43:48 - pico-train - INFO - ├── Loss: 4.9893
2025-08-30 23:43:48 - pico-train - INFO - ├── Learning Rate: 1.57e-04
2025-08-30 23:43:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:44:58 - pico-train - INFO - Step 32000 -- 💾 Saving Checkpoint
2025-08-30 23:47:00 - pico-train - INFO - Step 32000 -- 📊 Evaluation Results
2025-08-30 23:47:00 - pico-train - INFO - └── paloma: inf
2025-08-30 23:47:01 - pico-train - INFO - Step 32000 -- 🔄 Training Metrics
2025-08-30 23:47:01 - pico-train - INFO - ├── Loss: 4.9584
2025-08-30 23:47:01 - pico-train - INFO - ├── Learning Rate: 1.57e-04
2025-08-30 23:47:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:47:01 - pico-train - INFO - Step 32000 -- 📈 Saving Learning Dynamics
2025-08-30 23:48:14 - pico-train - INFO - Step 32100 -- 🔄 Training Metrics
2025-08-30 23:48:14 - pico-train - INFO - ├── Loss: 4.9840
2025-08-30 23:48:14 - pico-train - INFO - ├── Learning Rate: 1.57e-04
2025-08-30 23:48:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:49:26 - pico-train - INFO - Step 32200 -- 🔄 Training Metrics
2025-08-30 23:49:26 - pico-train - INFO - ├── Loss: 4.9499
2025-08-30 23:49:26 - pico-train - INFO - ├── Learning Rate: 1.57e-04
2025-08-30 23:49:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:50:37 - pico-train - INFO - Step 32300 -- 🔄 Training Metrics
2025-08-30 23:50:37 - pico-train - INFO - ├── Loss: 4.9514
2025-08-30 23:50:37 - pico-train - INFO - ├── Learning Rate: 1.56e-04
2025-08-30 23:50:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:51:48 - pico-train - INFO - Step 32400 -- 🔄 Training Metrics
2025-08-30 23:51:48 - pico-train - INFO - ├── Loss: 4.9563
2025-08-30 23:51:48 - pico-train - INFO - ├── Learning Rate: 1.56e-04
2025-08-30 23:51:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:52:59 - pico-train - INFO - Step 32500 -- 🔄 Training Metrics
2025-08-30 23:52:59 - pico-train - INFO - ├── Loss: 4.9438
2025-08-30 23:52:59 - pico-train - INFO - ├── Learning Rate: 1.56e-04
2025-08-30 23:52:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:54:10 - pico-train - INFO - Step 32600 -- 🔄 Training Metrics
2025-08-30 23:54:10 - pico-train - INFO - ├── Loss: 4.9769
2025-08-30 23:54:10 - pico-train - INFO - ├── Learning Rate: 1.56e-04
2025-08-30 23:54:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:55:21 - pico-train - INFO - Step 32700 -- 🔄 Training Metrics
2025-08-30 23:55:21 - pico-train - INFO - ├── Loss: 4.9642
2025-08-30 23:55:21 - pico-train - INFO - ├── Learning Rate: 1.55e-04
2025-08-30 23:55:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:56:32 - pico-train - INFO - Step 32800 -- 🔄 Training Metrics
2025-08-30 23:56:32 - pico-train - INFO - ├── Loss: 4.9569
2025-08-30 23:56:32 - pico-train - INFO - ├── Learning Rate: 1.55e-04
2025-08-30 23:56:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:57:43 - pico-train - INFO - Step 32900 -- 🔄 Training Metrics
2025-08-30 23:57:43 - pico-train - INFO - ├── Loss: 4.9885
2025-08-30 23:57:43 - pico-train - INFO - ├── Learning Rate: 1.55e-04
2025-08-30 23:57:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 23:58:53 - pico-train - INFO - Step 33000 -- 💾 Saving Checkpoint
2025-08-31 00:00:54 - pico-train - INFO - Step 33000 -- 📊 Evaluation Results
2025-08-31 00:00:54 - pico-train - INFO - └── paloma: inf
2025-08-31 00:00:55 - pico-train - INFO - Step 33000 -- 🔄 Training Metrics
2025-08-31 00:00:55 - pico-train - INFO - ├── Loss: 4.9527
2025-08-31 00:00:55 - pico-train - INFO - ├── Learning Rate: 1.55e-04
2025-08-31 00:00:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:00:55 - pico-train - INFO - Step 33000 -- 📈 Saving Learning Dynamics
2025-08-31 00:02:09 - pico-train - INFO - Step 33100 -- 🔄 Training Metrics
2025-08-31 00:02:09 - pico-train - INFO - ├── Loss: 4.9572
2025-08-31 00:02:09 - pico-train - INFO - ├── Learning Rate: 1.54e-04
2025-08-31 00:02:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:03:20 - pico-train - INFO - Step 33200 -- 🔄 Training Metrics
2025-08-31 00:03:20 - pico-train - INFO - ├── Loss: 4.9849
2025-08-31 00:03:20 - pico-train - INFO - ├── Learning Rate: 1.54e-04
2025-08-31 00:03:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:04:32 - pico-train - INFO - Step 33300 -- 🔄 Training Metrics
2025-08-31 00:04:32 - pico-train - INFO - ├── Loss: 4.9590
2025-08-31 00:04:32 - pico-train - INFO - ├── Learning Rate: 1.54e-04
2025-08-31 00:04:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:05:42 - pico-train - INFO - Step 33400 -- 🔄 Training Metrics
2025-08-31 00:05:42 - pico-train - INFO - ├── Loss: 4.9521
2025-08-31 00:05:42 - pico-train - INFO - ├── Learning Rate: 1.53e-04
2025-08-31 00:05:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:06:54 - pico-train - INFO - Step 33500 -- 🔄 Training Metrics
2025-08-31 00:06:54 - pico-train - INFO - ├── Loss: 4.9655
2025-08-31 00:06:54 - pico-train - INFO - ├── Learning Rate: 1.53e-04
2025-08-31 00:06:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:08:05 - pico-train - INFO - Step 33600 -- 🔄 Training Metrics
2025-08-31 00:08:05 - pico-train - INFO - ├── Loss: 4.9481
2025-08-31 00:08:05 - pico-train - INFO - ├── Learning Rate: 1.53e-04
2025-08-31 00:08:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:09:16 - pico-train - INFO - Step 33700 -- 🔄 Training Metrics
2025-08-31 00:09:16 - pico-train - INFO - ├── Loss: 4.9613
2025-08-31 00:09:16 - pico-train - INFO - ├── Learning Rate: 1.53e-04
2025-08-31 00:09:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:10:29 - pico-train - INFO - Step 33800 -- 🔄 Training Metrics
2025-08-31 00:10:29 - pico-train - INFO - ├── Loss: 4.9348
2025-08-31 00:10:29 - pico-train - INFO - ├── Learning Rate: 1.52e-04
2025-08-31 00:10:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:11:39 - pico-train - INFO - Step 33900 -- 🔄 Training Metrics
2025-08-31 00:11:39 - pico-train - INFO - ├── Loss: 4.9517
2025-08-31 00:11:39 - pico-train - INFO - ├── Learning Rate: 1.52e-04
2025-08-31 00:11:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:12:50 - pico-train - INFO - Step 34000 -- 💾 Saving Checkpoint
2025-08-31 00:14:51 - pico-train - INFO - Step 34000 -- 📊 Evaluation Results
2025-08-31 00:14:51 - pico-train - INFO - └── paloma: inf
2025-08-31 00:14:52 - pico-train - INFO - Step 34000 -- 🔄 Training Metrics
2025-08-31 00:14:52 - pico-train - INFO - ├── Loss: 4.9518
2025-08-31 00:14:52 - pico-train - INFO - ├── Learning Rate: 1.52e-04
2025-08-31 00:14:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:14:52 - pico-train - INFO - Step 34000 -- 📈 Saving Learning Dynamics
2025-08-31 00:16:06 - pico-train - INFO - Step 34100 -- 🔄 Training Metrics
2025-08-31 00:16:06 - pico-train - INFO - ├── Loss: 4.9583
2025-08-31 00:16:06 - pico-train - INFO - ├── Learning Rate: 1.52e-04
2025-08-31 00:16:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:17:17 - pico-train - INFO - Step 34200 -- 🔄 Training Metrics
2025-08-31 00:17:17 - pico-train - INFO - ├── Loss: 4.9762
2025-08-31 00:17:17 - pico-train - INFO - ├── Learning Rate: 1.51e-04
2025-08-31 00:17:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:18:29 - pico-train - INFO - Step 34300 -- 🔄 Training Metrics
2025-08-31 00:18:29 - pico-train - INFO - ├── Loss: 4.9672
2025-08-31 00:18:29 - pico-train - INFO - ├── Learning Rate: 1.51e-04
2025-08-31 00:18:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:19:40 - pico-train - INFO - Step 34400 -- 🔄 Training Metrics
2025-08-31 00:19:40 - pico-train - INFO - ├── Loss: 4.9656
2025-08-31 00:19:40 - pico-train - INFO - ├── Learning Rate: 1.51e-04
2025-08-31 00:19:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:20:51 - pico-train - INFO - Step 34500 -- 🔄 Training Metrics
2025-08-31 00:20:51 - pico-train - INFO - ├── Loss: 4.9669
2025-08-31 00:20:51 - pico-train - INFO - ├── Learning Rate: 1.50e-04
2025-08-31 00:20:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:22:03 - pico-train - INFO - Step 34600 -- 🔄 Training Metrics
2025-08-31 00:22:03 - pico-train - INFO - ├── Loss: 4.9462
2025-08-31 00:22:03 - pico-train - INFO - ├── Learning Rate: 1.50e-04
2025-08-31 00:22:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:23:14 - pico-train - INFO - Step 34700 -- 🔄 Training Metrics
2025-08-31 00:23:14 - pico-train - INFO - ├── Loss: 4.9413
2025-08-31 00:23:14 - pico-train - INFO - ├── Learning Rate: 1.50e-04
2025-08-31 00:23:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:24:25 - pico-train - INFO - Step 34800 -- 🔄 Training Metrics
2025-08-31 00:24:25 - pico-train - INFO - ├── Loss: 4.9357
2025-08-31 00:24:25 - pico-train - INFO - ├── Learning Rate: 1.50e-04
2025-08-31 00:24:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:25:36 - pico-train - INFO - Step 34900 -- 🔄 Training Metrics
2025-08-31 00:25:36 - pico-train - INFO - ├── Loss: 4.9346
2025-08-31 00:25:36 - pico-train - INFO - ├── Learning Rate: 1.49e-04
2025-08-31 00:25:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:26:47 - pico-train - INFO - Step 35000 -- 💾 Saving Checkpoint
2025-08-31 00:28:48 - pico-train - INFO - Step 35000 -- 📊 Evaluation Results
2025-08-31 00:28:48 - pico-train - INFO - └── paloma: inf
2025-08-31 00:28:49 - pico-train - INFO - Step 35000 -- 🔄 Training Metrics
2025-08-31 00:28:49 - pico-train - INFO - ├── Loss: 4.9402
2025-08-31 00:28:49 - pico-train - INFO - ├── Learning Rate: 1.49e-04
2025-08-31 00:28:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:28:49 - pico-train - INFO - Step 35000 -- 📈 Saving Learning Dynamics
2025-08-31 00:30:03 - pico-train - INFO - Step 35100 -- 🔄 Training Metrics
2025-08-31 00:30:03 - pico-train - INFO - ├── Loss: 4.9421
2025-08-31 00:30:03 - pico-train - INFO - ├── Learning Rate: 1.49e-04
2025-08-31 00:30:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:31:14 - pico-train - INFO - Step 35200 -- 🔄 Training Metrics
2025-08-31 00:31:14 - pico-train - INFO - ├── Loss: 4.9274
2025-08-31 00:31:14 - pico-train - INFO - ├── Learning Rate: 1.49e-04
2025-08-31 00:31:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:32:25 - pico-train - INFO - Step 35300 -- 🔄 Training Metrics
2025-08-31 00:32:25 - pico-train - INFO - ├── Loss: 4.9681
2025-08-31 00:32:25 - pico-train - INFO - ├── Learning Rate: 1.48e-04
2025-08-31 00:32:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:33:37 - pico-train - INFO - Step 35400 -- 🔄 Training Metrics
2025-08-31 00:33:37 - pico-train - INFO - ├── Loss: 4.9664
2025-08-31 00:33:37 - pico-train - INFO - ├── Learning Rate: 1.48e-04
2025-08-31 00:33:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:34:48 - pico-train - INFO - Step 35500 -- 🔄 Training Metrics
2025-08-31 00:34:48 - pico-train - INFO - ├── Loss: 4.9410
2025-08-31 00:34:48 - pico-train - INFO - ├── Learning Rate: 1.48e-04
2025-08-31 00:34:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:35:59 - pico-train - INFO - Step 35600 -- 🔄 Training Metrics
2025-08-31 00:35:59 - pico-train - INFO - ├── Loss: 4.9500
2025-08-31 00:35:59 - pico-train - INFO - ├── Learning Rate: 1.47e-04
2025-08-31 00:35:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:37:11 - pico-train - INFO - Step 35700 -- 🔄 Training Metrics
2025-08-31 00:37:11 - pico-train - INFO - ├── Loss: 4.9276
2025-08-31 00:37:11 - pico-train - INFO - ├── Learning Rate: 1.47e-04
2025-08-31 00:37:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:38:22 - pico-train - INFO - Step 35800 -- 🔄 Training Metrics
2025-08-31 00:38:22 - pico-train - INFO - ├── Loss: 4.9672
2025-08-31 00:38:22 - pico-train - INFO - ├── Learning Rate: 1.47e-04
2025-08-31 00:38:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:39:34 - pico-train - INFO - Step 35900 -- 🔄 Training Metrics
2025-08-31 00:39:34 - pico-train - INFO - ├── Loss: 4.9397
2025-08-31 00:39:34 - pico-train - INFO - ├── Learning Rate: 1.47e-04
2025-08-31 00:39:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:40:45 - pico-train - INFO - Step 36000 -- 💾 Saving Checkpoint
2025-08-31 00:42:47 - pico-train - INFO - Step 36000 -- 📊 Evaluation Results
2025-08-31 00:42:47 - pico-train - INFO - └── paloma: inf
2025-08-31 00:42:48 - pico-train - INFO - Step 36000 -- 🔄 Training Metrics
2025-08-31 00:42:48 - pico-train - INFO - ├── Loss: 4.9613
2025-08-31 00:42:48 - pico-train - INFO - ├── Learning Rate: 1.46e-04
2025-08-31 00:42:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:42:48 - pico-train - INFO - Step 36000 -- 📈 Saving Learning Dynamics
2025-08-31 00:44:02 - pico-train - INFO - Step 36100 -- 🔄 Training Metrics
2025-08-31 00:44:02 - pico-train - INFO - ├── Loss: 4.9351
2025-08-31 00:44:02 - pico-train - INFO - ├── Learning Rate: 1.46e-04
2025-08-31 00:44:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:45:13 - pico-train - INFO - Step 36200 -- 🔄 Training Metrics
2025-08-31 00:45:13 - pico-train - INFO - ├── Loss: 4.9432
2025-08-31 00:45:13 - pico-train - INFO - ├── Learning Rate: 1.46e-04
2025-08-31 00:45:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:46:24 - pico-train - INFO - Step 36300 -- 🔄 Training Metrics
2025-08-31 00:46:24 - pico-train - INFO - ├── Loss: 4.9407
2025-08-31 00:46:24 - pico-train - INFO - ├── Learning Rate: 1.45e-04
2025-08-31 00:46:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:47:36 - pico-train - INFO - Step 36400 -- 🔄 Training Metrics
2025-08-31 00:47:36 - pico-train - INFO - ├── Loss: 4.9587
2025-08-31 00:47:36 - pico-train - INFO - ├── Learning Rate: 1.45e-04
2025-08-31 00:47:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:48:48 - pico-train - INFO - Step 36500 -- 🔄 Training Metrics
2025-08-31 00:48:48 - pico-train - INFO - ├── Loss: 4.9297
2025-08-31 00:48:48 - pico-train - INFO - ├── Learning Rate: 1.45e-04
2025-08-31 00:48:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:49:59 - pico-train - INFO - Step 36600 -- 🔄 Training Metrics
2025-08-31 00:49:59 - pico-train - INFO - ├── Loss: 4.9252
2025-08-31 00:49:59 - pico-train - INFO - ├── Learning Rate: 1.45e-04
2025-08-31 00:49:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:51:10 - pico-train - INFO - Step 36700 -- 🔄 Training Metrics
2025-08-31 00:51:10 - pico-train - INFO - ├── Loss: 4.9334
2025-08-31 00:51:10 - pico-train - INFO - ├── Learning Rate: 1.44e-04
2025-08-31 00:51:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:52:22 - pico-train - INFO - Step 36800 -- 🔄 Training Metrics
2025-08-31 00:52:22 - pico-train - INFO - ├── Loss: 4.9394
2025-08-31 00:52:22 - pico-train - INFO - ├── Learning Rate: 1.44e-04
2025-08-31 00:52:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:53:33 - pico-train - INFO - Step 36900 -- 🔄 Training Metrics
2025-08-31 00:53:33 - pico-train - INFO - ├── Loss: 4.9518
2025-08-31 00:53:33 - pico-train - INFO - ├── Learning Rate: 1.44e-04
2025-08-31 00:53:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:54:44 - pico-train - INFO - Step 37000 -- 💾 Saving Checkpoint
2025-08-31 00:56:45 - pico-train - INFO - Step 37000 -- 📊 Evaluation Results
2025-08-31 00:56:45 - pico-train - INFO - └── paloma: inf
2025-08-31 00:56:46 - pico-train - INFO - Step 37000 -- 🔄 Training Metrics
2025-08-31 00:56:46 - pico-train - INFO - ├── Loss: 4.9360
2025-08-31 00:56:46 - pico-train - INFO - ├── Learning Rate: 1.43e-04
2025-08-31 00:56:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:56:46 - pico-train - INFO - Step 37000 -- ๐Ÿ“ˆ Saving Learning Dynamics
2025-08-31 00:58:00 - pico-train - INFO - Step 37100 -- ๐Ÿ”„ Training Metrics
2025-08-31 00:58:00 - pico-train - INFO - ├── Loss: 4.9355
2025-08-31 00:58:00 - pico-train - INFO - ├── Learning Rate: 1.43e-04
2025-08-31 00:58:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 00:59:12 - pico-train - INFO - Step 37200 -- 🔄 Training Metrics
2025-08-31 00:59:12 - pico-train - INFO - ├── Loss: 4.9187
2025-08-31 00:59:12 - pico-train - INFO - ├── Learning Rate: 1.43e-04
2025-08-31 00:59:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:00:23 - pico-train - INFO - Step 37300 -- 🔄 Training Metrics
2025-08-31 01:00:23 - pico-train - INFO - ├── Loss: 4.9427
2025-08-31 01:00:23 - pico-train - INFO - ├── Learning Rate: 1.43e-04
2025-08-31 01:00:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:01:35 - pico-train - INFO - Step 37400 -- 🔄 Training Metrics
2025-08-31 01:01:35 - pico-train - INFO - ├── Loss: 4.9359
2025-08-31 01:01:35 - pico-train - INFO - ├── Learning Rate: 1.42e-04
2025-08-31 01:01:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:02:46 - pico-train - INFO - Step 37500 -- 🔄 Training Metrics
2025-08-31 01:02:46 - pico-train - INFO - ├── Loss: 4.9205
2025-08-31 01:02:46 - pico-train - INFO - ├── Learning Rate: 1.42e-04
2025-08-31 01:02:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:03:57 - pico-train - INFO - Step 37600 -- 🔄 Training Metrics
2025-08-31 01:03:57 - pico-train - INFO - ├── Loss: 4.9373
2025-08-31 01:03:57 - pico-train - INFO - ├── Learning Rate: 1.42e-04
2025-08-31 01:03:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:05:08 - pico-train - INFO - Step 37700 -- 🔄 Training Metrics
2025-08-31 01:05:08 - pico-train - INFO - ├── Loss: 4.9389
2025-08-31 01:05:08 - pico-train - INFO - ├── Learning Rate: 1.41e-04
2025-08-31 01:05:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:06:20 - pico-train - INFO - Step 37800 -- 🔄 Training Metrics
2025-08-31 01:06:20 - pico-train - INFO - ├── Loss: 4.9301
2025-08-31 01:06:20 - pico-train - INFO - ├── Learning Rate: 1.41e-04
2025-08-31 01:06:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:07:31 - pico-train - INFO - Step 37900 -- 🔄 Training Metrics
2025-08-31 01:07:31 - pico-train - INFO - ├── Loss: 4.9262
2025-08-31 01:07:31 - pico-train - INFO - ├── Learning Rate: 1.41e-04
2025-08-31 01:07:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:08:42 - pico-train - INFO - Step 38000 -- 💾 Saving Checkpoint
2025-08-31 01:10:43 - pico-train - INFO - Step 38000 -- 📊 Evaluation Results
2025-08-31 01:10:43 - pico-train - INFO - └── paloma: inf
2025-08-31 01:10:44 - pico-train - INFO - Step 38000 -- 🔄 Training Metrics
2025-08-31 01:10:44 - pico-train - INFO - ├── Loss: 4.9292
2025-08-31 01:10:44 - pico-train - INFO - ├── Learning Rate: 1.40e-04
2025-08-31 01:10:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:10:44 - pico-train - INFO - Step 38000 -- 📈 Saving Learning Dynamics
2025-08-31 01:11:58 - pico-train - INFO - Step 38100 -- 🔄 Training Metrics
2025-08-31 01:11:58 - pico-train - INFO - ├── Loss: 4.9439
2025-08-31 01:11:58 - pico-train - INFO - ├── Learning Rate: 1.40e-04
2025-08-31 01:11:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:13:09 - pico-train - INFO - Step 38200 -- 🔄 Training Metrics
2025-08-31 01:13:09 - pico-train - INFO - ├── Loss: 4.9389
2025-08-31 01:13:09 - pico-train - INFO - ├── Learning Rate: 1.40e-04
2025-08-31 01:13:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:14:20 - pico-train - INFO - Step 38300 -- 🔄 Training Metrics
2025-08-31 01:14:20 - pico-train - INFO - ├── Loss: 4.9439
2025-08-31 01:14:20 - pico-train - INFO - ├── Learning Rate: 1.40e-04
2025-08-31 01:14:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:15:33 - pico-train - INFO - Step 38400 -- 🔄 Training Metrics
2025-08-31 01:15:33 - pico-train - INFO - ├── Loss: 4.9466
2025-08-31 01:15:33 - pico-train - INFO - ├── Learning Rate: 1.39e-04
2025-08-31 01:15:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:16:43 - pico-train - INFO - Step 38500 -- 🔄 Training Metrics
2025-08-31 01:16:43 - pico-train - INFO - ├── Loss: 4.9536
2025-08-31 01:16:43 - pico-train - INFO - ├── Learning Rate: 1.39e-04
2025-08-31 01:16:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:17:55 - pico-train - INFO - Step 38600 -- 🔄 Training Metrics
2025-08-31 01:17:55 - pico-train - INFO - ├── Loss: 4.9061
2025-08-31 01:17:55 - pico-train - INFO - ├── Learning Rate: 1.39e-04
2025-08-31 01:17:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:19:07 - pico-train - INFO - Step 38700 -- 🔄 Training Metrics
2025-08-31 01:19:07 - pico-train - INFO - ├── Loss: 4.9307
2025-08-31 01:19:07 - pico-train - INFO - ├── Learning Rate: 1.38e-04
2025-08-31 01:19:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:20:18 - pico-train - INFO - Step 38800 -- 🔄 Training Metrics
2025-08-31 01:20:18 - pico-train - INFO - ├── Loss: 4.9027
2025-08-31 01:20:18 - pico-train - INFO - ├── Learning Rate: 1.38e-04
2025-08-31 01:20:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:21:30 - pico-train - INFO - Step 38900 -- 🔄 Training Metrics
2025-08-31 01:21:30 - pico-train - INFO - ├── Loss: 4.9224
2025-08-31 01:21:30 - pico-train - INFO - ├── Learning Rate: 1.38e-04
2025-08-31 01:21:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:22:40 - pico-train - INFO - Step 39000 -- 💾 Saving Checkpoint
2025-08-31 01:24:41 - pico-train - INFO - Step 39000 -- 📊 Evaluation Results
2025-08-31 01:24:41 - pico-train - INFO - └── paloma: inf
2025-08-31 01:24:42 - pico-train - INFO - Step 39000 -- 🔄 Training Metrics
2025-08-31 01:24:42 - pico-train - INFO - ├── Loss: 4.9213
2025-08-31 01:24:42 - pico-train - INFO - ├── Learning Rate: 1.38e-04
2025-08-31 01:24:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:24:42 - pico-train - INFO - Step 39000 -- 📈 Saving Learning Dynamics
2025-08-31 01:25:56 - pico-train - INFO - Step 39100 -- 🔄 Training Metrics
2025-08-31 01:25:56 - pico-train - INFO - ├── Loss: 4.9162
2025-08-31 01:25:56 - pico-train - INFO - ├── Learning Rate: 1.37e-04
2025-08-31 01:25:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:27:08 - pico-train - INFO - Step 39200 -- 🔄 Training Metrics
2025-08-31 01:27:08 - pico-train - INFO - ├── Loss: 4.9188
2025-08-31 01:27:08 - pico-train - INFO - ├── Learning Rate: 1.37e-04
2025-08-31 01:27:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:28:19 - pico-train - INFO - Step 39300 -- 🔄 Training Metrics
2025-08-31 01:28:19 - pico-train - INFO - ├── Loss: 4.9239
2025-08-31 01:28:19 - pico-train - INFO - ├── Learning Rate: 1.37e-04
2025-08-31 01:28:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:29:31 - pico-train - INFO - Step 39400 -- 🔄 Training Metrics
2025-08-31 01:29:31 - pico-train - INFO - ├── Loss: 4.9180
2025-08-31 01:29:31 - pico-train - INFO - ├── Learning Rate: 1.36e-04
2025-08-31 01:29:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:30:42 - pico-train - INFO - Step 39500 -- 🔄 Training Metrics
2025-08-31 01:30:42 - pico-train - INFO - ├── Loss: 4.9127
2025-08-31 01:30:42 - pico-train - INFO - ├── Learning Rate: 1.36e-04
2025-08-31 01:30:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:31:53 - pico-train - INFO - Step 39600 -- 🔄 Training Metrics
2025-08-31 01:31:53 - pico-train - INFO - ├── Loss: 4.9226
2025-08-31 01:31:53 - pico-train - INFO - ├── Learning Rate: 1.36e-04
2025-08-31 01:31:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:33:05 - pico-train - INFO - Step 39700 -- 🔄 Training Metrics
2025-08-31 01:33:05 - pico-train - INFO - ├── Loss: 4.9442
2025-08-31 01:33:05 - pico-train - INFO - ├── Learning Rate: 1.35e-04
2025-08-31 01:33:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:34:16 - pico-train - INFO - Step 39800 -- 🔄 Training Metrics
2025-08-31 01:34:16 - pico-train - INFO - ├── Loss: 4.9228
2025-08-31 01:34:16 - pico-train - INFO - ├── Learning Rate: 1.35e-04
2025-08-31 01:34:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:35:27 - pico-train - INFO - Step 39900 -- 🔄 Training Metrics
2025-08-31 01:35:27 - pico-train - INFO - ├── Loss: 4.9176
2025-08-31 01:35:27 - pico-train - INFO - ├── Learning Rate: 1.35e-04
2025-08-31 01:35:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:36:38 - pico-train - INFO - Step 40000 -- 💾 Saving Checkpoint
2025-08-31 01:38:39 - pico-train - INFO - Step 40000 -- 📊 Evaluation Results
2025-08-31 01:38:39 - pico-train - INFO - └── paloma: inf
2025-08-31 01:38:40 - pico-train - INFO - Step 40000 -- 🔄 Training Metrics
2025-08-31 01:38:40 - pico-train - INFO - ├── Loss: 4.9145
2025-08-31 01:38:40 - pico-train - INFO - ├── Learning Rate: 1.35e-04
2025-08-31 01:38:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:38:40 - pico-train - INFO - Step 40000 -- 📈 Saving Learning Dynamics
2025-08-31 01:39:54 - pico-train - INFO - Step 40100 -- 🔄 Training Metrics
2025-08-31 01:39:54 - pico-train - INFO - ├── Loss: 4.9238
2025-08-31 01:39:54 - pico-train - INFO - ├── Learning Rate: 1.34e-04
2025-08-31 01:39:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:41:05 - pico-train - INFO - Step 40200 -- 🔄 Training Metrics
2025-08-31 01:41:05 - pico-train - INFO - ├── Loss: 4.9169
2025-08-31 01:41:05 - pico-train - INFO - ├── Learning Rate: 1.34e-04
2025-08-31 01:41:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:42:16 - pico-train - INFO - Step 40300 -- 🔄 Training Metrics
2025-08-31 01:42:16 - pico-train - INFO - ├── Loss: 4.9209
2025-08-31 01:42:16 - pico-train - INFO - ├── Learning Rate: 1.34e-04
2025-08-31 01:42:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:43:28 - pico-train - INFO - Step 40400 -- 🔄 Training Metrics
2025-08-31 01:43:28 - pico-train - INFO - ├── Loss: 4.9120
2025-08-31 01:43:28 - pico-train - INFO - ├── Learning Rate: 1.33e-04
2025-08-31 01:43:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:44:39 - pico-train - INFO - Step 40500 -- 🔄 Training Metrics
2025-08-31 01:44:39 - pico-train - INFO - ├── Loss: 4.9071
2025-08-31 01:44:39 - pico-train - INFO - ├── Learning Rate: 1.33e-04
2025-08-31 01:44:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:45:50 - pico-train - INFO - Step 40600 -- 🔄 Training Metrics
2025-08-31 01:45:50 - pico-train - INFO - ├── Loss: 4.9123
2025-08-31 01:45:50 - pico-train - INFO - ├── Learning Rate: 1.33e-04
2025-08-31 01:45:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:47:02 - pico-train - INFO - Step 40700 -- 🔄 Training Metrics
2025-08-31 01:47:02 - pico-train - INFO - ├── Loss: 4.9219
2025-08-31 01:47:02 - pico-train - INFO - ├── Learning Rate: 1.32e-04
2025-08-31 01:47:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:48:13 - pico-train - INFO - Step 40800 -- 🔄 Training Metrics
2025-08-31 01:48:13 - pico-train - INFO - ├── Loss: 4.8844
2025-08-31 01:48:13 - pico-train - INFO - ├── Learning Rate: 1.32e-04
2025-08-31 01:48:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:49:24 - pico-train - INFO - Step 40900 -- 🔄 Training Metrics
2025-08-31 01:49:24 - pico-train - INFO - ├── Loss: 4.9326
2025-08-31 01:49:24 - pico-train - INFO - ├── Learning Rate: 1.32e-04
2025-08-31 01:49:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:50:35 - pico-train - INFO - Step 41000 -- 💾 Saving Checkpoint
2025-08-31 01:52:37 - pico-train - INFO - Step 41000 -- 📊 Evaluation Results
2025-08-31 01:52:37 - pico-train - INFO - └── paloma: inf
2025-08-31 01:52:38 - pico-train - INFO - Step 41000 -- 🔄 Training Metrics
2025-08-31 01:52:38 - pico-train - INFO - ├── Loss: 4.8746
2025-08-31 01:52:38 - pico-train - INFO - ├── Learning Rate: 1.32e-04
2025-08-31 01:52:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:52:38 - pico-train - INFO - Step 41000 -- 📈 Saving Learning Dynamics
2025-08-31 01:53:52 - pico-train - INFO - Step 41100 -- 🔄 Training Metrics
2025-08-31 01:53:52 - pico-train - INFO - ├── Loss: 4.9076
2025-08-31 01:53:52 - pico-train - INFO - ├── Learning Rate: 1.31e-04
2025-08-31 01:53:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:55:02 - pico-train - INFO - Step 41200 -- 🔄 Training Metrics
2025-08-31 01:55:02 - pico-train - INFO - ├── Loss: 4.9188
2025-08-31 01:55:02 - pico-train - INFO - ├── Learning Rate: 1.31e-04
2025-08-31 01:55:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:56:14 - pico-train - INFO - Step 41300 -- 🔄 Training Metrics
2025-08-31 01:56:14 - pico-train - INFO - ├── Loss: 4.9055
2025-08-31 01:56:14 - pico-train - INFO - ├── Learning Rate: 1.31e-04
2025-08-31 01:56:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:57:25 - pico-train - INFO - Step 41400 -- 🔄 Training Metrics
2025-08-31 01:57:25 - pico-train - INFO - ├── Loss: 4.8996
2025-08-31 01:57:25 - pico-train - INFO - ├── Learning Rate: 1.30e-04
2025-08-31 01:57:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:58:36 - pico-train - INFO - Step 41500 -- 🔄 Training Metrics
2025-08-31 01:58:36 - pico-train - INFO - ├── Loss: 4.9231
2025-08-31 01:58:36 - pico-train - INFO - ├── Learning Rate: 1.30e-04
2025-08-31 01:58:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 01:59:48 - pico-train - INFO - Step 41600 -- 🔄 Training Metrics
2025-08-31 01:59:48 - pico-train - INFO - ├── Loss: 4.9032
2025-08-31 01:59:48 - pico-train - INFO - ├── Learning Rate: 1.30e-04
2025-08-31 01:59:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:00:58 - pico-train - INFO - Step 41700 -- 🔄 Training Metrics
2025-08-31 02:00:58 - pico-train - INFO - ├── Loss: 4.9137
2025-08-31 02:00:58 - pico-train - INFO - ├── Learning Rate: 1.29e-04
2025-08-31 02:00:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:02:10 - pico-train - INFO - Step 41800 -- 🔄 Training Metrics
2025-08-31 02:02:10 - pico-train - INFO - ├── Loss: 4.9201
2025-08-31 02:02:10 - pico-train - INFO - ├── Learning Rate: 1.29e-04
2025-08-31 02:02:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:03:21 - pico-train - INFO - Step 41900 -- 🔄 Training Metrics
2025-08-31 02:03:21 - pico-train - INFO - ├── Loss: 4.8982
2025-08-31 02:03:21 - pico-train - INFO - ├── Learning Rate: 1.29e-04
2025-08-31 02:03:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:04:32 - pico-train - INFO - Step 42000 -- 💾 Saving Checkpoint
2025-08-31 02:06:33 - pico-train - INFO - Step 42000 -- 📊 Evaluation Results
2025-08-31 02:06:33 - pico-train - INFO - └── paloma: inf
2025-08-31 02:06:34 - pico-train - INFO - Step 42000 -- 🔄 Training Metrics
2025-08-31 02:06:34 - pico-train - INFO - ├── Loss: 4.8823
2025-08-31 02:06:34 - pico-train - INFO - ├── Learning Rate: 1.28e-04
2025-08-31 02:06:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:06:34 - pico-train - INFO - Step 42000 -- 📈 Saving Learning Dynamics
2025-08-31 02:07:48 - pico-train - INFO - Step 42100 -- 🔄 Training Metrics
2025-08-31 02:07:48 - pico-train - INFO - ├── Loss: 4.8966
2025-08-31 02:07:48 - pico-train - INFO - ├── Learning Rate: 1.28e-04
2025-08-31 02:07:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:08:59 - pico-train - INFO - Step 42200 -- 🔄 Training Metrics
2025-08-31 02:08:59 - pico-train - INFO - ├── Loss: 4.8974
2025-08-31 02:08:59 - pico-train - INFO - ├── Learning Rate: 1.28e-04
2025-08-31 02:08:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:10:12 - pico-train - INFO - Step 42300 -- 🔄 Training Metrics
2025-08-31 02:10:12 - pico-train - INFO - ├── Loss: 4.9273
2025-08-31 02:10:12 - pico-train - INFO - ├── Learning Rate: 1.28e-04
2025-08-31 02:10:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:11:23 - pico-train - INFO - Step 42400 -- 🔄 Training Metrics
2025-08-31 02:11:23 - pico-train - INFO - ├── Loss: 4.9047
2025-08-31 02:11:23 - pico-train - INFO - ├── Learning Rate: 1.27e-04
2025-08-31 02:11:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:12:34 - pico-train - INFO - Step 42500 -- 🔄 Training Metrics
2025-08-31 02:12:34 - pico-train - INFO - ├── Loss: 4.9010
2025-08-31 02:12:34 - pico-train - INFO - ├── Learning Rate: 1.27e-04
2025-08-31 02:12:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:13:45 - pico-train - INFO - Step 42600 -- 🔄 Training Metrics
2025-08-31 02:13:45 - pico-train - INFO - ├── Loss: 4.9314
2025-08-31 02:13:45 - pico-train - INFO - ├── Learning Rate: 1.27e-04
2025-08-31 02:13:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:14:57 - pico-train - INFO - Step 42700 -- 🔄 Training Metrics
2025-08-31 02:14:57 - pico-train - INFO - ├── Loss: 4.8923
2025-08-31 02:14:57 - pico-train - INFO - ├── Learning Rate: 1.26e-04
2025-08-31 02:14:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:16:08 - pico-train - INFO - Step 42800 -- 🔄 Training Metrics
2025-08-31 02:16:08 - pico-train - INFO - ├── Loss: 4.9007
2025-08-31 02:16:08 - pico-train - INFO - ├── Learning Rate: 1.26e-04
2025-08-31 02:16:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:17:19 - pico-train - INFO - Step 42900 -- 🔄 Training Metrics
2025-08-31 02:17:19 - pico-train - INFO - ├── Loss: 4.8903
2025-08-31 02:17:19 - pico-train - INFO - ├── Learning Rate: 1.26e-04
2025-08-31 02:17:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:18:30 - pico-train - INFO - Step 43000 -- 💾 Saving Checkpoint
2025-08-31 02:20:32 - pico-train - INFO - Step 43000 -- 📊 Evaluation Results
2025-08-31 02:20:32 - pico-train - INFO - └── paloma: inf
2025-08-31 02:20:33 - pico-train - INFO - Step 43000 -- 🔄 Training Metrics
2025-08-31 02:20:33 - pico-train - INFO - ├── Loss: 4.9252
2025-08-31 02:20:33 - pico-train - INFO - ├── Learning Rate: 1.25e-04
2025-08-31 02:20:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:20:33 - pico-train - INFO - Step 43000 -- 📈 Saving Learning Dynamics
2025-08-31 02:21:46 - pico-train - INFO - Step 43100 -- 🔄 Training Metrics
2025-08-31 02:21:46 - pico-train - INFO - ├── Loss: 4.8925
2025-08-31 02:21:46 - pico-train - INFO - ├── Learning Rate: 1.25e-04
2025-08-31 02:21:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:22:57 - pico-train - INFO - Step 43200 -- 🔄 Training Metrics
2025-08-31 02:22:57 - pico-train - INFO - ├── Loss: 4.8954
2025-08-31 02:22:57 - pico-train - INFO - ├── Learning Rate: 1.25e-04
2025-08-31 02:22:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:24:09 - pico-train - INFO - Step 43300 -- 🔄 Training Metrics
2025-08-31 02:24:09 - pico-train - INFO - ├── Loss: 4.8650
2025-08-31 02:24:09 - pico-train - INFO - ├── Learning Rate: 1.24e-04
2025-08-31 02:24:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:25:21 - pico-train - INFO - Step 43400 -- 🔄 Training Metrics
2025-08-31 02:25:21 - pico-train - INFO - ├── Loss: 4.9019
2025-08-31 02:25:21 - pico-train - INFO - ├── Learning Rate: 1.24e-04
2025-08-31 02:25:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:26:32 - pico-train - INFO - Step 43500 -- 🔄 Training Metrics
2025-08-31 02:26:32 - pico-train - INFO - ├── Loss: 4.9002
2025-08-31 02:26:32 - pico-train - INFO - ├── Learning Rate: 1.24e-04
2025-08-31 02:26:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:27:44 - pico-train - INFO - Step 43600 -- 🔄 Training Metrics
2025-08-31 02:27:44 - pico-train - INFO - ├── Loss: 4.8981
2025-08-31 02:27:44 - pico-train - INFO - ├── Learning Rate: 1.24e-04
2025-08-31 02:27:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:28:55 - pico-train - INFO - Step 43700 -- 🔄 Training Metrics
2025-08-31 02:28:55 - pico-train - INFO - ├── Loss: 4.9206
2025-08-31 02:28:55 - pico-train - INFO - ├── Learning Rate: 1.23e-04
2025-08-31 02:28:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:30:06 - pico-train - INFO - Step 43800 -- 🔄 Training Metrics
2025-08-31 02:30:06 - pico-train - INFO - ├── Loss: 4.8934
2025-08-31 02:30:06 - pico-train - INFO - ├── Learning Rate: 1.23e-04
2025-08-31 02:30:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:31:18 - pico-train - INFO - Step 43900 -- 🔄 Training Metrics
2025-08-31 02:31:18 - pico-train - INFO - ├── Loss: 4.8906
2025-08-31 02:31:18 - pico-train - INFO - ├── Learning Rate: 1.23e-04
2025-08-31 02:31:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:32:28 - pico-train - INFO - Step 44000 -- 💾 Saving Checkpoint
2025-08-31 02:34:29 - pico-train - INFO - Step 44000 -- 📊 Evaluation Results
2025-08-31 02:34:29 - pico-train - INFO - └── paloma: inf
2025-08-31 02:34:30 - pico-train - INFO - Step 44000 -- 🔄 Training Metrics
2025-08-31 02:34:30 - pico-train - INFO - ├── Loss: 4.9013
2025-08-31 02:34:30 - pico-train - INFO - ├── Learning Rate: 1.22e-04
2025-08-31 02:34:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:34:30 - pico-train - INFO - Step 44000 -- 📈 Saving Learning Dynamics
2025-08-31 02:35:44 - pico-train - INFO - Step 44100 -- 🔄 Training Metrics
2025-08-31 02:35:44 - pico-train - INFO - ├── Loss: 4.8915
2025-08-31 02:35:44 - pico-train - INFO - ├── Learning Rate: 1.22e-04
2025-08-31 02:35:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:36:55 - pico-train - INFO - Step 44200 -- 🔄 Training Metrics
2025-08-31 02:36:55 - pico-train - INFO - ├── Loss: 4.8804
2025-08-31 02:36:55 - pico-train - INFO - ├── Learning Rate: 1.22e-04
2025-08-31 02:36:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:38:06 - pico-train - INFO - Step 44300 -- 🔄 Training Metrics
2025-08-31 02:38:06 - pico-train - INFO - ├── Loss: 4.8928
2025-08-31 02:38:06 - pico-train - INFO - ├── Learning Rate: 1.21e-04
2025-08-31 02:38:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:39:18 - pico-train - INFO - Step 44400 -- 🔄 Training Metrics
2025-08-31 02:39:18 - pico-train - INFO - ├── Loss: 4.8801
2025-08-31 02:39:18 - pico-train - INFO - ├── Learning Rate: 1.21e-04
2025-08-31 02:39:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:40:29 - pico-train - INFO - Step 44500 -- 🔄 Training Metrics
2025-08-31 02:40:29 - pico-train - INFO - ├── Loss: 4.9167
2025-08-31 02:40:29 - pico-train - INFO - ├── Learning Rate: 1.21e-04
2025-08-31 02:40:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:41:40 - pico-train - INFO - Step 44600 -- 🔄 Training Metrics
2025-08-31 02:41:40 - pico-train - INFO - ├── Loss: 4.8424
2025-08-31 02:41:40 - pico-train - INFO - ├── Learning Rate: 1.20e-04
2025-08-31 02:41:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:42:51 - pico-train - INFO - Step 44700 -- 🔄 Training Metrics
2025-08-31 02:42:51 - pico-train - INFO - ├── Loss: 4.8779
2025-08-31 02:42:51 - pico-train - INFO - ├── Learning Rate: 1.20e-04
2025-08-31 02:42:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:44:04 - pico-train - INFO - Step 44800 -- 🔄 Training Metrics
2025-08-31 02:44:04 - pico-train - INFO - ├── Loss: 4.9086
2025-08-31 02:44:04 - pico-train - INFO - ├── Learning Rate: 1.20e-04
2025-08-31 02:44:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:45:14 - pico-train - INFO - Step 44900 -- 🔄 Training Metrics
2025-08-31 02:45:14 - pico-train - INFO - ├── Loss: 4.9030
2025-08-31 02:45:14 - pico-train - INFO - ├── Learning Rate: 1.19e-04
2025-08-31 02:45:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:46:25 - pico-train - INFO - Step 45000 -- 💾 Saving Checkpoint
2025-08-31 02:48:27 - pico-train - INFO - Step 45000 -- 📊 Evaluation Results
2025-08-31 02:48:27 - pico-train - INFO - └── paloma: inf
2025-08-31 02:48:27 - pico-train - INFO - Step 45000 -- 🔄 Training Metrics
2025-08-31 02:48:27 - pico-train - INFO - ├── Loss: 4.8994
2025-08-31 02:48:27 - pico-train - INFO - ├── Learning Rate: 1.19e-04
2025-08-31 02:48:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:48:27 - pico-train - INFO - Step 45000 -- 📈 Saving Learning Dynamics
2025-08-31 02:49:41 - pico-train - INFO - Step 45100 -- 🔄 Training Metrics
2025-08-31 02:49:41 - pico-train - INFO - ├── Loss: 4.8970
2025-08-31 02:49:41 - pico-train - INFO - ├── Learning Rate: 1.19e-04
2025-08-31 02:49:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:50:53 - pico-train - INFO - Step 45200 -- 🔄 Training Metrics
2025-08-31 02:50:53 - pico-train - INFO - ├── Loss: 4.8832
2025-08-31 02:50:53 - pico-train - INFO - ├── Learning Rate: 1.18e-04
2025-08-31 02:50:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:52:05 - pico-train - INFO - Step 45300 -- 🔄 Training Metrics
2025-08-31 02:52:05 - pico-train - INFO - ├── Loss: 4.8729
2025-08-31 02:52:05 - pico-train - INFO - ├── Learning Rate: 1.18e-04
2025-08-31 02:52:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:53:16 - pico-train - INFO - Step 45400 -- 🔄 Training Metrics
2025-08-31 02:53:16 - pico-train - INFO - ├── Loss: 4.8860
2025-08-31 02:53:16 - pico-train - INFO - ├── Learning Rate: 1.18e-04
2025-08-31 02:53:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:54:27 - pico-train - INFO - Step 45500 -- 🔄 Training Metrics
2025-08-31 02:54:27 - pico-train - INFO - ├── Loss: 4.9128
2025-08-31 02:54:27 - pico-train - INFO - ├── Learning Rate: 1.18e-04
2025-08-31 02:54:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:55:38 - pico-train - INFO - Step 45600 -- 🔄 Training Metrics
2025-08-31 02:55:38 - pico-train - INFO - ├── Loss: 4.8549
2025-08-31 02:55:38 - pico-train - INFO - ├── Learning Rate: 1.17e-04
2025-08-31 02:55:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:56:50 - pico-train - INFO - Step 45700 -- 🔄 Training Metrics
2025-08-31 02:56:50 - pico-train - INFO - ├── Loss: 4.8901
2025-08-31 02:56:50 - pico-train - INFO - ├── Learning Rate: 1.17e-04
2025-08-31 02:56:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:58:02 - pico-train - INFO - Step 45800 -- 🔄 Training Metrics
2025-08-31 02:58:02 - pico-train - INFO - ├── Loss: 4.8899
2025-08-31 02:58:02 - pico-train - INFO - ├── Learning Rate: 1.17e-04
2025-08-31 02:58:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 02:59:12 - pico-train - INFO - Step 45900 -- 🔄 Training Metrics
2025-08-31 02:59:12 - pico-train - INFO - ├── Loss: 4.8725
2025-08-31 02:59:12 - pico-train - INFO - ├── Learning Rate: 1.16e-04
2025-08-31 02:59:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:00:23 - pico-train - INFO - Step 46000 -- 💾 Saving Checkpoint
2025-08-31 03:02:25 - pico-train - INFO - Step 46000 -- 📊 Evaluation Results
2025-08-31 03:02:25 - pico-train - INFO - └── paloma: inf
2025-08-31 03:02:26 - pico-train - INFO - Step 46000 -- 🔄 Training Metrics
2025-08-31 03:02:26 - pico-train - INFO - ├── Loss: 4.8771
2025-08-31 03:02:26 - pico-train - INFO - ├── Learning Rate: 1.16e-04
2025-08-31 03:02:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:02:26 - pico-train - INFO - Step 46000 -- 📈 Saving Learning Dynamics
2025-08-31 03:03:41 - pico-train - INFO - Step 46100 -- 🔄 Training Metrics
2025-08-31 03:03:41 - pico-train - INFO - ├── Loss: 4.8648
2025-08-31 03:03:41 - pico-train - INFO - ├── Learning Rate: 1.16e-04
2025-08-31 03:03:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:04:52 - pico-train - INFO - Step 46200 -- 🔄 Training Metrics
2025-08-31 03:04:52 - pico-train - INFO - ├── Loss: 4.8980
2025-08-31 03:04:52 - pico-train - INFO - ├── Learning Rate: 1.15e-04
2025-08-31 03:04:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:06:03 - pico-train - INFO - Step 46300 -- 🔄 Training Metrics
2025-08-31 03:06:03 - pico-train - INFO - ├── Loss: 4.8868
2025-08-31 03:06:03 - pico-train - INFO - ├── Learning Rate: 1.15e-04
2025-08-31 03:06:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:07:15 - pico-train - INFO - Step 46400 -- 🔄 Training Metrics
2025-08-31 03:07:15 - pico-train - INFO - ├── Loss: 4.8807
2025-08-31 03:07:15 - pico-train - INFO - ├── Learning Rate: 1.15e-04
2025-08-31 03:07:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:08:27 - pico-train - INFO - Step 46500 -- 🔄 Training Metrics
2025-08-31 03:08:27 - pico-train - INFO - ├── Loss: 4.8779
2025-08-31 03:08:27 - pico-train - INFO - ├── Learning Rate: 1.14e-04
2025-08-31 03:08:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:09:38 - pico-train - INFO - Step 46600 -- 🔄 Training Metrics
2025-08-31 03:09:38 - pico-train - INFO - ├── Loss: 4.8909
2025-08-31 03:09:38 - pico-train - INFO - ├── Learning Rate: 1.14e-04
2025-08-31 03:09:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:10:49 - pico-train - INFO - Step 46700 -- 🔄 Training Metrics
2025-08-31 03:10:49 - pico-train - INFO - ├── Loss: 4.8883
2025-08-31 03:10:49 - pico-train - INFO - ├── Learning Rate: 1.14e-04
2025-08-31 03:10:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:12:01 - pico-train - INFO - Step 46800 -- 🔄 Training Metrics
2025-08-31 03:12:01 - pico-train - INFO - ├── Loss: 4.8876
2025-08-31 03:12:01 - pico-train - INFO - ├── Learning Rate: 1.13e-04
2025-08-31 03:12:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:13:12 - pico-train - INFO - Step 46900 -- 🔄 Training Metrics
2025-08-31 03:13:12 - pico-train - INFO - ├── Loss: 4.8686
2025-08-31 03:13:12 - pico-train - INFO - ├── Learning Rate: 1.13e-04
2025-08-31 03:13:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:14:23 - pico-train - INFO - Step 47000 -- 💾 Saving Checkpoint
2025-08-31 03:16:24 - pico-train - INFO - Step 47000 -- 📊 Evaluation Results
2025-08-31 03:16:24 - pico-train - INFO - └── paloma: inf
2025-08-31 03:16:25 - pico-train - INFO - Step 47000 -- 🔄 Training Metrics
2025-08-31 03:16:25 - pico-train - INFO - ├── Loss: 4.8700
2025-08-31 03:16:25 - pico-train - INFO - ├── Learning Rate: 1.13e-04
2025-08-31 03:16:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:16:25 - pico-train - INFO - Step 47000 -- 📈 Saving Learning Dynamics
2025-08-31 03:17:39 - pico-train - INFO - Step 47100 -- 🔄 Training Metrics
2025-08-31 03:17:39 - pico-train - INFO - ├── Loss: 4.8670
2025-08-31 03:17:39 - pico-train - INFO - ├── Learning Rate: 1.12e-04
2025-08-31 03:17:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:18:49 - pico-train - INFO - Step 47200 -- 🔄 Training Metrics
2025-08-31 03:18:49 - pico-train - INFO - ├── Loss: 4.8849
2025-08-31 03:18:49 - pico-train - INFO - ├── Learning Rate: 1.12e-04
2025-08-31 03:18:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:20:00 - pico-train - INFO - Step 47300 -- 🔄 Training Metrics
2025-08-31 03:20:00 - pico-train - INFO - ├── Loss: 4.8666
2025-08-31 03:20:00 - pico-train - INFO - ├── Learning Rate: 1.12e-04
2025-08-31 03:20:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:21:12 - pico-train - INFO - Step 47400 -- 🔄 Training Metrics
2025-08-31 03:21:12 - pico-train - INFO - ├── Loss: 4.8596
2025-08-31 03:21:12 - pico-train - INFO - ├── Learning Rate: 1.12e-04
2025-08-31 03:21:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:22:23 - pico-train - INFO - Step 47500 -- 🔄 Training Metrics
2025-08-31 03:22:23 - pico-train - INFO - ├── Loss: 4.8679
2025-08-31 03:22:23 - pico-train - INFO - ├── Learning Rate: 1.11e-04
2025-08-31 03:22:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:23:34 - pico-train - INFO - Step 47600 -- 🔄 Training Metrics
2025-08-31 03:23:34 - pico-train - INFO - ├── Loss: 4.8866
2025-08-31 03:23:34 - pico-train - INFO - ├── Learning Rate: 1.11e-04
2025-08-31 03:23:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:24:45 - pico-train - INFO - Step 47700 -- 🔄 Training Metrics
2025-08-31 03:24:45 - pico-train - INFO - ├── Loss: 4.8761
2025-08-31 03:24:45 - pico-train - INFO - ├── Learning Rate: 1.11e-04
2025-08-31 03:24:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:25:56 - pico-train - INFO - Step 47800 -- 🔄 Training Metrics
2025-08-31 03:25:56 - pico-train - INFO - ├── Loss: 4.8965
2025-08-31 03:25:56 - pico-train - INFO - ├── Learning Rate: 1.10e-04
2025-08-31 03:25:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:27:07 - pico-train - INFO - Step 47900 -- 🔄 Training Metrics
2025-08-31 03:27:07 - pico-train - INFO - ├── Loss: 4.8890
2025-08-31 03:27:07 - pico-train - INFO - ├── Learning Rate: 1.10e-04
2025-08-31 03:27:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:28:18 - pico-train - INFO - Step 48000 -- 💾 Saving Checkpoint
2025-08-31 03:30:19 - pico-train - INFO - Step 48000 -- 📊 Evaluation Results
2025-08-31 03:30:19 - pico-train - INFO - └── paloma: inf
2025-08-31 03:30:20 - pico-train - INFO - Step 48000 -- 🔄 Training Metrics
2025-08-31 03:30:20 - pico-train - INFO - ├── Loss: 4.8801
2025-08-31 03:30:20 - pico-train - INFO - ├── Learning Rate: 1.10e-04
2025-08-31 03:30:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:30:20 - pico-train - INFO - Step 48000 -- 📈 Saving Learning Dynamics
2025-08-31 03:31:33 - pico-train - INFO - Step 48100 -- 🔄 Training Metrics
2025-08-31 03:31:33 - pico-train - INFO - ├── Loss: 4.8751
2025-08-31 03:31:33 - pico-train - INFO - ├── Learning Rate: 1.09e-04
2025-08-31 03:31:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:32:45 - pico-train - INFO - Step 48200 -- 🔄 Training Metrics
2025-08-31 03:32:45 - pico-train - INFO - ├── Loss: 4.8702
2025-08-31 03:32:45 - pico-train - INFO - ├── Learning Rate: 1.09e-04
2025-08-31 03:32:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:33:56 - pico-train - INFO - Step 48300 -- 🔄 Training Metrics
2025-08-31 03:33:56 - pico-train - INFO - ├── Loss: 4.8748
2025-08-31 03:33:56 - pico-train - INFO - ├── Learning Rate: 1.09e-04
2025-08-31 03:33:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:35:07 - pico-train - INFO - Step 48400 -- 🔄 Training Metrics
2025-08-31 03:35:07 - pico-train - INFO - ├── Loss: 4.8780
2025-08-31 03:35:07 - pico-train - INFO - ├── Learning Rate: 1.08e-04
2025-08-31 03:35:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:36:19 - pico-train - INFO - Step 48500 -- 🔄 Training Metrics
2025-08-31 03:36:19 - pico-train - INFO - ├── Loss: 4.8676
2025-08-31 03:36:19 - pico-train - INFO - ├── Learning Rate: 1.08e-04
2025-08-31 03:36:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:37:29 - pico-train - INFO - Step 48600 -- 🔄 Training Metrics
2025-08-31 03:37:29 - pico-train - INFO - ├── Loss: 4.8609
2025-08-31 03:37:29 - pico-train - INFO - ├── Learning Rate: 1.08e-04
2025-08-31 03:37:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:38:41 - pico-train - INFO - Step 48700 -- 🔄 Training Metrics
2025-08-31 03:38:41 - pico-train - INFO - ├── Loss: 4.8671
2025-08-31 03:38:41 - pico-train - INFO - ├── Learning Rate: 1.07e-04
2025-08-31 03:38:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:39:52 - pico-train - INFO - Step 48800 -- 🔄 Training Metrics
2025-08-31 03:39:52 - pico-train - INFO - ├── Loss: 4.8771
2025-08-31 03:39:52 - pico-train - INFO - ├── Learning Rate: 1.07e-04
2025-08-31 03:39:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:41:03 - pico-train - INFO - Step 48900 -- 🔄 Training Metrics
2025-08-31 03:41:03 - pico-train - INFO - ├── Loss: 4.8739
2025-08-31 03:41:03 - pico-train - INFO - ├── Learning Rate: 1.07e-04
2025-08-31 03:41:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:42:13 - pico-train - INFO - Step 49000 -- 💾 Saving Checkpoint
2025-08-31 03:44:16 - pico-train - INFO - Step 49000 -- 📊 Evaluation Results
2025-08-31 03:44:16 - pico-train - INFO - └── paloma: inf
2025-08-31 03:44:17 - pico-train - INFO - Step 49000 -- 🔄 Training Metrics
2025-08-31 03:44:17 - pico-train - INFO - ├── Loss: 4.8755
2025-08-31 03:44:17 - pico-train - INFO - ├── Learning Rate: 1.06e-04
2025-08-31 03:44:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:44:17 - pico-train - INFO - Step 49000 -- 📈 Saving Learning Dynamics
2025-08-31 03:45:30 - pico-train - INFO - Step 49100 -- 🔄 Training Metrics
2025-08-31 03:45:30 - pico-train - INFO - ├── Loss: 4.8520
2025-08-31 03:45:30 - pico-train - INFO - ├── Learning Rate: 1.06e-04
2025-08-31 03:45:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:46:41 - pico-train - INFO - Step 49200 -- 🔄 Training Metrics
2025-08-31 03:46:41 - pico-train - INFO - ├── Loss: 4.8698
2025-08-31 03:46:41 - pico-train - INFO - ├── Learning Rate: 1.06e-04
2025-08-31 03:46:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:47:53 - pico-train - INFO - Step 49300 -- 🔄 Training Metrics
2025-08-31 03:47:53 - pico-train - INFO - ├── Loss: 4.8818
2025-08-31 03:47:53 - pico-train - INFO - ├── Learning Rate: 1.05e-04
2025-08-31 03:47:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:49:03 - pico-train - INFO - Step 49400 -- 🔄 Training Metrics
2025-08-31 03:49:03 - pico-train - INFO - ├── Loss: 4.8478
2025-08-31 03:49:03 - pico-train - INFO - ├── Learning Rate: 1.05e-04
2025-08-31 03:49:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:50:14 - pico-train - INFO - Step 49500 -- 🔄 Training Metrics
2025-08-31 03:50:14 - pico-train - INFO - ├── Loss: 4.8803
2025-08-31 03:50:14 - pico-train - INFO - ├── Learning Rate: 1.05e-04
2025-08-31 03:50:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:51:25 - pico-train - INFO - Step 49600 -- 🔄 Training Metrics
2025-08-31 03:51:25 - pico-train - INFO - ├── Loss: 4.8609
2025-08-31 03:51:25 - pico-train - INFO - ├── Learning Rate: 1.04e-04
2025-08-31 03:51:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:52:36 - pico-train - INFO - Step 49700 -- 🔄 Training Metrics
2025-08-31 03:52:36 - pico-train - INFO - ├── Loss: 4.8419
2025-08-31 03:52:36 - pico-train - INFO - ├── Learning Rate: 1.04e-04
2025-08-31 03:52:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:53:47 - pico-train - INFO - Step 49800 -- 🔄 Training Metrics
2025-08-31 03:53:47 - pico-train - INFO - ├── Loss: 4.8767
2025-08-31 03:53:47 - pico-train - INFO - ├── Learning Rate: 1.04e-04
2025-08-31 03:53:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:54:58 - pico-train - INFO - Step 49900 -- 🔄 Training Metrics
2025-08-31 03:54:58 - pico-train - INFO - ├── Loss: 4.8595
2025-08-31 03:54:58 - pico-train - INFO - ├── Learning Rate: 1.04e-04
2025-08-31 03:54:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:56:09 - pico-train - INFO - Step 50000 -- 💾 Saving Checkpoint
2025-08-31 03:58:10 - pico-train - INFO - Step 50000 -- 📊 Evaluation Results
2025-08-31 03:58:10 - pico-train - INFO - └── paloma: inf
2025-08-31 03:58:11 - pico-train - INFO - Step 50000 -- 🔄 Training Metrics
2025-08-31 03:58:11 - pico-train - INFO - ├── Loss: 4.8787
2025-08-31 03:58:11 - pico-train - INFO - ├── Learning Rate: 1.03e-04
2025-08-31 03:58:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 03:58:11 - pico-train - INFO - Step 50000 -- 📈 Saving Learning Dynamics
2025-08-31 03:59:25 - pico-train - INFO - Step 50100 -- 🔄 Training Metrics
2025-08-31 03:59:25 - pico-train - INFO - ├── Loss: 4.8819
2025-08-31 03:59:25 - pico-train - INFO - ├── Learning Rate: 1.03e-04
2025-08-31 03:59:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:00:36 - pico-train - INFO - Step 50200 -- 🔄 Training Metrics
2025-08-31 04:00:36 - pico-train - INFO - ├── Loss: 4.8675
2025-08-31 04:00:36 - pico-train - INFO - ├── Learning Rate: 1.03e-04
2025-08-31 04:00:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:01:47 - pico-train - INFO - Step 50300 -- 🔄 Training Metrics
2025-08-31 04:01:47 - pico-train - INFO - ├── Loss: 4.8522
2025-08-31 04:01:47 - pico-train - INFO - ├── Learning Rate: 1.02e-04
2025-08-31 04:01:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:02:58 - pico-train - INFO - Step 50400 -- 🔄 Training Metrics
2025-08-31 04:02:58 - pico-train - INFO - ├── Loss: 4.8488
2025-08-31 04:02:58 - pico-train - INFO - ├── Learning Rate: 1.02e-04
2025-08-31 04:02:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:04:09 - pico-train - INFO - Step 50500 -- 🔄 Training Metrics
2025-08-31 04:04:09 - pico-train - INFO - ├── Loss: 4.8620
2025-08-31 04:04:09 - pico-train - INFO - ├── Learning Rate: 1.02e-04
2025-08-31 04:04:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:05:20 - pico-train - INFO - Step 50600 -- 🔄 Training Metrics
2025-08-31 04:05:20 - pico-train - INFO - ├── Loss: 4.8662
2025-08-31 04:05:20 - pico-train - INFO - ├── Learning Rate: 1.01e-04
2025-08-31 04:05:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:06:31 - pico-train - INFO - Step 50700 -- 🔄 Training Metrics
2025-08-31 04:06:31 - pico-train - INFO - ├── Loss: 4.8596
2025-08-31 04:06:31 - pico-train - INFO - ├── Learning Rate: 1.01e-04
2025-08-31 04:06:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:07:42 - pico-train - INFO - Step 50800 -- 🔄 Training Metrics
2025-08-31 04:07:42 - pico-train - INFO - ├── Loss: 4.8543
2025-08-31 04:07:42 - pico-train - INFO - ├── Learning Rate: 1.01e-04
2025-08-31 04:07:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:08:53 - pico-train - INFO - Step 50900 -- 🔄 Training Metrics
2025-08-31 04:08:53 - pico-train - INFO - ├── Loss: 4.8712
2025-08-31 04:08:53 - pico-train - INFO - ├── Learning Rate: 1.00e-04
2025-08-31 04:08:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:10:03 - pico-train - INFO - Step 51000 -- 💾 Saving Checkpoint
2025-08-31 04:12:05 - pico-train - INFO - Step 51000 -- 📊 Evaluation Results
2025-08-31 04:12:05 - pico-train - INFO - └── paloma: inf
2025-08-31 04:12:05 - pico-train - INFO - Step 51000 -- 🔄 Training Metrics
2025-08-31 04:12:05 - pico-train - INFO - ├── Loss: 4.8637
2025-08-31 04:12:05 - pico-train - INFO - ├── Learning Rate: 1.00e-04
2025-08-31 04:12:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:12:05 - pico-train - INFO - Step 51000 -- 📈 Saving Learning Dynamics
2025-08-31 04:13:20 - pico-train - INFO - Step 51100 -- 🔄 Training Metrics
2025-08-31 04:13:20 - pico-train - INFO - ├── Loss: 4.8526
2025-08-31 04:13:20 - pico-train - INFO - ├── Learning Rate: 9.97e-05
2025-08-31 04:13:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:14:31 - pico-train - INFO - Step 51200 -- 🔄 Training Metrics
2025-08-31 04:14:31 - pico-train - INFO - ├── Loss: 4.8603
2025-08-31 04:14:31 - pico-train - INFO - ├── Learning Rate: 9.94e-05
2025-08-31 04:14:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:15:42 - pico-train - INFO - Step 51300 -- 🔄 Training Metrics
2025-08-31 04:15:42 - pico-train - INFO - ├── Loss: 4.8577
2025-08-31 04:15:42 - pico-train - INFO - ├── Learning Rate: 9.90e-05
2025-08-31 04:15:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:16:53 - pico-train - INFO - Step 51400 -- 🔄 Training Metrics
2025-08-31 04:16:53 - pico-train - INFO - ├── Loss: 4.8589
2025-08-31 04:16:53 - pico-train - INFO - ├── Learning Rate: 9.87e-05
2025-08-31 04:16:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:18:05 - pico-train - INFO - Step 51500 -- 🔄 Training Metrics
2025-08-31 04:18:05 - pico-train - INFO - ├── Loss: 4.8715
2025-08-31 04:18:05 - pico-train - INFO - ├── Learning Rate: 9.84e-05
2025-08-31 04:18:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:19:16 - pico-train - INFO - Step 51600 -- 🔄 Training Metrics
2025-08-31 04:19:16 - pico-train - INFO - ├── Loss: 4.8822
2025-08-31 04:19:16 - pico-train - INFO - ├── Learning Rate: 9.81e-05
2025-08-31 04:19:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:20:27 - pico-train - INFO - Step 51700 -- 🔄 Training Metrics
2025-08-31 04:20:27 - pico-train - INFO - ├── Loss: 4.8642
2025-08-31 04:20:27 - pico-train - INFO - ├── Learning Rate: 9.78e-05
2025-08-31 04:20:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:21:37 - pico-train - INFO - Step 51800 -- 🔄 Training Metrics
2025-08-31 04:21:37 - pico-train - INFO - ├── Loss: 4.8631
2025-08-31 04:21:37 - pico-train - INFO - ├── Learning Rate: 9.74e-05
2025-08-31 04:21:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:22:48 - pico-train - INFO - Step 51900 -- 🔄 Training Metrics
2025-08-31 04:22:48 - pico-train - INFO - ├── Loss: 4.8383
2025-08-31 04:22:48 - pico-train - INFO - ├── Learning Rate: 9.71e-05
2025-08-31 04:22:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:23:59 - pico-train - INFO - Step 52000 -- 💾 Saving Checkpoint
2025-08-31 04:26:01 - pico-train - INFO - Step 52000 -- 📊 Evaluation Results
2025-08-31 04:26:01 - pico-train - INFO - └── paloma: inf
2025-08-31 04:26:01 - pico-train - INFO - Step 52000 -- 🔄 Training Metrics
2025-08-31 04:26:01 - pico-train - INFO - ├── Loss: 4.8495
2025-08-31 04:26:01 - pico-train - INFO - ├── Learning Rate: 9.68e-05
2025-08-31 04:26:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:26:01 - pico-train - INFO - Step 52000 -- 📈 Saving Learning Dynamics
2025-08-31 04:27:15 - pico-train - INFO - Step 52100 -- 🔄 Training Metrics
2025-08-31 04:27:15 - pico-train - INFO - ├── Loss: 4.8515
2025-08-31 04:27:15 - pico-train - INFO - ├── Learning Rate: 9.65e-05
2025-08-31 04:27:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:28:27 - pico-train - INFO - Step 52200 -- 🔄 Training Metrics
2025-08-31 04:28:27 - pico-train - INFO - ├── Loss: 4.8600
2025-08-31 04:28:27 - pico-train - INFO - ├── Learning Rate: 9.62e-05
2025-08-31 04:28:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:29:37 - pico-train - INFO - Step 52300 -- 🔄 Training Metrics
2025-08-31 04:29:37 - pico-train - INFO - ├── Loss: 4.8696
2025-08-31 04:29:37 - pico-train - INFO - ├── Learning Rate: 9.58e-05
2025-08-31 04:29:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:30:48 - pico-train - INFO - Step 52400 -- 🔄 Training Metrics
2025-08-31 04:30:48 - pico-train - INFO - ├── Loss: 4.8359
2025-08-31 04:30:48 - pico-train - INFO - ├── Learning Rate: 9.55e-05
2025-08-31 04:30:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:32:00 - pico-train - INFO - Step 52500 -- 🔄 Training Metrics
2025-08-31 04:32:00 - pico-train - INFO - ├── Loss: 4.8445
2025-08-31 04:32:00 - pico-train - INFO - ├── Learning Rate: 9.52e-05
2025-08-31 04:32:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:33:11 - pico-train - INFO - Step 52600 -- 🔄 Training Metrics
2025-08-31 04:33:11 - pico-train - INFO - ├── Loss: 4.8626
2025-08-31 04:33:11 - pico-train - INFO - ├── Learning Rate: 9.49e-05
2025-08-31 04:33:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:34:22 - pico-train - INFO - Step 52700 -- 🔄 Training Metrics
2025-08-31 04:34:22 - pico-train - INFO - ├── Loss: 4.8556
2025-08-31 04:34:22 - pico-train - INFO - ├── Learning Rate: 9.46e-05
2025-08-31 04:34:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:35:33 - pico-train - INFO - Step 52800 -- 🔄 Training Metrics
2025-08-31 04:35:33 - pico-train - INFO - ├── Loss: 4.8360
2025-08-31 04:35:33 - pico-train - INFO - ├── Learning Rate: 9.42e-05
2025-08-31 04:35:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:36:44 - pico-train - INFO - Step 52900 -- 🔄 Training Metrics
2025-08-31 04:36:44 - pico-train - INFO - ├── Loss: 4.8517
2025-08-31 04:36:44 - pico-train - INFO - ├── Learning Rate: 9.39e-05
2025-08-31 04:36:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:37:55 - pico-train - INFO - Step 53000 -- 💾 Saving Checkpoint
2025-08-31 04:39:57 - pico-train - INFO - Step 53000 -- 📊 Evaluation Results
2025-08-31 04:39:57 - pico-train - INFO - └── paloma: inf
2025-08-31 04:39:58 - pico-train - INFO - Step 53000 -- 🔄 Training Metrics
2025-08-31 04:39:58 - pico-train - INFO - ├── Loss: 4.8507
2025-08-31 04:39:58 - pico-train - INFO - ├── Learning Rate: 9.36e-05
2025-08-31 04:39:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:39:58 - pico-train - INFO - Step 53000 -- 📈 Saving Learning Dynamics
2025-08-31 04:41:11 - pico-train - INFO - Step 53100 -- 🔄 Training Metrics
2025-08-31 04:41:11 - pico-train - INFO - ├── Loss: 4.8585
2025-08-31 04:41:11 - pico-train - INFO - ├── Learning Rate: 9.33e-05
2025-08-31 04:41:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:42:22 - pico-train - INFO - Step 53200 -- 🔄 Training Metrics
2025-08-31 04:42:22 - pico-train - INFO - ├── Loss: 4.8520
2025-08-31 04:42:22 - pico-train - INFO - ├── Learning Rate: 9.30e-05
2025-08-31 04:42:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:43:33 - pico-train - INFO - Step 53300 -- 🔄 Training Metrics
2025-08-31 04:43:33 - pico-train - INFO - ├── Loss: 4.8463
2025-08-31 04:43:33 - pico-train - INFO - ├── Learning Rate: 9.26e-05
2025-08-31 04:43:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:44:45 - pico-train - INFO - Step 53400 -- 🔄 Training Metrics
2025-08-31 04:44:45 - pico-train - INFO - ├── Loss: 4.8443
2025-08-31 04:44:45 - pico-train - INFO - ├── Learning Rate: 9.23e-05
2025-08-31 04:44:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:45:56 - pico-train - INFO - Step 53500 -- 🔄 Training Metrics
2025-08-31 04:45:56 - pico-train - INFO - ├── Loss: 4.8567
2025-08-31 04:45:56 - pico-train - INFO - ├── Learning Rate: 9.20e-05
2025-08-31 04:45:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:47:07 - pico-train - INFO - Step 53600 -- 🔄 Training Metrics
2025-08-31 04:47:07 - pico-train - INFO - ├── Loss: 4.8257
2025-08-31 04:47:07 - pico-train - INFO - ├── Learning Rate: 9.17e-05
2025-08-31 04:47:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:48:18 - pico-train - INFO - Step 53700 -- 🔄 Training Metrics
2025-08-31 04:48:18 - pico-train - INFO - ├── Loss: 4.8236
2025-08-31 04:48:18 - pico-train - INFO - ├── Learning Rate: 9.14e-05
2025-08-31 04:48:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:49:30 - pico-train - INFO - Step 53800 -- 🔄 Training Metrics
2025-08-31 04:49:30 - pico-train - INFO - ├── Loss: 4.8331
2025-08-31 04:49:30 - pico-train - INFO - ├── Learning Rate: 9.10e-05
2025-08-31 04:49:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:50:41 - pico-train - INFO - Step 53900 -- 🔄 Training Metrics
2025-08-31 04:50:41 - pico-train - INFO - ├── Loss: 4.8663
2025-08-31 04:50:41 - pico-train - INFO - ├── Learning Rate: 9.07e-05
2025-08-31 04:50:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:51:51 - pico-train - INFO - Step 54000 -- 💾 Saving Checkpoint
2025-08-31 04:53:52 - pico-train - INFO - Step 54000 -- 📊 Evaluation Results
2025-08-31 04:53:52 - pico-train - INFO - └── paloma: inf
2025-08-31 04:53:53 - pico-train - INFO - Step 54000 -- 🔄 Training Metrics
2025-08-31 04:53:53 - pico-train - INFO - ├── Loss: 4.8621
2025-08-31 04:53:53 - pico-train - INFO - ├── Learning Rate: 9.04e-05
2025-08-31 04:53:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:53:53 - pico-train - INFO - Step 54000 -- 📈 Saving Learning Dynamics
2025-08-31 04:55:07 - pico-train - INFO - Step 54100 -- 🔄 Training Metrics
2025-08-31 04:55:07 - pico-train - INFO - ├── Loss: 4.8374
2025-08-31 04:55:07 - pico-train - INFO - ├── Learning Rate: 9.01e-05
2025-08-31 04:55:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:56:18 - pico-train - INFO - Step 54200 -- 🔄 Training Metrics
2025-08-31 04:56:18 - pico-train - INFO - ├── Loss: 4.8493
2025-08-31 04:56:18 - pico-train - INFO - ├── Learning Rate: 8.98e-05
2025-08-31 04:56:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:57:29 - pico-train - INFO - Step 54300 -- 🔄 Training Metrics
2025-08-31 04:57:29 - pico-train - INFO - ├── Loss: 4.8266
2025-08-31 04:57:29 - pico-train - INFO - ├── Learning Rate: 8.94e-05
2025-08-31 04:57:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:58:40 - pico-train - INFO - Step 54400 -- 🔄 Training Metrics
2025-08-31 04:58:40 - pico-train - INFO - ├── Loss: 4.8625
2025-08-31 04:58:40 - pico-train - INFO - ├── Learning Rate: 8.91e-05
2025-08-31 04:58:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 04:59:51 - pico-train - INFO - Step 54500 -- 🔄 Training Metrics
2025-08-31 04:59:51 - pico-train - INFO - ├── Loss: 4.8460
2025-08-31 04:59:51 - pico-train - INFO - ├── Learning Rate: 8.88e-05
2025-08-31 04:59:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:01:02 - pico-train - INFO - Step 54600 -- 🔄 Training Metrics
2025-08-31 05:01:02 - pico-train - INFO - ├── Loss: 4.8332
2025-08-31 05:01:02 - pico-train - INFO - ├── Learning Rate: 8.85e-05
2025-08-31 05:01:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:02:13 - pico-train - INFO - Step 54700 -- 🔄 Training Metrics
2025-08-31 05:02:13 - pico-train - INFO - ├── Loss: 4.8508
2025-08-31 05:02:13 - pico-train - INFO - ├── Learning Rate: 8.82e-05
2025-08-31 05:02:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:03:24 - pico-train - INFO - Step 54800 -- 🔄 Training Metrics
2025-08-31 05:03:24 - pico-train - INFO - ├── Loss: 4.8588
2025-08-31 05:03:24 - pico-train - INFO - ├── Learning Rate: 8.78e-05
2025-08-31 05:03:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:04:36 - pico-train - INFO - Step 54900 -- 🔄 Training Metrics
2025-08-31 05:04:36 - pico-train - INFO - ├── Loss: 4.8781
2025-08-31 05:04:36 - pico-train - INFO - ├── Learning Rate: 8.75e-05
2025-08-31 05:04:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:05:45 - pico-train - INFO - Step 55000 -- 💾 Saving Checkpoint
2025-08-31 05:07:47 - pico-train - INFO - Step 55000 -- 📊 Evaluation Results
2025-08-31 05:07:47 - pico-train - INFO - └── paloma: inf
2025-08-31 05:07:48 - pico-train - INFO - Step 55000 -- 🔄 Training Metrics
2025-08-31 05:07:48 - pico-train - INFO - ├── Loss: 4.8366
2025-08-31 05:07:48 - pico-train - INFO - ├── Learning Rate: 8.72e-05
2025-08-31 05:07:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:07:48 - pico-train - INFO - Step 55000 -- 📈 Saving Learning Dynamics
2025-08-31 05:09:02 - pico-train - INFO - Step 55100 -- 🔄 Training Metrics
2025-08-31 05:09:02 - pico-train - INFO - ├── Loss: 4.8395
2025-08-31 05:09:02 - pico-train - INFO - ├── Learning Rate: 8.69e-05
2025-08-31 05:09:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:10:13 - pico-train - INFO - Step 55200 -- 🔄 Training Metrics
2025-08-31 05:10:13 - pico-train - INFO - ├── Loss: 4.8307
2025-08-31 05:10:13 - pico-train - INFO - ├── Learning Rate: 8.66e-05
2025-08-31 05:10:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:11:24 - pico-train - INFO - Step 55300 -- 🔄 Training Metrics
2025-08-31 05:11:24 - pico-train - INFO - ├── Loss: 4.8421
2025-08-31 05:11:24 - pico-train - INFO - ├── Learning Rate: 8.63e-05
2025-08-31 05:11:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:12:35 - pico-train - INFO - Step 55400 -- 🔄 Training Metrics
2025-08-31 05:12:35 - pico-train - INFO - ├── Loss: 4.8537
2025-08-31 05:12:35 - pico-train - INFO - ├── Learning Rate: 8.59e-05
2025-08-31 05:12:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:13:46 - pico-train - INFO - Step 55500 -- 🔄 Training Metrics
2025-08-31 05:13:46 - pico-train - INFO - ├── Loss: 4.8316
2025-08-31 05:13:46 - pico-train - INFO - ├── Learning Rate: 8.56e-05
2025-08-31 05:13:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:14:57 - pico-train - INFO - Step 55600 -- 🔄 Training Metrics
2025-08-31 05:14:57 - pico-train - INFO - ├── Loss: 4.8330
2025-08-31 05:14:57 - pico-train - INFO - ├── Learning Rate: 8.53e-05
2025-08-31 05:14:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:16:08 - pico-train - INFO - Step 55700 -- 🔄 Training Metrics
2025-08-31 05:16:08 - pico-train - INFO - ├── Loss: 4.8138
2025-08-31 05:16:08 - pico-train - INFO - ├── Learning Rate: 8.50e-05
2025-08-31 05:16:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:17:19 - pico-train - INFO - Step 55800 -- 🔄 Training Metrics
2025-08-31 05:17:19 - pico-train - INFO - ├── Loss: 4.8473
2025-08-31 05:17:19 - pico-train - INFO - ├── Learning Rate: 8.47e-05
2025-08-31 05:17:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:18:29 - pico-train - INFO - Step 55900 -- 🔄 Training Metrics
2025-08-31 05:18:29 - pico-train - INFO - ├── Loss: 4.8470
2025-08-31 05:18:29 - pico-train - INFO - ├── Learning Rate: 8.44e-05
2025-08-31 05:18:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:19:40 - pico-train - INFO - Step 56000 -- 💾 Saving Checkpoint
2025-08-31 05:21:41 - pico-train - INFO - Step 56000 -- 📊 Evaluation Results
2025-08-31 05:21:41 - pico-train - INFO - └── paloma: inf
2025-08-31 05:21:42 - pico-train - INFO - Step 56000 -- 🔄 Training Metrics
2025-08-31 05:21:42 - pico-train - INFO - ├── Loss: 4.8561
2025-08-31 05:21:42 - pico-train - INFO - ├── Learning Rate: 8.40e-05
2025-08-31 05:21:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:21:42 - pico-train - INFO - Step 56000 -- 📈 Saving Learning Dynamics
2025-08-31 05:22:56 - pico-train - INFO - Step 56100 -- 🔄 Training Metrics
2025-08-31 05:22:56 - pico-train - INFO - ├── Loss: 4.8491
2025-08-31 05:22:56 - pico-train - INFO - ├── Learning Rate: 8.37e-05
2025-08-31 05:22:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:24:07 - pico-train - INFO - Step 56200 -- 🔄 Training Metrics
2025-08-31 05:24:07 - pico-train - INFO - ├── Loss: 4.8458
2025-08-31 05:24:07 - pico-train - INFO - ├── Learning Rate: 8.34e-05
2025-08-31 05:24:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:25:18 - pico-train - INFO - Step 56300 -- 🔄 Training Metrics
2025-08-31 05:25:18 - pico-train - INFO - ├── Loss: 4.8346
2025-08-31 05:25:18 - pico-train - INFO - ├── Learning Rate: 8.31e-05
2025-08-31 05:25:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:26:30 - pico-train - INFO - Step 56400 -- 🔄 Training Metrics
2025-08-31 05:26:30 - pico-train - INFO - ├── Loss: 4.8510
2025-08-31 05:26:30 - pico-train - INFO - ├── Learning Rate: 8.28e-05
2025-08-31 05:26:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:27:41 - pico-train - INFO - Step 56500 -- 🔄 Training Metrics
2025-08-31 05:27:41 - pico-train - INFO - ├── Loss: 4.8172
2025-08-31 05:27:41 - pico-train - INFO - ├── Learning Rate: 8.25e-05
2025-08-31 05:27:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:28:52 - pico-train - INFO - Step 56600 -- 🔄 Training Metrics
2025-08-31 05:28:52 - pico-train - INFO - ├── Loss: 4.8297
2025-08-31 05:28:52 - pico-train - INFO - ├── Learning Rate: 8.21e-05
2025-08-31 05:28:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:30:03 - pico-train - INFO - Step 56700 -- 🔄 Training Metrics
2025-08-31 05:30:03 - pico-train - INFO - ├── Loss: 4.8400
2025-08-31 05:30:03 - pico-train - INFO - ├── Learning Rate: 8.18e-05
2025-08-31 05:30:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:31:14 - pico-train - INFO - Step 56800 -- 🔄 Training Metrics
2025-08-31 05:31:14 - pico-train - INFO - ├── Loss: 4.8370
2025-08-31 05:31:14 - pico-train - INFO - ├── Learning Rate: 8.15e-05
2025-08-31 05:31:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:32:25 - pico-train - INFO - Step 56900 -- 🔄 Training Metrics
2025-08-31 05:32:25 - pico-train - INFO - ├── Loss: 4.8458
2025-08-31 05:32:25 - pico-train - INFO - ├── Learning Rate: 8.12e-05
2025-08-31 05:32:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:33:35 - pico-train - INFO - Step 57000 -- 💾 Saving Checkpoint
2025-08-31 05:35:36 - pico-train - INFO - Step 57000 -- 📊 Evaluation Results
2025-08-31 05:35:36 - pico-train - INFO - └── paloma: inf
2025-08-31 05:35:37 - pico-train - INFO - Step 57000 -- 🔄 Training Metrics
2025-08-31 05:35:37 - pico-train - INFO - ├── Loss: 4.8467
2025-08-31 05:35:37 - pico-train - INFO - ├── Learning Rate: 8.09e-05
2025-08-31 05:35:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:35:37 - pico-train - INFO - Step 57000 -- 📈 Saving Learning Dynamics
2025-08-31 05:36:51 - pico-train - INFO - Step 57100 -- 🔄 Training Metrics
2025-08-31 05:36:51 - pico-train - INFO - ├── Loss: 4.8173
2025-08-31 05:36:51 - pico-train - INFO - ├── Learning Rate: 8.06e-05
2025-08-31 05:36:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:38:02 - pico-train - INFO - Step 57200 -- 🔄 Training Metrics
2025-08-31 05:38:02 - pico-train - INFO - ├── Loss: 4.8302
2025-08-31 05:38:02 - pico-train - INFO - ├── Learning Rate: 8.03e-05
2025-08-31 05:38:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:39:12 - pico-train - INFO - Step 57300 -- 🔄 Training Metrics
2025-08-31 05:39:12 - pico-train - INFO - ├── Loss: 4.8261
2025-08-31 05:39:12 - pico-train - INFO - ├── Learning Rate: 7.99e-05
2025-08-31 05:39:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:40:23 - pico-train - INFO - Step 57400 -- 🔄 Training Metrics
2025-08-31 05:40:23 - pico-train - INFO - ├── Loss: 4.8266
2025-08-31 05:40:23 - pico-train - INFO - ├── Learning Rate: 7.96e-05
2025-08-31 05:40:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:41:34 - pico-train - INFO - Step 57500 -- 🔄 Training Metrics
2025-08-31 05:41:34 - pico-train - INFO - ├── Loss: 4.8414
2025-08-31 05:41:34 - pico-train - INFO - ├── Learning Rate: 7.93e-05
2025-08-31 05:41:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:42:46 - pico-train - INFO - Step 57600 -- 🔄 Training Metrics
2025-08-31 05:42:46 - pico-train - INFO - ├── Loss: 4.8351
2025-08-31 05:42:46 - pico-train - INFO - ├── Learning Rate: 7.90e-05
2025-08-31 05:42:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:43:57 - pico-train - INFO - Step 57700 -- 🔄 Training Metrics
2025-08-31 05:43:57 - pico-train - INFO - ├── Loss: 4.8598
2025-08-31 05:43:57 - pico-train - INFO - ├── Learning Rate: 7.87e-05
2025-08-31 05:43:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:45:08 - pico-train - INFO - Step 57800 -- 🔄 Training Metrics
2025-08-31 05:45:08 - pico-train - INFO - ├── Loss: 4.8310
2025-08-31 05:45:08 - pico-train - INFO - ├── Learning Rate: 7.84e-05
2025-08-31 05:45:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:46:19 - pico-train - INFO - Step 57900 -- 🔄 Training Metrics
2025-08-31 05:46:19 - pico-train - INFO - ├── Loss: 4.8333
2025-08-31 05:46:19 - pico-train - INFO - ├── Learning Rate: 7.81e-05
2025-08-31 05:46:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:47:29 - pico-train - INFO - Step 58000 -- 💾 Saving Checkpoint
2025-08-31 05:49:31 - pico-train - INFO - Step 58000 -- 📊 Evaluation Results
2025-08-31 05:49:31 - pico-train - INFO - └── paloma: inf
2025-08-31 05:49:31 - pico-train - INFO - Step 58000 -- 🔄 Training Metrics
2025-08-31 05:49:31 - pico-train - INFO - ├── Loss: 4.8291
2025-08-31 05:49:31 - pico-train - INFO - ├── Learning Rate: 7.77e-05
2025-08-31 05:49:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:49:31 - pico-train - INFO - Step 58000 -- 📈 Saving Learning Dynamics
2025-08-31 05:50:45 - pico-train - INFO - Step 58100 -- 🔄 Training Metrics
2025-08-31 05:50:45 - pico-train - INFO - ├── Loss: 4.8302
2025-08-31 05:50:45 - pico-train - INFO - ├── Learning Rate: 7.74e-05
2025-08-31 05:50:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:51:56 - pico-train - INFO - Step 58200 -- 🔄 Training Metrics
2025-08-31 05:51:56 - pico-train - INFO - ├── Loss: 4.8194
2025-08-31 05:51:56 - pico-train - INFO - ├── Learning Rate: 7.71e-05
2025-08-31 05:51:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:53:07 - pico-train - INFO - Step 58300 -- 🔄 Training Metrics
2025-08-31 05:53:07 - pico-train - INFO - ├── Loss: 4.8363
2025-08-31 05:53:07 - pico-train - INFO - ├── Learning Rate: 7.68e-05
2025-08-31 05:53:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:54:18 - pico-train - INFO - Step 58400 -- 🔄 Training Metrics
2025-08-31 05:54:18 - pico-train - INFO - ├── Loss: 4.8375
2025-08-31 05:54:18 - pico-train - INFO - ├── Learning Rate: 7.65e-05
2025-08-31 05:54:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:55:29 - pico-train - INFO - Step 58500 -- 🔄 Training Metrics
2025-08-31 05:55:29 - pico-train - INFO - ├── Loss: 4.8184
2025-08-31 05:55:29 - pico-train - INFO - ├── Learning Rate: 7.62e-05
2025-08-31 05:55:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:56:40 - pico-train - INFO - Step 58600 -- 🔄 Training Metrics
2025-08-31 05:56:40 - pico-train - INFO - ├── Loss: 4.8258
2025-08-31 05:56:40 - pico-train - INFO - ├── Learning Rate: 7.59e-05
2025-08-31 05:56:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:57:51 - pico-train - INFO - Step 58700 -- 🔄 Training Metrics
2025-08-31 05:57:51 - pico-train - INFO - ├── Loss: 4.8396
2025-08-31 05:57:51 - pico-train - INFO - ├── Learning Rate: 7.56e-05
2025-08-31 05:57:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 05:59:02 - pico-train - INFO - Step 58800 -- 🔄 Training Metrics
2025-08-31 05:59:02 - pico-train - INFO - ├── Loss: 4.8104
2025-08-31 05:59:02 - pico-train - INFO - ├── Learning Rate: 7.53e-05
2025-08-31 05:59:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:00:13 - pico-train - INFO - Step 58900 -- 🔄 Training Metrics
2025-08-31 06:00:13 - pico-train - INFO - ├── Loss: 4.8457
2025-08-31 06:00:13 - pico-train - INFO - ├── Learning Rate: 7.49e-05
2025-08-31 06:00:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:01:23 - pico-train - INFO - Step 59000 -- 💾 Saving Checkpoint
2025-08-31 06:03:25 - pico-train - INFO - Step 59000 -- 📊 Evaluation Results
2025-08-31 06:03:25 - pico-train - INFO - └── paloma: inf
2025-08-31 06:03:26 - pico-train - INFO - Step 59000 -- 🔄 Training Metrics
2025-08-31 06:03:26 - pico-train - INFO - ├── Loss: 4.8381
2025-08-31 06:03:26 - pico-train - INFO - ├── Learning Rate: 7.46e-05
2025-08-31 06:03:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:03:26 - pico-train - INFO - Step 59000 -- 📈 Saving Learning Dynamics
2025-08-31 06:04:40 - pico-train - INFO - Step 59100 -- 🔄 Training Metrics
2025-08-31 06:04:40 - pico-train - INFO - ├── Loss: 4.8267
2025-08-31 06:04:40 - pico-train - INFO - ├── Learning Rate: 7.43e-05
2025-08-31 06:04:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:05:50 - pico-train - INFO - Step 59200 -- 🔄 Training Metrics
2025-08-31 06:05:50 - pico-train - INFO - ├── Loss: 4.8410
2025-08-31 06:05:50 - pico-train - INFO - ├── Learning Rate: 7.40e-05
2025-08-31 06:05:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:07:01 - pico-train - INFO - Step 59300 -- 🔄 Training Metrics
2025-08-31 06:07:01 - pico-train - INFO - ├── Loss: 4.8489
2025-08-31 06:07:01 - pico-train - INFO - ├── Learning Rate: 7.37e-05
2025-08-31 06:07:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:08:13 - pico-train - INFO - Step 59400 -- 🔄 Training Metrics
2025-08-31 06:08:13 - pico-train - INFO - ├── Loss: 4.8226
2025-08-31 06:08:13 - pico-train - INFO - ├── Learning Rate: 7.34e-05
2025-08-31 06:08:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:09:23 - pico-train - INFO - Step 59500 -- 🔄 Training Metrics
2025-08-31 06:09:23 - pico-train - INFO - ├── Loss: 4.8042
2025-08-31 06:09:23 - pico-train - INFO - ├── Learning Rate: 7.31e-05
2025-08-31 06:09:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:10:35 - pico-train - INFO - Step 59600 -- 🔄 Training Metrics
2025-08-31 06:10:35 - pico-train - INFO - ├── Loss: 4.8441
2025-08-31 06:10:35 - pico-train - INFO - ├── Learning Rate: 7.28e-05
2025-08-31 06:10:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:11:46 - pico-train - INFO - Step 59700 -- 🔄 Training Metrics
2025-08-31 06:11:46 - pico-train - INFO - ├── Loss: 4.8589
2025-08-31 06:11:46 - pico-train - INFO - ├── Learning Rate: 7.25e-05
2025-08-31 06:11:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:12:57 - pico-train - INFO - Step 59800 -- 🔄 Training Metrics
2025-08-31 06:12:57 - pico-train - INFO - ├── Loss: 4.8339
2025-08-31 06:12:57 - pico-train - INFO - ├── Learning Rate: 7.22e-05
2025-08-31 06:12:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:14:08 - pico-train - INFO - Step 59900 -- 🔄 Training Metrics
2025-08-31 06:14:08 - pico-train - INFO - ├── Loss: 4.8215
2025-08-31 06:14:08 - pico-train - INFO - ├── Learning Rate: 7.19e-05
2025-08-31 06:14:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:15:18 - pico-train - INFO - Step 60000 -- 💾 Saving Checkpoint
2025-08-31 06:17:20 - pico-train - INFO - Step 60000 -- 📊 Evaluation Results
2025-08-31 06:17:20 - pico-train - INFO - └── paloma: inf
2025-08-31 06:17:21 - pico-train - INFO - Step 60000 -- 🔄 Training Metrics
2025-08-31 06:17:21 - pico-train - INFO - ├── Loss: 4.8345
2025-08-31 06:17:21 - pico-train - INFO - ├── Learning Rate: 7.15e-05
2025-08-31 06:17:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:17:21 - pico-train - INFO - Step 60000 -- 📈 Saving Learning Dynamics
2025-08-31 06:18:34 - pico-train - INFO - Step 60100 -- 🔄 Training Metrics
2025-08-31 06:18:34 - pico-train - INFO - ├── Loss: 4.8207
2025-08-31 06:18:34 - pico-train - INFO - ├── Learning Rate: 7.12e-05
2025-08-31 06:18:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:19:46 - pico-train - INFO - Step 60200 -- 🔄 Training Metrics
2025-08-31 06:19:46 - pico-train - INFO - ├── Loss: 4.8181
2025-08-31 06:19:46 - pico-train - INFO - ├── Learning Rate: 7.09e-05
2025-08-31 06:19:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:20:57 - pico-train - INFO - Step 60300 -- 🔄 Training Metrics
2025-08-31 06:20:57 - pico-train - INFO - ├── Loss: 4.8059
2025-08-31 06:20:57 - pico-train - INFO - ├── Learning Rate: 7.06e-05
2025-08-31 06:20:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:22:08 - pico-train - INFO - Step 60400 -- 🔄 Training Metrics
2025-08-31 06:22:08 - pico-train - INFO - ├── Loss: 4.8367
2025-08-31 06:22:08 - pico-train - INFO - ├── Learning Rate: 7.03e-05
2025-08-31 06:22:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:23:19 - pico-train - INFO - Step 60500 -- 🔄 Training Metrics
2025-08-31 06:23:19 - pico-train - INFO - ├── Loss: 4.8237
2025-08-31 06:23:19 - pico-train - INFO - ├── Learning Rate: 7.00e-05
2025-08-31 06:23:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:24:31 - pico-train - INFO - Step 60600 -- 🔄 Training Metrics
2025-08-31 06:24:31 - pico-train - INFO - ├── Loss: 4.8291
2025-08-31 06:24:31 - pico-train - INFO - ├── Learning Rate: 6.97e-05
2025-08-31 06:24:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:25:42 - pico-train - INFO - Step 60700 -- 🔄 Training Metrics
2025-08-31 06:25:42 - pico-train - INFO - ├── Loss: 4.8317
2025-08-31 06:25:42 - pico-train - INFO - ├── Learning Rate: 6.94e-05
2025-08-31 06:25:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:26:53 - pico-train - INFO - Step 60800 -- 🔄 Training Metrics
2025-08-31 06:26:53 - pico-train - INFO - ├── Loss: 4.8204
2025-08-31 06:26:53 - pico-train - INFO - ├── Learning Rate: 6.91e-05
2025-08-31 06:26:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:28:03 - pico-train - INFO - Step 60900 -- 🔄 Training Metrics
2025-08-31 06:28:03 - pico-train - INFO - ├── Loss: 4.8455
2025-08-31 06:28:03 - pico-train - INFO - ├── Learning Rate: 6.88e-05
2025-08-31 06:28:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:29:14 - pico-train - INFO - Step 61000 -- 💾 Saving Checkpoint
2025-08-31 06:31:15 - pico-train - INFO - Step 61000 -- 📊 Evaluation Results
2025-08-31 06:31:15 - pico-train - INFO - └── paloma: inf
2025-08-31 06:31:16 - pico-train - INFO - Step 61000 -- 🔄 Training Metrics
2025-08-31 06:31:16 - pico-train - INFO - ├── Loss: 4.8133
2025-08-31 06:31:16 - pico-train - INFO - ├── Learning Rate: 6.85e-05
2025-08-31 06:31:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:31:16 - pico-train - INFO - Step 61000 -- 📈 Saving Learning Dynamics
2025-08-31 06:32:30 - pico-train - INFO - Step 61100 -- 🔄 Training Metrics
2025-08-31 06:32:30 - pico-train - INFO - ├── Loss: 4.8156
2025-08-31 06:32:30 - pico-train - INFO - ├── Learning Rate: 6.82e-05
2025-08-31 06:32:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:33:41 - pico-train - INFO - Step 61200 -- 🔄 Training Metrics
2025-08-31 06:33:41 - pico-train - INFO - ├── Loss: 4.8153
2025-08-31 06:33:41 - pico-train - INFO - ├── Learning Rate: 6.79e-05
2025-08-31 06:33:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:34:53 - pico-train - INFO - Step 61300 -- 🔄 Training Metrics
2025-08-31 06:34:53 - pico-train - INFO - ├── Loss: 4.8111
2025-08-31 06:34:53 - pico-train - INFO - ├── Learning Rate: 6.76e-05
2025-08-31 06:34:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:36:03 - pico-train - INFO - Step 61400 -- 🔄 Training Metrics
2025-08-31 06:36:03 - pico-train - INFO - ├── Loss: 4.8220
2025-08-31 06:36:03 - pico-train - INFO - ├── Learning Rate: 6.73e-05
2025-08-31 06:36:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:37:15 - pico-train - INFO - Step 61500 -- 🔄 Training Metrics
2025-08-31 06:37:15 - pico-train - INFO - ├── Loss: 4.8183
2025-08-31 06:37:15 - pico-train - INFO - ├── Learning Rate: 6.70e-05
2025-08-31 06:37:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:38:26 - pico-train - INFO - Step 61600 -- 🔄 Training Metrics
2025-08-31 06:38:26 - pico-train - INFO - ├── Loss: 4.8133
2025-08-31 06:38:26 - pico-train - INFO - ├── Learning Rate: 6.67e-05
2025-08-31 06:38:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:39:37 - pico-train - INFO - Step 61700 -- 🔄 Training Metrics
2025-08-31 06:39:37 - pico-train - INFO - ├── Loss: 4.8243
2025-08-31 06:39:37 - pico-train - INFO - ├── Learning Rate: 6.64e-05
2025-08-31 06:39:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:40:48 - pico-train - INFO - Step 61800 -- 🔄 Training Metrics
2025-08-31 06:40:48 - pico-train - INFO - ├── Loss: 4.8117
2025-08-31 06:40:48 - pico-train - INFO - ├── Learning Rate: 6.61e-05
2025-08-31 06:40:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:41:58 - pico-train - INFO - Step 61900 -- 🔄 Training Metrics
2025-08-31 06:41:58 - pico-train - INFO - ├── Loss: 4.8329
2025-08-31 06:41:58 - pico-train - INFO - ├── Learning Rate: 6.58e-05
2025-08-31 06:41:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:43:09 - pico-train - INFO - Step 62000 -- 💾 Saving Checkpoint
2025-08-31 06:45:10 - pico-train - INFO - Step 62000 -- 📊 Evaluation Results
2025-08-31 06:45:10 - pico-train - INFO - └── paloma: inf
2025-08-31 06:45:11 - pico-train - INFO - Step 62000 -- 🔄 Training Metrics
2025-08-31 06:45:11 - pico-train - INFO - ├── Loss: 4.8042
2025-08-31 06:45:11 - pico-train - INFO - ├── Learning Rate: 6.55e-05
2025-08-31 06:45:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:45:11 - pico-train - INFO - Step 62000 -- 📈 Saving Learning Dynamics
2025-08-31 06:46:26 - pico-train - INFO - Step 62100 -- 🔄 Training Metrics
2025-08-31 06:46:26 - pico-train - INFO - ├── Loss: 4.8256
2025-08-31 06:46:26 - pico-train - INFO - ├── Learning Rate: 6.52e-05
2025-08-31 06:46:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:47:37 - pico-train - INFO - Step 62200 -- 🔄 Training Metrics
2025-08-31 06:47:37 - pico-train - INFO - ├── Loss: 4.8248
2025-08-31 06:47:37 - pico-train - INFO - ├── Learning Rate: 6.49e-05
2025-08-31 06:47:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:48:47 - pico-train - INFO - Step 62300 -- 🔄 Training Metrics
2025-08-31 06:48:47 - pico-train - INFO - ├── Loss: 4.8133
2025-08-31 06:48:47 - pico-train - INFO - ├── Learning Rate: 6.46e-05
2025-08-31 06:48:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:49:58 - pico-train - INFO - Step 62400 -- 🔄 Training Metrics
2025-08-31 06:49:58 - pico-train - INFO - ├── Loss: 4.8239
2025-08-31 06:49:58 - pico-train - INFO - ├── Learning Rate: 6.43e-05
2025-08-31 06:49:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:51:09 - pico-train - INFO - Step 62500 -- 🔄 Training Metrics
2025-08-31 06:51:09 - pico-train - INFO - ├── Loss: 4.8295
2025-08-31 06:51:09 - pico-train - INFO - ├── Learning Rate: 6.40e-05
2025-08-31 06:51:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:52:20 - pico-train - INFO - Step 62600 -- 🔄 Training Metrics
2025-08-31 06:52:20 - pico-train - INFO - ├── Loss: 4.8202
2025-08-31 06:52:20 - pico-train - INFO - ├── Learning Rate: 6.37e-05
2025-08-31 06:52:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:53:32 - pico-train - INFO - Step 62700 -- 🔄 Training Metrics
2025-08-31 06:53:32 - pico-train - INFO - ├── Loss: 4.7956
2025-08-31 06:53:32 - pico-train - INFO - ├── Learning Rate: 6.34e-05
2025-08-31 06:53:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:54:43 - pico-train - INFO - Step 62800 -- 🔄 Training Metrics
2025-08-31 06:54:43 - pico-train - INFO - ├── Loss: 4.8361
2025-08-31 06:54:43 - pico-train - INFO - ├── Learning Rate: 6.31e-05
2025-08-31 06:54:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:55:54 - pico-train - INFO - Step 62900 -- 🔄 Training Metrics
2025-08-31 06:55:54 - pico-train - INFO - ├── Loss: 4.8381
2025-08-31 06:55:54 - pico-train - INFO - ├── Learning Rate: 6.28e-05
2025-08-31 06:55:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:57:05 - pico-train - INFO - Step 63000 -- 💾 Saving Checkpoint
2025-08-31 06:59:06 - pico-train - INFO - Step 63000 -- 📊 Evaluation Results
2025-08-31 06:59:06 - pico-train - INFO - └── paloma: inf
2025-08-31 06:59:07 - pico-train - INFO - Step 63000 -- 🔄 Training Metrics
2025-08-31 06:59:07 - pico-train - INFO - ├── Loss: 4.8256
2025-08-31 06:59:07 - pico-train - INFO - ├── Learning Rate: 6.25e-05
2025-08-31 06:59:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 06:59:07 - pico-train - INFO - Step 63000 -- 📈 Saving Learning Dynamics
2025-08-31 07:00:20 - pico-train - INFO - Step 63100 -- 🔄 Training Metrics
2025-08-31 07:00:20 - pico-train - INFO - ├── Loss: 4.8417
2025-08-31 07:00:20 - pico-train - INFO - ├── Learning Rate: 6.22e-05
2025-08-31 07:00:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:01:31 - pico-train - INFO - Step 63200 -- 🔄 Training Metrics
2025-08-31 07:01:31 - pico-train - INFO - ├── Loss: 4.8099
2025-08-31 07:01:31 - pico-train - INFO - ├── Learning Rate: 6.19e-05
2025-08-31 07:01:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:02:42 - pico-train - INFO - Step 63300 -- 🔄 Training Metrics
2025-08-31 07:02:42 - pico-train - INFO - ├── Loss: 4.8293
2025-08-31 07:02:42 - pico-train - INFO - ├── Learning Rate: 6.16e-05
2025-08-31 07:02:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:03:53 - pico-train - INFO - Step 63400 -- 🔄 Training Metrics
2025-08-31 07:03:53 - pico-train - INFO - ├── Loss: 4.8126
2025-08-31 07:03:53 - pico-train - INFO - ├── Learning Rate: 6.13e-05
2025-08-31 07:03:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:05:04 - pico-train - INFO - Step 63500 -- 🔄 Training Metrics
2025-08-31 07:05:04 - pico-train - INFO - ├── Loss: 4.8183
2025-08-31 07:05:04 - pico-train - INFO - ├── Learning Rate: 6.10e-05
2025-08-31 07:05:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:06:15 - pico-train - INFO - Step 63600 -- 🔄 Training Metrics
2025-08-31 07:06:15 - pico-train - INFO - ├── Loss: 4.8345
2025-08-31 07:06:15 - pico-train - INFO - ├── Learning Rate: 6.07e-05
2025-08-31 07:06:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:07:26 - pico-train - INFO - Step 63700 -- 🔄 Training Metrics
2025-08-31 07:07:26 - pico-train - INFO - ├── Loss: 4.8124
2025-08-31 07:07:26 - pico-train - INFO - ├── Learning Rate: 6.04e-05
2025-08-31 07:07:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:08:39 - pico-train - INFO - Step 63800 -- 🔄 Training Metrics
2025-08-31 07:08:39 - pico-train - INFO - ├── Loss: 4.8222
2025-08-31 07:08:39 - pico-train - INFO - ├── Learning Rate: 6.01e-05
2025-08-31 07:08:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:09:50 - pico-train - INFO - Step 63900 -- 🔄 Training Metrics
2025-08-31 07:09:50 - pico-train - INFO - ├── Loss: 4.8265
2025-08-31 07:09:50 - pico-train - INFO - ├── Learning Rate: 5.98e-05
2025-08-31 07:09:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:11:00 - pico-train - INFO - Step 64000 -- 💾 Saving Checkpoint
2025-08-31 07:13:02 - pico-train - INFO - Step 64000 -- 📊 Evaluation Results
2025-08-31 07:13:02 - pico-train - INFO - └── paloma: inf
2025-08-31 07:13:04 - pico-train - INFO - Step 64000 -- 🔄 Training Metrics
2025-08-31 07:13:04 - pico-train - INFO - ├── Loss: 4.7990
2025-08-31 07:13:04 - pico-train - INFO - ├── Learning Rate: 5.95e-05
2025-08-31 07:13:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:13:04 - pico-train - INFO - Step 64000 -- 📈 Saving Learning Dynamics
2025-08-31 07:14:17 - pico-train - INFO - Step 64100 -- 🔄 Training Metrics
2025-08-31 07:14:17 - pico-train - INFO - ├── Loss: 4.8111
2025-08-31 07:14:17 - pico-train - INFO - ├── Learning Rate: 5.92e-05
2025-08-31 07:14:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:15:28 - pico-train - INFO - Step 64200 -- 🔄 Training Metrics
2025-08-31 07:15:28 - pico-train - INFO - ├── Loss: 4.7969
2025-08-31 07:15:28 - pico-train - INFO - ├── Learning Rate: 5.89e-05
2025-08-31 07:15:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:16:39 - pico-train - INFO - Step 64300 -- 🔄 Training Metrics
2025-08-31 07:16:39 - pico-train - INFO - ├── Loss: 4.8199
2025-08-31 07:16:39 - pico-train - INFO - ├── Learning Rate: 5.86e-05
2025-08-31 07:16:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:17:50 - pico-train - INFO - Step 64400 -- 🔄 Training Metrics
2025-08-31 07:17:50 - pico-train - INFO - ├── Loss: 4.8352
2025-08-31 07:17:50 - pico-train - INFO - ├── Learning Rate: 5.84e-05
2025-08-31 07:17:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:19:01 - pico-train - INFO - Step 64500 -- 🔄 Training Metrics
2025-08-31 07:19:01 - pico-train - INFO - ├── Loss: 4.8160
2025-08-31 07:19:01 - pico-train - INFO - ├── Learning Rate: 5.81e-05
2025-08-31 07:19:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:20:12 - pico-train - INFO - Step 64600 -- 🔄 Training Metrics
2025-08-31 07:20:12 - pico-train - INFO - ├── Loss: 4.8396
2025-08-31 07:20:12 - pico-train - INFO - ├── Learning Rate: 5.78e-05
2025-08-31 07:20:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:21:23 - pico-train - INFO - Step 64700 -- 🔄 Training Metrics
2025-08-31 07:21:23 - pico-train - INFO - ├── Loss: 4.8071
2025-08-31 07:21:23 - pico-train - INFO - ├── Learning Rate: 5.75e-05
2025-08-31 07:21:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:22:34 - pico-train - INFO - Step 64800 -- 🔄 Training Metrics
2025-08-31 07:22:34 - pico-train - INFO - ├── Loss: 4.8250
2025-08-31 07:22:34 - pico-train - INFO - ├── Learning Rate: 5.72e-05
2025-08-31 07:22:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:23:45 - pico-train - INFO - Step 64900 -- 🔄 Training Metrics
2025-08-31 07:23:45 - pico-train - INFO - ├── Loss: 4.8292
2025-08-31 07:23:45 - pico-train - INFO - ├── Learning Rate: 5.69e-05
2025-08-31 07:23:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:24:55 - pico-train - INFO - Step 65000 -- 💾 Saving Checkpoint
2025-08-31 07:26:57 - pico-train - INFO - Step 65000 -- 📊 Evaluation Results
2025-08-31 07:26:57 - pico-train - INFO - └── paloma: inf
2025-08-31 07:26:58 - pico-train - INFO - Step 65000 -- 🔄 Training Metrics
2025-08-31 07:26:58 - pico-train - INFO - ├── Loss: 4.8135
2025-08-31 07:26:58 - pico-train - INFO - ├── Learning Rate: 5.66e-05
2025-08-31 07:26:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:26:58 - pico-train - INFO - Step 65000 -- 📈 Saving Learning Dynamics
2025-08-31 07:28:12 - pico-train - INFO - Step 65100 -- 🔄 Training Metrics
2025-08-31 07:28:12 - pico-train - INFO - ├── Loss: 4.7973
2025-08-31 07:28:12 - pico-train - INFO - ├── Learning Rate: 5.63e-05
2025-08-31 07:28:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:29:23 - pico-train - INFO - Step 65200 -- 🔄 Training Metrics
2025-08-31 07:29:23 - pico-train - INFO - ├── Loss: 4.7910
2025-08-31 07:29:23 - pico-train - INFO - ├── Learning Rate: 5.60e-05
2025-08-31 07:29:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:30:35 - pico-train - INFO - Step 65300 -- 🔄 Training Metrics
2025-08-31 07:30:35 - pico-train - INFO - ├── Loss: 4.8193
2025-08-31 07:30:35 - pico-train - INFO - ├── Learning Rate: 5.57e-05
2025-08-31 07:30:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:31:45 - pico-train - INFO - Step 65400 -- 🔄 Training Metrics
2025-08-31 07:31:45 - pico-train - INFO - ├── Loss: 4.8241
2025-08-31 07:31:45 - pico-train - INFO - ├── Learning Rate: 5.55e-05
2025-08-31 07:31:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:32:57 - pico-train - INFO - Step 65500 -- 🔄 Training Metrics
2025-08-31 07:32:57 - pico-train - INFO - ├── Loss: 4.8292
2025-08-31 07:32:57 - pico-train - INFO - ├── Learning Rate: 5.52e-05
2025-08-31 07:32:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:34:08 - pico-train - INFO - Step 65600 -- 🔄 Training Metrics
2025-08-31 07:34:08 - pico-train - INFO - ├── Loss: 4.8186
2025-08-31 07:34:08 - pico-train - INFO - ├── Learning Rate: 5.49e-05
2025-08-31 07:34:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:35:19 - pico-train - INFO - Step 65700 -- 🔄 Training Metrics
2025-08-31 07:35:19 - pico-train - INFO - ├── Loss: 4.8085
2025-08-31 07:35:19 - pico-train - INFO - ├── Learning Rate: 5.46e-05
2025-08-31 07:35:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:36:30 - pico-train - INFO - Step 65800 -- 🔄 Training Metrics
2025-08-31 07:36:30 - pico-train - INFO - ├── Loss: 4.8058
2025-08-31 07:36:30 - pico-train - INFO - ├── Learning Rate: 5.43e-05
2025-08-31 07:36:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:37:41 - pico-train - INFO - Step 65900 -- 🔄 Training Metrics
2025-08-31 07:37:41 - pico-train - INFO - ├── Loss: 4.7922
2025-08-31 07:37:41 - pico-train - INFO - ├── Learning Rate: 5.40e-05
2025-08-31 07:37:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:38:51 - pico-train - INFO - Step 66000 -- 💾 Saving Checkpoint
2025-08-31 07:40:53 - pico-train - INFO - Step 66000 -- 📊 Evaluation Results
2025-08-31 07:40:53 - pico-train - INFO - └── paloma: inf
2025-08-31 07:40:53 - pico-train - INFO - Step 66000 -- 🔄 Training Metrics
2025-08-31 07:40:53 - pico-train - INFO - ├── Loss: 4.8014
2025-08-31 07:40:53 - pico-train - INFO - ├── Learning Rate: 5.37e-05
2025-08-31 07:40:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:40:53 - pico-train - INFO - Step 66000 -- 📈 Saving Learning Dynamics
2025-08-31 07:42:07 - pico-train - INFO - Step 66100 -- 🔄 Training Metrics
2025-08-31 07:42:07 - pico-train - INFO - ├── Loss: 4.8063
2025-08-31 07:42:07 - pico-train - INFO - ├── Learning Rate: 5.35e-05
2025-08-31 07:42:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:43:18 - pico-train - INFO - Step 66200 -- 🔄 Training Metrics
2025-08-31 07:43:18 - pico-train - INFO - ├── Loss: 4.8139
2025-08-31 07:43:18 - pico-train - INFO - ├── Learning Rate: 5.32e-05
2025-08-31 07:43:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:44:30 - pico-train - INFO - Step 66300 -- 🔄 Training Metrics
2025-08-31 07:44:30 - pico-train - INFO - ├── Loss: 4.8191
2025-08-31 07:44:30 - pico-train - INFO - ├── Learning Rate: 5.29e-05
2025-08-31 07:44:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:45:40 - pico-train - INFO - Step 66400 -- 🔄 Training Metrics
2025-08-31 07:45:40 - pico-train - INFO - ├── Loss: 4.7988
2025-08-31 07:45:40 - pico-train - INFO - ├── Learning Rate: 5.26e-05
2025-08-31 07:45:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:46:52 - pico-train - INFO - Step 66500 -- 🔄 Training Metrics
2025-08-31 07:46:52 - pico-train - INFO - ├── Loss: 4.7927
2025-08-31 07:46:52 - pico-train - INFO - ├── Learning Rate: 5.23e-05
2025-08-31 07:46:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:48:04 - pico-train - INFO - Step 66600 -- 🔄 Training Metrics
2025-08-31 07:48:04 - pico-train - INFO - ├── Loss: 4.8113
2025-08-31 07:48:04 - pico-train - INFO - ├── Learning Rate: 5.20e-05
2025-08-31 07:48:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:49:15 - pico-train - INFO - Step 66700 -- 🔄 Training Metrics
2025-08-31 07:49:15 - pico-train - INFO - ├── Loss: 4.7948
2025-08-31 07:49:15 - pico-train - INFO - ├── Learning Rate: 5.18e-05
2025-08-31 07:49:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:50:25 - pico-train - INFO - Step 66800 -- 🔄 Training Metrics
2025-08-31 07:50:25 - pico-train - INFO - ├── Loss: 4.8223
2025-08-31 07:50:25 - pico-train - INFO - ├── Learning Rate: 5.15e-05
2025-08-31 07:50:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:51:37 - pico-train - INFO - Step 66900 -- 🔄 Training Metrics
2025-08-31 07:51:37 - pico-train - INFO - ├── Loss: 4.7917
2025-08-31 07:51:37 - pico-train - INFO - ├── Learning Rate: 5.12e-05
2025-08-31 07:51:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:52:47 - pico-train - INFO - Step 67000 -- 💾 Saving Checkpoint
2025-08-31 07:54:49 - pico-train - INFO - Step 67000 -- 📊 Evaluation Results
2025-08-31 07:54:49 - pico-train - INFO - └── paloma: inf
2025-08-31 07:54:50 - pico-train - INFO - Step 67000 -- 🔄 Training Metrics
2025-08-31 07:54:50 - pico-train - INFO - ├── Loss: 4.8179
2025-08-31 07:54:50 - pico-train - INFO - ├── Learning Rate: 5.09e-05
2025-08-31 07:54:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:54:50 - pico-train - INFO - Step 67000 -- 📈 Saving Learning Dynamics
2025-08-31 07:56:03 - pico-train - INFO - Step 67100 -- 🔄 Training Metrics
2025-08-31 07:56:03 - pico-train - INFO - ├── Loss: 4.8136
2025-08-31 07:56:03 - pico-train - INFO - ├── Learning Rate: 5.06e-05
2025-08-31 07:56:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:57:14 - pico-train - INFO - Step 67200 -- 🔄 Training Metrics
2025-08-31 07:57:14 - pico-train - INFO - ├── Loss: 4.7937
2025-08-31 07:57:14 - pico-train - INFO - ├── Learning Rate: 5.04e-05
2025-08-31 07:57:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:58:25 - pico-train - INFO - Step 67300 -- 🔄 Training Metrics
2025-08-31 07:58:25 - pico-train - INFO - ├── Loss: 4.7952
2025-08-31 07:58:25 - pico-train - INFO - ├── Learning Rate: 5.01e-05
2025-08-31 07:58:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 07:59:36 - pico-train - INFO - Step 67400 -- 🔄 Training Metrics
2025-08-31 07:59:36 - pico-train - INFO - ├── Loss: 4.7929
2025-08-31 07:59:36 - pico-train - INFO - ├── Learning Rate: 4.98e-05
2025-08-31 07:59:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:00:47 - pico-train - INFO - Step 67500 -- 🔄 Training Metrics
2025-08-31 08:00:47 - pico-train - INFO - ├── Loss: 4.8137
2025-08-31 08:00:47 - pico-train - INFO - ├── Learning Rate: 4.95e-05
2025-08-31 08:00:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:01:58 - pico-train - INFO - Step 67600 -- 🔄 Training Metrics
2025-08-31 08:01:58 - pico-train - INFO - ├── Loss: 4.7972
2025-08-31 08:01:58 - pico-train - INFO - ├── Learning Rate: 4.93e-05
2025-08-31 08:01:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:03:10 - pico-train - INFO - Step 67700 -- 🔄 Training Metrics
2025-08-31 08:03:10 - pico-train - INFO - ├── Loss: 4.8047
2025-08-31 08:03:10 - pico-train - INFO - ├── Learning Rate: 4.90e-05
2025-08-31 08:03:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:04:20 - pico-train - INFO - Step 67800 -- 🔄 Training Metrics
2025-08-31 08:04:20 - pico-train - INFO - ├── Loss: 4.8156
2025-08-31 08:04:20 - pico-train - INFO - ├── Learning Rate: 4.87e-05
2025-08-31 08:04:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:05:32 - pico-train - INFO - Step 67900 -- 🔄 Training Metrics
2025-08-31 08:05:32 - pico-train - INFO - ├── Loss: 4.7833
2025-08-31 08:05:32 - pico-train - INFO - ├── Learning Rate: 4.84e-05
2025-08-31 08:05:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:06:43 - pico-train - INFO - Step 68000 -- 💾 Saving Checkpoint
2025-08-31 08:08:45 - pico-train - INFO - Step 68000 -- 📊 Evaluation Results
2025-08-31 08:08:45 - pico-train - INFO - └── paloma: inf
2025-08-31 08:08:45 - pico-train - INFO - Step 68000 -- 🔄 Training Metrics
2025-08-31 08:08:45 - pico-train - INFO - ├── Loss: 4.8150
2025-08-31 08:08:45 - pico-train - INFO - ├── Learning Rate: 4.82e-05
2025-08-31 08:08:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:08:45 - pico-train - INFO - Step 68000 -- 📈 Saving Learning Dynamics
2025-08-31 08:09:59 - pico-train - INFO - Step 68100 -- 🔄 Training Metrics
2025-08-31 08:09:59 - pico-train - INFO - ├── Loss: 4.8111
2025-08-31 08:09:59 - pico-train - INFO - ├── Learning Rate: 4.79e-05
2025-08-31 08:09:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:11:10 - pico-train - INFO - Step 68200 -- 🔄 Training Metrics
2025-08-31 08:11:10 - pico-train - INFO - ├── Loss: 4.8168
2025-08-31 08:11:10 - pico-train - INFO - ├── Learning Rate: 4.76e-05
2025-08-31 08:11:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:12:21 - pico-train - INFO - Step 68300 -- 🔄 Training Metrics
2025-08-31 08:12:21 - pico-train - INFO - ├── Loss: 4.8166
2025-08-31 08:12:21 - pico-train - INFO - ├── Learning Rate: 4.73e-05
2025-08-31 08:12:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:13:32 - pico-train - INFO - Step 68400 -- 🔄 Training Metrics
2025-08-31 08:13:32 - pico-train - INFO - ├── Loss: 4.8310
2025-08-31 08:13:32 - pico-train - INFO - ├── Learning Rate: 4.71e-05
2025-08-31 08:13:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:14:44 - pico-train - INFO - Step 68500 -- 🔄 Training Metrics
2025-08-31 08:14:44 - pico-train - INFO - ├── Loss: 4.8159
2025-08-31 08:14:44 - pico-train - INFO - ├── Learning Rate: 4.68e-05
2025-08-31 08:14:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:15:55 - pico-train - INFO - Step 68600 -- 🔄 Training Metrics
2025-08-31 08:15:55 - pico-train - INFO - ├── Loss: 4.7912
2025-08-31 08:15:55 - pico-train - INFO - ├── Learning Rate: 4.65e-05
2025-08-31 08:15:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:17:05 - pico-train - INFO - Step 68700 -- 🔄 Training Metrics
2025-08-31 08:17:05 - pico-train - INFO - ├── Loss: 4.7638
2025-08-31 08:17:05 - pico-train - INFO - ├── Learning Rate: 4.63e-05
2025-08-31 08:17:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:18:17 - pico-train - INFO - Step 68800 -- 🔄 Training Metrics
2025-08-31 08:18:17 - pico-train - INFO - ├── Loss: 4.8171
2025-08-31 08:18:17 - pico-train - INFO - ├── Learning Rate: 4.60e-05
2025-08-31 08:18:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:19:28 - pico-train - INFO - Step 68900 -- 🔄 Training Metrics
2025-08-31 08:19:28 - pico-train - INFO - ├── Loss: 4.7935
2025-08-31 08:19:28 - pico-train - INFO - ├── Learning Rate: 4.57e-05
2025-08-31 08:19:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:20:39 - pico-train - INFO - Step 69000 -- 💾 Saving Checkpoint
2025-08-31 08:22:40 - pico-train - INFO - Step 69000 -- 📊 Evaluation Results
2025-08-31 08:22:40 - pico-train - INFO - └── paloma: inf
2025-08-31 08:22:41 - pico-train - INFO - Step 69000 -- 🔄 Training Metrics
2025-08-31 08:22:41 - pico-train - INFO - ├── Loss: 4.8098
2025-08-31 08:22:41 - pico-train - INFO - ├── Learning Rate: 4.54e-05
2025-08-31 08:22:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:22:41 - pico-train - INFO - Step 69000 -- 📈 Saving Learning Dynamics
2025-08-31 08:23:55 - pico-train - INFO - Step 69100 -- 🔄 Training Metrics
2025-08-31 08:23:55 - pico-train - INFO - ├── Loss: 4.8127
2025-08-31 08:23:55 - pico-train - INFO - ├── Learning Rate: 4.52e-05
2025-08-31 08:23:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:25:06 - pico-train - INFO - Step 69200 -- 🔄 Training Metrics
2025-08-31 08:25:06 - pico-train - INFO - ├── Loss: 4.8256
2025-08-31 08:25:06 - pico-train - INFO - ├── Learning Rate: 4.49e-05
2025-08-31 08:25:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:26:17 - pico-train - INFO - Step 69300 -- 🔄 Training Metrics
2025-08-31 08:26:17 - pico-train - INFO - ├── Loss: 4.8021
2025-08-31 08:26:17 - pico-train - INFO - ├── Learning Rate: 4.46e-05
2025-08-31 08:26:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:27:28 - pico-train - INFO - Step 69400 -- 🔄 Training Metrics
2025-08-31 08:27:28 - pico-train - INFO - ├── Loss: 4.8068
2025-08-31 08:27:28 - pico-train - INFO - ├── Learning Rate: 4.44e-05
2025-08-31 08:27:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:28:40 - pico-train - INFO - Step 69500 -- 🔄 Training Metrics
2025-08-31 08:28:40 - pico-train - INFO - ├── Loss: 4.7974
2025-08-31 08:28:40 - pico-train - INFO - ├── Learning Rate: 4.41e-05
2025-08-31 08:28:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:29:50 - pico-train - INFO - Step 69600 -- 🔄 Training Metrics
2025-08-31 08:29:50 - pico-train - INFO - ├── Loss: 4.8162
2025-08-31 08:29:50 - pico-train - INFO - ├── Learning Rate: 4.38e-05
2025-08-31 08:29:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:31:02 - pico-train - INFO - Step 69700 -- 🔄 Training Metrics
2025-08-31 08:31:02 - pico-train - INFO - ├── Loss: 4.8043
2025-08-31 08:31:02 - pico-train - INFO - ├── Learning Rate: 4.36e-05
2025-08-31 08:31:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:32:13 - pico-train - INFO - Step 69800 -- 🔄 Training Metrics
2025-08-31 08:32:13 - pico-train - INFO - ├── Loss: 4.7863
2025-08-31 08:32:13 - pico-train - INFO - ├── Learning Rate: 4.33e-05
2025-08-31 08:32:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:33:24 - pico-train - INFO - Step 69900 -- 🔄 Training Metrics
2025-08-31 08:33:24 - pico-train - INFO - ├── Loss: 4.8059
2025-08-31 08:33:24 - pico-train - INFO - ├── Learning Rate: 4.31e-05
2025-08-31 08:33:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:34:34 - pico-train - INFO - Step 70000 -- 💾 Saving Checkpoint
2025-08-31 08:36:36 - pico-train - INFO - Step 70000 -- 📊 Evaluation Results
2025-08-31 08:36:36 - pico-train - INFO - └── paloma: inf
2025-08-31 08:36:37 - pico-train - INFO - Step 70000 -- 🔄 Training Metrics
2025-08-31 08:36:37 - pico-train - INFO - ├── Loss: 4.7928
2025-08-31 08:36:37 - pico-train - INFO - ├── Learning Rate: 4.28e-05
2025-08-31 08:36:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:36:37 - pico-train - INFO - Step 70000 -- 📈 Saving Learning Dynamics
2025-08-31 08:37:50 - pico-train - INFO - Step 70100 -- 🔄 Training Metrics
2025-08-31 08:37:50 - pico-train - INFO - ├── Loss: 4.8159
2025-08-31 08:37:50 - pico-train - INFO - ├── Learning Rate: 4.25e-05
2025-08-31 08:37:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:39:01 - pico-train - INFO - Step 70200 -- 🔄 Training Metrics
2025-08-31 08:39:01 - pico-train - INFO - ├── Loss: 4.8087
2025-08-31 08:39:01 - pico-train - INFO - ├── Learning Rate: 4.23e-05
2025-08-31 08:39:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:40:13 - pico-train - INFO - Step 70300 -- 🔄 Training Metrics
2025-08-31 08:40:13 - pico-train - INFO - ├── Loss: 4.8081
2025-08-31 08:40:13 - pico-train - INFO - ├── Learning Rate: 4.20e-05
2025-08-31 08:40:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:41:24 - pico-train - INFO - Step 70400 -- 🔄 Training Metrics
2025-08-31 08:41:24 - pico-train - INFO - ├── Loss: 4.7848
2025-08-31 08:41:24 - pico-train - INFO - ├── Learning Rate: 4.17e-05
2025-08-31 08:41:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:42:35 - pico-train - INFO - Step 70500 -- 🔄 Training Metrics
2025-08-31 08:42:35 - pico-train - INFO - ├── Loss: 4.7941
2025-08-31 08:42:35 - pico-train - INFO - ├── Learning Rate: 4.15e-05
2025-08-31 08:42:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:43:46 - pico-train - INFO - Step 70600 -- 🔄 Training Metrics
2025-08-31 08:43:46 - pico-train - INFO - ├── Loss: 4.7965
2025-08-31 08:43:46 - pico-train - INFO - ├── Learning Rate: 4.12e-05
2025-08-31 08:43:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:44:57 - pico-train - INFO - Step 70700 -- 🔄 Training Metrics
2025-08-31 08:44:57 - pico-train - INFO - ├── Loss: 4.7734
2025-08-31 08:44:57 - pico-train - INFO - ├── Learning Rate: 4.10e-05
2025-08-31 08:44:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:46:08 - pico-train - INFO - Step 70800 -- 🔄 Training Metrics
2025-08-31 08:46:08 - pico-train - INFO - ├── Loss: 4.8101
2025-08-31 08:46:08 - pico-train - INFO - ├── Learning Rate: 4.07e-05
2025-08-31 08:46:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:47:19 - pico-train - INFO - Step 70900 -- 🔄 Training Metrics
2025-08-31 08:47:19 - pico-train - INFO - ├── Loss: 4.8094
2025-08-31 08:47:19 - pico-train - INFO - ├── Learning Rate: 4.04e-05
2025-08-31 08:47:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:48:29 - pico-train - INFO - Step 71000 -- 💾 Saving Checkpoint
2025-08-31 08:50:31 - pico-train - INFO - Step 71000 -- 📊 Evaluation Results
2025-08-31 08:50:31 - pico-train - INFO - └── paloma: inf
2025-08-31 08:50:32 - pico-train - INFO - Step 71000 -- 🔄 Training Metrics
2025-08-31 08:50:32 - pico-train - INFO - ├── Loss: 4.8045
2025-08-31 08:50:32 - pico-train - INFO - ├── Learning Rate: 4.02e-05
2025-08-31 08:50:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:50:32 - pico-train - INFO - Step 71000 -- 📈 Saving Learning Dynamics
2025-08-31 08:51:46 - pico-train - INFO - Step 71100 -- 🔄 Training Metrics
2025-08-31 08:51:46 - pico-train - INFO - ├── Loss: 4.7922
2025-08-31 08:51:46 - pico-train - INFO - ├── Learning Rate: 3.99e-05
2025-08-31 08:51:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:52:57 - pico-train - INFO - Step 71200 -- 🔄 Training Metrics
2025-08-31 08:52:57 - pico-train - INFO - ├── Loss: 4.8090
2025-08-31 08:52:57 - pico-train - INFO - ├── Learning Rate: 3.97e-05
2025-08-31 08:52:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:54:08 - pico-train - INFO - Step 71300 -- 🔄 Training Metrics
2025-08-31 08:54:08 - pico-train - INFO - ├── Loss: 4.8051
2025-08-31 08:54:08 - pico-train - INFO - ├── Learning Rate: 3.94e-05
2025-08-31 08:54:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:55:19 - pico-train - INFO - Step 71400 -- 🔄 Training Metrics
2025-08-31 08:55:19 - pico-train - INFO - ├── Loss: 4.7895
2025-08-31 08:55:19 - pico-train - INFO - ├── Learning Rate: 3.92e-05
2025-08-31 08:55:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:56:30 - pico-train - INFO - Step 71500 -- 🔄 Training Metrics
2025-08-31 08:56:30 - pico-train - INFO - ├── Loss: 4.8153
2025-08-31 08:56:30 - pico-train - INFO - ├── Learning Rate: 3.89e-05
2025-08-31 08:56:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:57:41 - pico-train - INFO - Step 71600 -- 🔄 Training Metrics
2025-08-31 08:57:41 - pico-train - INFO - ├── Loss: 4.8103
2025-08-31 08:57:41 - pico-train - INFO - ├── Learning Rate: 3.87e-05
2025-08-31 08:57:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 08:58:53 - pico-train - INFO - Step 71700 -- 🔄 Training Metrics
2025-08-31 08:58:53 - pico-train - INFO - ├── Loss: 4.7847
2025-08-31 08:58:53 - pico-train - INFO - ├── Learning Rate: 3.84e-05
2025-08-31 08:58:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:00:04 - pico-train - INFO - Step 71800 -- 🔄 Training Metrics
2025-08-31 09:00:04 - pico-train - INFO - ├── Loss: 4.7882
2025-08-31 09:00:04 - pico-train - INFO - ├── Learning Rate: 3.82e-05
2025-08-31 09:00:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:01:15 - pico-train - INFO - Step 71900 -- 🔄 Training Metrics
2025-08-31 09:01:15 - pico-train - INFO - ├── Loss: 4.8031
2025-08-31 09:01:15 - pico-train - INFO - ├── Learning Rate: 3.79e-05
2025-08-31 09:01:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:02:25 - pico-train - INFO - Step 72000 -- 💾 Saving Checkpoint
2025-08-31 09:04:27 - pico-train - INFO - Step 72000 -- 📊 Evaluation Results
2025-08-31 09:04:27 - pico-train - INFO - └── paloma: inf
2025-08-31 09:04:27 - pico-train - INFO - Step 72000 -- 🔄 Training Metrics
2025-08-31 09:04:27 - pico-train - INFO - ├── Loss: 4.8049
2025-08-31 09:04:27 - pico-train - INFO - ├── Learning Rate: 3.77e-05
2025-08-31 09:04:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:04:27 - pico-train - INFO - Step 72000 -- 📈 Saving Learning Dynamics
2025-08-31 09:05:41 - pico-train - INFO - Step 72100 -- 🔄 Training Metrics
2025-08-31 09:05:41 - pico-train - INFO - ├── Loss: 4.7998
2025-08-31 09:05:41 - pico-train - INFO - ├── Learning Rate: 3.74e-05
2025-08-31 09:05:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:06:52 - pico-train - INFO - Step 72200 -- 🔄 Training Metrics
2025-08-31 09:06:52 - pico-train - INFO - ├── Loss: 4.8082
2025-08-31 09:06:52 - pico-train - INFO - ├── Learning Rate: 3.72e-05
2025-08-31 09:06:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:08:03 - pico-train - INFO - Step 72300 -- 🔄 Training Metrics
2025-08-31 09:08:03 - pico-train - INFO - ├── Loss: 4.8191
2025-08-31 09:08:03 - pico-train - INFO - ├── Learning Rate: 3.69e-05
2025-08-31 09:08:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:09:14 - pico-train - INFO - Step 72400 -- 🔄 Training Metrics
2025-08-31 09:09:14 - pico-train - INFO - ├── Loss: 4.7900
2025-08-31 09:09:14 - pico-train - INFO - ├── Learning Rate: 3.67e-05
2025-08-31 09:09:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:10:25 - pico-train - INFO - Step 72500 -- 🔄 Training Metrics
2025-08-31 09:10:25 - pico-train - INFO - ├── Loss: 4.7962
2025-08-31 09:10:25 - pico-train - INFO - ├── Learning Rate: 3.64e-05
2025-08-31 09:10:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:11:37 - pico-train - INFO - Step 72600 -- 🔄 Training Metrics
2025-08-31 09:11:37 - pico-train - INFO - ├── Loss: 4.8325
2025-08-31 09:11:37 - pico-train - INFO - ├── Learning Rate: 3.62e-05
2025-08-31 09:11:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:12:48 - pico-train - INFO - Step 72700 -- 🔄 Training Metrics
2025-08-31 09:12:48 - pico-train - INFO - ├── Loss: 4.7872
2025-08-31 09:12:48 - pico-train - INFO - ├── Learning Rate: 3.59e-05
2025-08-31 09:12:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:13:59 - pico-train - INFO - Step 72800 -- 🔄 Training Metrics
2025-08-31 09:13:59 - pico-train - INFO - ├── Loss: 4.7991
2025-08-31 09:13:59 - pico-train - INFO - ├── Learning Rate: 3.57e-05
2025-08-31 09:13:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:15:10 - pico-train - INFO - Step 72900 -- 🔄 Training Metrics
2025-08-31 09:15:10 - pico-train - INFO - ├── Loss: 4.7826
2025-08-31 09:15:10 - pico-train - INFO - ├── Learning Rate: 3.54e-05
2025-08-31 09:15:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:16:21 - pico-train - INFO - Step 73000 -- 💾 Saving Checkpoint
2025-08-31 09:18:23 - pico-train - INFO - Step 73000 -- 📊 Evaluation Results
2025-08-31 09:18:23 - pico-train - INFO - └── paloma: inf
2025-08-31 09:18:24 - pico-train - INFO - Step 73000 -- 🔄 Training Metrics
2025-08-31 09:18:24 - pico-train - INFO - ├── Loss: 4.7869
2025-08-31 09:18:24 - pico-train - INFO - ├── Learning Rate: 3.52e-05
2025-08-31 09:18:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:18:24 - pico-train - INFO - Step 73000 -- 📈 Saving Learning Dynamics
2025-08-31 09:19:37 - pico-train - INFO - Step 73100 -- 🔄 Training Metrics
2025-08-31 09:19:37 - pico-train - INFO - ├── Loss: 4.7872
2025-08-31 09:19:37 - pico-train - INFO - ├── Learning Rate: 3.49e-05
2025-08-31 09:19:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:20:48 - pico-train - INFO - Step 73200 -- 🔄 Training Metrics
2025-08-31 09:20:48 - pico-train - INFO - ├── Loss: 4.7995
2025-08-31 09:20:48 - pico-train - INFO - ├── Learning Rate: 3.47e-05
2025-08-31 09:20:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:22:00 - pico-train - INFO - Step 73300 -- 🔄 Training Metrics
2025-08-31 09:22:00 - pico-train - INFO - ├── Loss: 4.8165
2025-08-31 09:22:00 - pico-train - INFO - ├── Learning Rate: 3.44e-05
2025-08-31 09:22:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:23:11 - pico-train - INFO - Step 73400 -- 🔄 Training Metrics
2025-08-31 09:23:11 - pico-train - INFO - ├── Loss: 4.7786
2025-08-31 09:23:11 - pico-train - INFO - ├── Learning Rate: 3.42e-05
2025-08-31 09:23:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:24:23 - pico-train - INFO - Step 73500 -- 🔄 Training Metrics
2025-08-31 09:24:23 - pico-train - INFO - ├── Loss: 4.7794
2025-08-31 09:24:23 - pico-train - INFO - ├── Learning Rate: 3.40e-05
2025-08-31 09:24:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:25:34 - pico-train - INFO - Step 73600 -- 🔄 Training Metrics
2025-08-31 09:25:34 - pico-train - INFO - ├── Loss: 4.7911
2025-08-31 09:25:34 - pico-train - INFO - ├── Learning Rate: 3.37e-05
2025-08-31 09:25:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:26:44 - pico-train - INFO - Step 73700 -- 🔄 Training Metrics
2025-08-31 09:26:44 - pico-train - INFO - ├── Loss: 4.8029
2025-08-31 09:26:44 - pico-train - INFO - ├── Learning Rate: 3.35e-05
2025-08-31 09:26:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:27:56 - pico-train - INFO - Step 73800 -- 🔄 Training Metrics
2025-08-31 09:27:56 - pico-train - INFO - ├── Loss: 4.7945
2025-08-31 09:27:56 - pico-train - INFO - ├── Learning Rate: 3.32e-05
2025-08-31 09:27:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:29:07 - pico-train - INFO - Step 73900 -- 🔄 Training Metrics
2025-08-31 09:29:07 - pico-train - INFO - ├── Loss: 4.8115
2025-08-31 09:29:07 - pico-train - INFO - ├── Learning Rate: 3.30e-05
2025-08-31 09:29:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:30:18 - pico-train - INFO - Step 74000 -- 💾 Saving Checkpoint
2025-08-31 09:32:20 - pico-train - INFO - Step 74000 -- 📊 Evaluation Results
2025-08-31 09:32:20 - pico-train - INFO - └── paloma: inf
2025-08-31 09:32:20 - pico-train - INFO - Step 74000 -- 🔄 Training Metrics
2025-08-31 09:32:20 - pico-train - INFO - ├── Loss: 4.8107
2025-08-31 09:32:20 - pico-train - INFO - ├── Learning Rate: 3.28e-05
2025-08-31 09:32:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:32:20 - pico-train - INFO - Step 74000 -- 📈 Saving Learning Dynamics
2025-08-31 09:33:34 - pico-train - INFO - Step 74100 -- 🔄 Training Metrics
2025-08-31 09:33:34 - pico-train - INFO - ├── Loss: 4.7847
2025-08-31 09:33:34 - pico-train - INFO - ├── Learning Rate: 3.25e-05
2025-08-31 09:33:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:34:45 - pico-train - INFO - Step 74200 -- 🔄 Training Metrics
2025-08-31 09:34:45 - pico-train - INFO - ├── Loss: 4.7943
2025-08-31 09:34:45 - pico-train - INFO - ├── Learning Rate: 3.23e-05
2025-08-31 09:34:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:35:57 - pico-train - INFO - Step 74300 -- 🔄 Training Metrics
2025-08-31 09:35:57 - pico-train - INFO - ├── Loss: 4.7811
2025-08-31 09:35:57 - pico-train - INFO - ├── Learning Rate: 3.21e-05
2025-08-31 09:35:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:37:08 - pico-train - INFO - Step 74400 -- 🔄 Training Metrics
2025-08-31 09:37:08 - pico-train - INFO - ├── Loss: 4.8010
2025-08-31 09:37:08 - pico-train - INFO - ├── Learning Rate: 3.18e-05
2025-08-31 09:37:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:38:19 - pico-train - INFO - Step 74500 -- 🔄 Training Metrics
2025-08-31 09:38:19 - pico-train - INFO - ├── Loss: 4.7898
2025-08-31 09:38:19 - pico-train - INFO - ├── Learning Rate: 3.16e-05
2025-08-31 09:38:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:39:30 - pico-train - INFO - Step 74600 -- 🔄 Training Metrics
2025-08-31 09:39:30 - pico-train - INFO - ├── Loss: 4.8247
2025-08-31 09:39:30 - pico-train - INFO - ├── Learning Rate: 3.14e-05
2025-08-31 09:39:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:40:41 - pico-train - INFO - Step 74700 -- 🔄 Training Metrics
2025-08-31 09:40:41 - pico-train - INFO - ├── Loss: 4.8034
2025-08-31 09:40:41 - pico-train - INFO - ├── Learning Rate: 3.11e-05
2025-08-31 09:40:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:41:52 - pico-train - INFO - Step 74800 -- 🔄 Training Metrics
2025-08-31 09:41:52 - pico-train - INFO - ├── Loss: 4.7996
2025-08-31 09:41:52 - pico-train - INFO - ├── Learning Rate: 3.09e-05
2025-08-31 09:41:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:43:03 - pico-train - INFO - Step 74900 -- 🔄 Training Metrics
2025-08-31 09:43:03 - pico-train - INFO - ├── Loss: 4.7822
2025-08-31 09:43:03 - pico-train - INFO - ├── Learning Rate: 3.07e-05
2025-08-31 09:43:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:44:14 - pico-train - INFO - Step 75000 -- 💾 Saving Checkpoint
2025-08-31 09:46:15 - pico-train - INFO - Step 75000 -- 📊 Evaluation Results
2025-08-31 09:46:15 - pico-train - INFO - └── paloma: inf
2025-08-31 09:46:16 - pico-train - INFO - Step 75000 -- 🔄 Training Metrics
2025-08-31 09:46:16 - pico-train - INFO - ├── Loss: 4.7939
2025-08-31 09:46:16 - pico-train - INFO - ├── Learning Rate: 3.04e-05
2025-08-31 09:46:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:46:16 - pico-train - INFO - Step 75000 -- 📈 Saving Learning Dynamics
2025-08-31 09:47:29 - pico-train - INFO - Step 75100 -- 🔄 Training Metrics
2025-08-31 09:47:29 - pico-train - INFO - ├── Loss: 4.7925
2025-08-31 09:47:29 - pico-train - INFO - ├── Learning Rate: 3.02e-05
2025-08-31 09:47:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:48:41 - pico-train - INFO - Step 75200 -- 🔄 Training Metrics
2025-08-31 09:48:41 - pico-train - INFO - ├── Loss: 4.7906
2025-08-31 09:48:41 - pico-train - INFO - ├── Learning Rate: 3.00e-05
2025-08-31 09:48:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:49:52 - pico-train - INFO - Step 75300 -- 🔄 Training Metrics
2025-08-31 09:49:52 - pico-train - INFO - ├── Loss: 4.8083
2025-08-31 09:49:52 - pico-train - INFO - ├── Learning Rate: 2.97e-05
2025-08-31 09:49:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:51:03 - pico-train - INFO - Step 75400 -- 🔄 Training Metrics
2025-08-31 09:51:03 - pico-train - INFO - ├── Loss: 4.7914
2025-08-31 09:51:03 - pico-train - INFO - ├── Learning Rate: 2.95e-05
2025-08-31 09:51:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:52:14 - pico-train - INFO - Step 75500 -- 🔄 Training Metrics
2025-08-31 09:52:14 - pico-train - INFO - ├── Loss: 4.7965
2025-08-31 09:52:14 - pico-train - INFO - ├── Learning Rate: 2.93e-05
2025-08-31 09:52:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:53:26 - pico-train - INFO - Step 75600 -- 🔄 Training Metrics
2025-08-31 09:53:26 - pico-train - INFO - ├── Loss: 4.7760
2025-08-31 09:53:26 - pico-train - INFO - ├── Learning Rate: 2.91e-05
2025-08-31 09:53:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:54:37 - pico-train - INFO - Step 75700 -- 🔄 Training Metrics
2025-08-31 09:54:37 - pico-train - INFO - ├── Loss: 4.7893
2025-08-31 09:54:37 - pico-train - INFO - ├── Learning Rate: 2.88e-05
2025-08-31 09:54:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:55:48 - pico-train - INFO - Step 75800 -- 🔄 Training Metrics
2025-08-31 09:55:48 - pico-train - INFO - ├── Loss: 4.7945
2025-08-31 09:55:48 - pico-train - INFO - ├── Learning Rate: 2.86e-05
2025-08-31 09:55:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:56:59 - pico-train - INFO - Step 75900 -- 🔄 Training Metrics
2025-08-31 09:56:59 - pico-train - INFO - ├── Loss: 4.7982
2025-08-31 09:56:59 - pico-train - INFO - ├── Learning Rate: 2.84e-05
2025-08-31 09:56:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 09:58:09 - pico-train - INFO - Step 76000 -- 💾 Saving Checkpoint
2025-08-31 10:00:10 - pico-train - INFO - Step 76000 -- 📊 Evaluation Results
2025-08-31 10:00:10 - pico-train - INFO - └── paloma: inf
2025-08-31 10:00:11 - pico-train - INFO - Step 76000 -- 🔄 Training Metrics
2025-08-31 10:00:11 - pico-train - INFO - ├── Loss: 4.7743
2025-08-31 10:00:11 - pico-train - INFO - ├── Learning Rate: 2.82e-05
2025-08-31 10:00:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:00:11 - pico-train - INFO - Step 76000 -- 📈 Saving Learning Dynamics
2025-08-31 10:01:25 - pico-train - INFO - Step 76100 -- 🔄 Training Metrics
2025-08-31 10:01:25 - pico-train - INFO - ├── Loss: 4.7834
2025-08-31 10:01:25 - pico-train - INFO - ├── Learning Rate: 2.79e-05
2025-08-31 10:01:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:02:37 - pico-train - INFO - Step 76200 -- 🔄 Training Metrics
2025-08-31 10:02:37 - pico-train - INFO - ├── Loss: 4.8036
2025-08-31 10:02:37 - pico-train - INFO - ├── Learning Rate: 2.77e-05
2025-08-31 10:02:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:03:48 - pico-train - INFO - Step 76300 -- 🔄 Training Metrics
2025-08-31 10:03:48 - pico-train - INFO - ├── Loss: 4.8019
2025-08-31 10:03:48 - pico-train - INFO - ├── Learning Rate: 2.75e-05
2025-08-31 10:03:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:04:59 - pico-train - INFO - Step 76400 -- 🔄 Training Metrics
2025-08-31 10:04:59 - pico-train - INFO - ├── Loss: 4.7705
2025-08-31 10:04:59 - pico-train - INFO - ├── Learning Rate: 2.73e-05
2025-08-31 10:04:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:06:10 - pico-train - INFO - Step 76500 -- 🔄 Training Metrics
2025-08-31 10:06:10 - pico-train - INFO - ├── Loss: 4.7851
2025-08-31 10:06:10 - pico-train - INFO - ├── Learning Rate: 2.71e-05
2025-08-31 10:06:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:07:21 - pico-train - INFO - Step 76600 -- 🔄 Training Metrics
2025-08-31 10:07:21 - pico-train - INFO - ├── Loss: 4.8080
2025-08-31 10:07:21 - pico-train - INFO - ├── Learning Rate: 2.68e-05
2025-08-31 10:07:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:08:33 - pico-train - INFO - Step 76700 -- 🔄 Training Metrics
2025-08-31 10:08:33 - pico-train - INFO - ├── Loss: 4.7811
2025-08-31 10:08:33 - pico-train - INFO - ├── Learning Rate: 2.66e-05
2025-08-31 10:08:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:09:45 - pico-train - INFO - Step 76800 -- 🔄 Training Metrics
2025-08-31 10:09:45 - pico-train - INFO - ├── Loss: 4.7781
2025-08-31 10:09:45 - pico-train - INFO - ├── Learning Rate: 2.64e-05
2025-08-31 10:09:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:10:55 - pico-train - INFO - Step 76900 -- 🔄 Training Metrics
2025-08-31 10:10:55 - pico-train - INFO - ├── Loss: 4.7936
2025-08-31 10:10:55 - pico-train - INFO - ├── Learning Rate: 2.62e-05
2025-08-31 10:10:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:12:06 - pico-train - INFO - Step 77000 -- 💾 Saving Checkpoint
2025-08-31 10:14:08 - pico-train - INFO - Step 77000 -- 📊 Evaluation Results
2025-08-31 10:14:08 - pico-train - INFO - └── paloma: inf
2025-08-31 10:14:08 - pico-train - INFO - Step 77000 -- 🔄 Training Metrics
2025-08-31 10:14:08 - pico-train - INFO - ├── Loss: 4.7972
2025-08-31 10:14:08 - pico-train - INFO - ├── Learning Rate: 2.60e-05
2025-08-31 10:14:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:14:08 - pico-train - INFO - Step 77000 -- 📈 Saving Learning Dynamics
2025-08-31 10:15:22 - pico-train - INFO - Step 77100 -- 🔄 Training Metrics
2025-08-31 10:15:22 - pico-train - INFO - ├── Loss: 4.8169
2025-08-31 10:15:22 - pico-train - INFO - ├── Learning Rate: 2.58e-05
2025-08-31 10:15:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:16:33 - pico-train - INFO - Step 77200 -- 🔄 Training Metrics
2025-08-31 10:16:33 - pico-train - INFO - ├── Loss: 4.7975
2025-08-31 10:16:33 - pico-train - INFO - ├── Learning Rate: 2.55e-05
2025-08-31 10:16:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:17:44 - pico-train - INFO - Step 77300 -- 🔄 Training Metrics
2025-08-31 10:17:44 - pico-train - INFO - ├── Loss: 4.7711
2025-08-31 10:17:44 - pico-train - INFO - ├── Learning Rate: 2.53e-05
2025-08-31 10:17:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:18:55 - pico-train - INFO - Step 77400 -- 🔄 Training Metrics
2025-08-31 10:18:55 - pico-train - INFO - ├── Loss: 4.7869
2025-08-31 10:18:55 - pico-train - INFO - ├── Learning Rate: 2.51e-05
2025-08-31 10:18:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:20:06 - pico-train - INFO - Step 77500 -- 🔄 Training Metrics
2025-08-31 10:20:06 - pico-train - INFO - ├── Loss: 4.8086
2025-08-31 10:20:06 - pico-train - INFO - ├── Learning Rate: 2.49e-05
2025-08-31 10:20:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:21:18 - pico-train - INFO - Step 77600 -- 🔄 Training Metrics
2025-08-31 10:21:18 - pico-train - INFO - ├── Loss: 4.7981
2025-08-31 10:21:18 - pico-train - INFO - ├── Learning Rate: 2.47e-05
2025-08-31 10:21:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:22:29 - pico-train - INFO - Step 77700 -- 🔄 Training Metrics
2025-08-31 10:22:29 - pico-train - INFO - ├── Loss: 4.7844
2025-08-31 10:22:29 - pico-train - INFO - ├── Learning Rate: 2.45e-05
2025-08-31 10:22:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:23:40 - pico-train - INFO - Step 77800 -- 🔄 Training Metrics
2025-08-31 10:23:40 - pico-train - INFO - ├── Loss: 4.7779
2025-08-31 10:23:40 - pico-train - INFO - ├── Learning Rate: 2.43e-05
2025-08-31 10:23:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:24:51 - pico-train - INFO - Step 77900 -- 🔄 Training Metrics
2025-08-31 10:24:51 - pico-train - INFO - ├── Loss: 4.7901
2025-08-31 10:24:51 - pico-train - INFO - ├── Learning Rate: 2.41e-05
2025-08-31 10:24:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:26:01 - pico-train - INFO - Step 78000 -- 💾 Saving Checkpoint
2025-08-31 10:28:03 - pico-train - INFO - Step 78000 -- 📊 Evaluation Results
2025-08-31 10:28:03 - pico-train - INFO - └── paloma: inf
2025-08-31 10:28:04 - pico-train - INFO - Step 78000 -- 🔄 Training Metrics
2025-08-31 10:28:04 - pico-train - INFO - ├── Loss: 4.7974
2025-08-31 10:28:04 - pico-train - INFO - ├── Learning Rate: 2.39e-05
2025-08-31 10:28:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:28:04 - pico-train - INFO - Step 78000 -- 📈 Saving Learning Dynamics
2025-08-31 10:29:19 - pico-train - INFO - Step 78100 -- 🔄 Training Metrics
2025-08-31 10:29:19 - pico-train - INFO - ├── Loss: 4.7623
2025-08-31 10:29:19 - pico-train - INFO - ├── Learning Rate: 2.36e-05
2025-08-31 10:29:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:30:30 - pico-train - INFO - Step 78200 -- 🔄 Training Metrics
2025-08-31 10:30:30 - pico-train - INFO - ├── Loss: 4.8048
2025-08-31 10:30:30 - pico-train - INFO - ├── Learning Rate: 2.34e-05
2025-08-31 10:30:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:31:41 - pico-train - INFO - Step 78300 -- 🔄 Training Metrics
2025-08-31 10:31:41 - pico-train - INFO - ├── Loss: 4.7813
2025-08-31 10:31:41 - pico-train - INFO - ├── Learning Rate: 2.32e-05
2025-08-31 10:31:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:32:52 - pico-train - INFO - Step 78400 -- 🔄 Training Metrics
2025-08-31 10:32:52 - pico-train - INFO - ├── Loss: 4.8007
2025-08-31 10:32:52 - pico-train - INFO - ├── Learning Rate: 2.30e-05
2025-08-31 10:32:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:34:03 - pico-train - INFO - Step 78500 -- 🔄 Training Metrics
2025-08-31 10:34:03 - pico-train - INFO - ├── Loss: 4.8197
2025-08-31 10:34:03 - pico-train - INFO - ├── Learning Rate: 2.28e-05
2025-08-31 10:34:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:35:15 - pico-train - INFO - Step 78600 -- 🔄 Training Metrics
2025-08-31 10:35:15 - pico-train - INFO - ├── Loss: 4.7738
2025-08-31 10:35:15 - pico-train - INFO - ├── Learning Rate: 2.26e-05
2025-08-31 10:35:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:36:25 - pico-train - INFO - Step 78700 -- 🔄 Training Metrics
2025-08-31 10:36:25 - pico-train - INFO - ├── Loss: 4.7747
2025-08-31 10:36:25 - pico-train - INFO - ├── Learning Rate: 2.24e-05
2025-08-31 10:36:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:37:37 - pico-train - INFO - Step 78800 -- 🔄 Training Metrics
2025-08-31 10:37:37 - pico-train - INFO - ├── Loss: 4.7643
2025-08-31 10:37:37 - pico-train - INFO - ├── Learning Rate: 2.22e-05
2025-08-31 10:37:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:38:48 - pico-train - INFO - Step 78900 -- 🔄 Training Metrics
2025-08-31 10:38:48 - pico-train - INFO - ├── Loss: 4.7826
2025-08-31 10:38:48 - pico-train - INFO - ├── Learning Rate: 2.20e-05
2025-08-31 10:38:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:39:59 - pico-train - INFO - Step 79000 -- 💾 Saving Checkpoint
2025-08-31 10:42:00 - pico-train - INFO - Step 79000 -- 📊 Evaluation Results
2025-08-31 10:42:00 - pico-train - INFO - └── paloma: inf
2025-08-31 10:42:01 - pico-train - INFO - Step 79000 -- 🔄 Training Metrics
2025-08-31 10:42:01 - pico-train - INFO - ├── Loss: 4.7848
2025-08-31 10:42:01 - pico-train - INFO - ├── Learning Rate: 2.18e-05
2025-08-31 10:42:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:42:01 - pico-train - INFO - Step 79000 -- 📈 Saving Learning Dynamics
2025-08-31 10:43:15 - pico-train - INFO - Step 79100 -- 🔄 Training Metrics
2025-08-31 10:43:15 - pico-train - INFO - ├── Loss: 4.7719
2025-08-31 10:43:15 - pico-train - INFO - ├── Learning Rate: 2.16e-05
2025-08-31 10:43:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:44:26 - pico-train - INFO - Step 79200 -- 🔄 Training Metrics
2025-08-31 10:44:26 - pico-train - INFO - ├── Loss: 4.7832
2025-08-31 10:44:26 - pico-train - INFO - ├── Learning Rate: 2.14e-05
2025-08-31 10:44:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:45:37 - pico-train - INFO - Step 79300 -- 🔄 Training Metrics
2025-08-31 10:45:37 - pico-train - INFO - ├── Loss: 4.8105
2025-08-31 10:45:37 - pico-train - INFO - ├── Learning Rate: 2.12e-05
2025-08-31 10:45:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:46:50 - pico-train - INFO - Step 79400 -- 🔄 Training Metrics
2025-08-31 10:46:50 - pico-train - INFO - ├── Loss: 4.7942
2025-08-31 10:46:50 - pico-train - INFO - ├── Learning Rate: 2.10e-05
2025-08-31 10:46:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:48:01 - pico-train - INFO - Step 79500 -- 🔄 Training Metrics
2025-08-31 10:48:01 - pico-train - INFO - ├── Loss: 4.7787
2025-08-31 10:48:01 - pico-train - INFO - ├── Learning Rate: 2.08e-05
2025-08-31 10:48:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:49:12 - pico-train - INFO - Step 79600 -- 🔄 Training Metrics
2025-08-31 10:49:12 - pico-train - INFO - ├── Loss: 4.8027
2025-08-31 10:49:12 - pico-train - INFO - ├── Learning Rate: 2.06e-05
2025-08-31 10:49:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:50:23 - pico-train - INFO - Step 79700 -- 🔄 Training Metrics
2025-08-31 10:50:23 - pico-train - INFO - ├── Loss: 4.7814
2025-08-31 10:50:23 - pico-train - INFO - ├── Learning Rate: 2.04e-05
2025-08-31 10:50:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:51:34 - pico-train - INFO - Step 79800 -- 🔄 Training Metrics
2025-08-31 10:51:34 - pico-train - INFO - ├── Loss: 4.7827
2025-08-31 10:51:34 - pico-train - INFO - ├── Learning Rate: 2.02e-05
2025-08-31 10:51:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:52:45 - pico-train - INFO - Step 79900 -- 🔄 Training Metrics
2025-08-31 10:52:45 - pico-train - INFO - ├── Loss: 4.7776
2025-08-31 10:52:45 - pico-train - INFO - ├── Learning Rate: 2.01e-05
2025-08-31 10:52:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:53:55 - pico-train - INFO - Step 80000 -- 💾 Saving Checkpoint
2025-08-31 10:55:57 - pico-train - INFO - Step 80000 -- 📊 Evaluation Results
2025-08-31 10:55:57 - pico-train - INFO - └── paloma: inf
2025-08-31 10:55:58 - pico-train - INFO - Step 80000 -- 🔄 Training Metrics
2025-08-31 10:55:58 - pico-train - INFO - ├── Loss: 4.7796
2025-08-31 10:55:58 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 10:55:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:55:58 - pico-train - INFO - Step 80000 -- 📈 Saving Learning Dynamics
2025-08-31 10:57:11 - pico-train - INFO - Step 80100 -- 🔄 Training Metrics
2025-08-31 10:57:11 - pico-train - INFO - ├── Loss: 4.8003
2025-08-31 10:57:11 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 10:57:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:58:23 - pico-train - INFO - Step 80200 -- 🔄 Training Metrics
2025-08-31 10:58:23 - pico-train - INFO - ├── Loss: 4.8047
2025-08-31 10:58:23 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 10:58:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 10:59:34 - pico-train - INFO - Step 80300 -- 🔄 Training Metrics
2025-08-31 10:59:34 - pico-train - INFO - ├── Loss: 4.7894
2025-08-31 10:59:34 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 10:59:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:00:46 - pico-train - INFO - Step 80400 -- 🔄 Training Metrics
2025-08-31 11:00:46 - pico-train - INFO - ├── Loss: 4.7800
2025-08-31 11:00:46 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:00:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:01:57 - pico-train - INFO - Step 80500 -- 🔄 Training Metrics
2025-08-31 11:01:57 - pico-train - INFO - ├── Loss: 4.7781
2025-08-31 11:01:57 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:01:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:03:07 - pico-train - INFO - Step 80600 -- 🔄 Training Metrics
2025-08-31 11:03:07 - pico-train - INFO - ├── Loss: 4.7873
2025-08-31 11:03:07 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:03:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:04:20 - pico-train - INFO - Step 80700 -- 🔄 Training Metrics
2025-08-31 11:04:20 - pico-train - INFO - ├── Loss: 4.7820
2025-08-31 11:04:20 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:04:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:05:31 - pico-train - INFO - Step 80800 -- 🔄 Training Metrics
2025-08-31 11:05:31 - pico-train - INFO - ├── Loss: 4.7757
2025-08-31 11:05:31 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:05:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:06:42 - pico-train - INFO - Step 80900 -- 🔄 Training Metrics
2025-08-31 11:06:42 - pico-train - INFO - ├── Loss: 4.7693
2025-08-31 11:06:42 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:06:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:07:52 - pico-train - INFO - Step 81000 -- 💾 Saving Checkpoint
2025-08-31 11:09:54 - pico-train - INFO - Step 81000 -- 📊 Evaluation Results
2025-08-31 11:09:54 - pico-train - INFO - └── paloma: inf
2025-08-31 11:09:54 - pico-train - INFO - Step 81000 -- 🔄 Training Metrics
2025-08-31 11:09:54 - pico-train - INFO - ├── Loss: 4.8021
2025-08-31 11:09:54 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:09:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:09:54 - pico-train - INFO - Step 81000 -- 📈 Saving Learning Dynamics
2025-08-31 11:11:08 - pico-train - INFO - Step 81100 -- 🔄 Training Metrics
2025-08-31 11:11:08 - pico-train - INFO - ├── Loss: 4.7742
2025-08-31 11:11:08 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:11:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:12:19 - pico-train - INFO - Step 81200 -- 🔄 Training Metrics
2025-08-31 11:12:19 - pico-train - INFO - ├── Loss: 4.7801
2025-08-31 11:12:19 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:12:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:13:31 - pico-train - INFO - Step 81300 -- 🔄 Training Metrics
2025-08-31 11:13:31 - pico-train - INFO - ├── Loss: 4.7931
2025-08-31 11:13:31 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:13:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:14:42 - pico-train - INFO - Step 81400 -- 🔄 Training Metrics
2025-08-31 11:14:42 - pico-train - INFO - ├── Loss: 4.7744
2025-08-31 11:14:42 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:14:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:15:53 - pico-train - INFO - Step 81500 -- 🔄 Training Metrics
2025-08-31 11:15:53 - pico-train - INFO - ├── Loss: 4.7913
2025-08-31 11:15:53 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:15:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:17:04 - pico-train - INFO - Step 81600 -- 🔄 Training Metrics
2025-08-31 11:17:04 - pico-train - INFO - ├── Loss: 4.7748
2025-08-31 11:17:04 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:17:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:18:16 - pico-train - INFO - Step 81700 -- 🔄 Training Metrics
2025-08-31 11:18:16 - pico-train - INFO - ├── Loss: 4.7952
2025-08-31 11:18:16 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:18:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:19:27 - pico-train - INFO - Step 81800 -- 🔄 Training Metrics
2025-08-31 11:19:27 - pico-train - INFO - ├── Loss: 4.7844
2025-08-31 11:19:27 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:19:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:20:38 - pico-train - INFO - Step 81900 -- 🔄 Training Metrics
2025-08-31 11:20:38 - pico-train - INFO - ├── Loss: 4.7869
2025-08-31 11:20:38 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:20:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:21:49 - pico-train - INFO - Step 82000 -- 💾 Saving Checkpoint
2025-08-31 11:23:50 - pico-train - INFO - Step 82000 -- 📊 Evaluation Results
2025-08-31 11:23:50 - pico-train - INFO - └── paloma: inf
2025-08-31 11:23:51 - pico-train - INFO - Step 82000 -- 🔄 Training Metrics
2025-08-31 11:23:51 - pico-train - INFO - ├── Loss: 4.7823
2025-08-31 11:23:51 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:23:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:23:51 - pico-train - INFO - Step 82000 -- 📈 Saving Learning Dynamics
2025-08-31 11:25:05 - pico-train - INFO - Step 82100 -- 🔄 Training Metrics
2025-08-31 11:25:05 - pico-train - INFO - ├── Loss: 4.7911
2025-08-31 11:25:05 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:25:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:26:16 - pico-train - INFO - Step 82200 -- 🔄 Training Metrics
2025-08-31 11:26:16 - pico-train - INFO - ├── Loss: 4.7879
2025-08-31 11:26:16 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:26:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:27:27 - pico-train - INFO - Step 82300 -- 🔄 Training Metrics
2025-08-31 11:27:27 - pico-train - INFO - ├── Loss: 4.7873
2025-08-31 11:27:27 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:27:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:28:38 - pico-train - INFO - Step 82400 -- 🔄 Training Metrics
2025-08-31 11:28:38 - pico-train - INFO - ├── Loss: 4.7660
2025-08-31 11:28:38 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:28:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:29:49 - pico-train - INFO - Step 82500 -- 🔄 Training Metrics
2025-08-31 11:29:49 - pico-train - INFO - ├── Loss: 4.7860
2025-08-31 11:29:49 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:29:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:31:00 - pico-train - INFO - Step 82600 -- 🔄 Training Metrics
2025-08-31 11:31:00 - pico-train - INFO - ├── Loss: 4.8142
2025-08-31 11:31:00 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:31:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:32:11 - pico-train - INFO - Step 82700 -- 🔄 Training Metrics
2025-08-31 11:32:11 - pico-train - INFO - ├── Loss: 4.7781
2025-08-31 11:32:11 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:32:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:33:23 - pico-train - INFO - Step 82800 -- 🔄 Training Metrics
2025-08-31 11:33:23 - pico-train - INFO - ├── Loss: 4.7692
2025-08-31 11:33:23 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:33:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:34:33 - pico-train - INFO - Step 82900 -- 🔄 Training Metrics
2025-08-31 11:34:33 - pico-train - INFO - ├── Loss: 4.7853
2025-08-31 11:34:33 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:34:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:35:44 - pico-train - INFO - Step 83000 -- 💾 Saving Checkpoint
2025-08-31 11:37:45 - pico-train - INFO - Step 83000 -- 📊 Evaluation Results
2025-08-31 11:37:45 - pico-train - INFO - └── paloma: inf
2025-08-31 11:37:46 - pico-train - INFO - Step 83000 -- 🔄 Training Metrics
2025-08-31 11:37:46 - pico-train - INFO - ├── Loss: 4.7917
2025-08-31 11:37:46 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:37:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:37:46 - pico-train - INFO - Step 83000 -- 📈 Saving Learning Dynamics
2025-08-31 11:39:00 - pico-train - INFO - Step 83100 -- 🔄 Training Metrics
2025-08-31 11:39:00 - pico-train - INFO - ├── Loss: 4.7694
2025-08-31 11:39:00 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:39:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:40:12 - pico-train - INFO - Step 83200 -- 🔄 Training Metrics
2025-08-31 11:40:12 - pico-train - INFO - ├── Loss: 4.7764
2025-08-31 11:40:12 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:40:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:41:23 - pico-train - INFO - Step 83300 -- 🔄 Training Metrics
2025-08-31 11:41:23 - pico-train - INFO - ├── Loss: 4.7760
2025-08-31 11:41:23 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:41:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:42:34 - pico-train - INFO - Step 83400 -- 🔄 Training Metrics
2025-08-31 11:42:34 - pico-train - INFO - ├── Loss: 4.7765
2025-08-31 11:42:34 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:42:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:43:45 - pico-train - INFO - Step 83500 -- 🔄 Training Metrics
2025-08-31 11:43:45 - pico-train - INFO - ├── Loss: 4.7858
2025-08-31 11:43:45 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:43:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:44:56 - pico-train - INFO - Step 83600 -- 🔄 Training Metrics
2025-08-31 11:44:56 - pico-train - INFO - ├── Loss: 4.7846
2025-08-31 11:44:56 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:44:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:46:08 - pico-train - INFO - Step 83700 -- 🔄 Training Metrics
2025-08-31 11:46:08 - pico-train - INFO - ├── Loss: 4.7743
2025-08-31 11:46:08 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:46:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:47:19 - pico-train - INFO - Step 83800 -- 🔄 Training Metrics
2025-08-31 11:47:19 - pico-train - INFO - ├── Loss: 4.7898
2025-08-31 11:47:19 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:47:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:48:30 - pico-train - INFO - Step 83900 -- 🔄 Training Metrics
2025-08-31 11:48:30 - pico-train - INFO - ├── Loss: 4.7867
2025-08-31 11:48:30 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:48:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:49:40 - pico-train - INFO - Step 84000 -- 💾 Saving Checkpoint
2025-08-31 11:51:42 - pico-train - INFO - Step 84000 -- 📊 Evaluation Results
2025-08-31 11:51:42 - pico-train - INFO - └── paloma: inf
2025-08-31 11:51:43 - pico-train - INFO - Step 84000 -- 🔄 Training Metrics
2025-08-31 11:51:43 - pico-train - INFO - ├── Loss: 4.7616
2025-08-31 11:51:43 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:51:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:51:43 - pico-train - INFO - Step 84000 -- 📈 Saving Learning Dynamics
2025-08-31 11:52:56 - pico-train - INFO - Step 84100 -- 🔄 Training Metrics
2025-08-31 11:52:56 - pico-train - INFO - ├── Loss: 4.7942
2025-08-31 11:52:56 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:52:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:54:08 - pico-train - INFO - Step 84200 -- 🔄 Training Metrics
2025-08-31 11:54:08 - pico-train - INFO - ├── Loss: 4.7937
2025-08-31 11:54:08 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:54:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:55:18 - pico-train - INFO - Step 84300 -- 🔄 Training Metrics
2025-08-31 11:55:18 - pico-train - INFO - ├── Loss: 4.7635
2025-08-31 11:55:18 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:55:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:56:29 - pico-train - INFO - Step 84400 -- 🔄 Training Metrics
2025-08-31 11:56:29 - pico-train - INFO - ├── Loss: 4.7841
2025-08-31 11:56:29 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:56:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:57:41 - pico-train - INFO - Step 84500 -- 🔄 Training Metrics
2025-08-31 11:57:41 - pico-train - INFO - ├── Loss: 4.7763
2025-08-31 11:57:41 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:57:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 11:58:52 - pico-train - INFO - Step 84600 -- 🔄 Training Metrics
2025-08-31 11:58:52 - pico-train - INFO - ├── Loss: 4.7988
2025-08-31 11:58:52 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 11:58:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:00:03 - pico-train - INFO - Step 84700 -- 🔄 Training Metrics
2025-08-31 12:00:03 - pico-train - INFO - ├── Loss: 4.7889
2025-08-31 12:00:03 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:00:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:01:14 - pico-train - INFO - Step 84800 -- 🔄 Training Metrics
2025-08-31 12:01:14 - pico-train - INFO - ├── Loss: 4.7821
2025-08-31 12:01:14 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:01:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:02:26 - pico-train - INFO - Step 84900 -- 🔄 Training Metrics
2025-08-31 12:02:26 - pico-train - INFO - ├── Loss: 4.7949
2025-08-31 12:02:26 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:02:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:03:36 - pico-train - INFO - Step 85000 -- 💾 Saving Checkpoint
2025-08-31 12:05:38 - pico-train - INFO - Step 85000 -- 📊 Evaluation Results
2025-08-31 12:05:38 - pico-train - INFO - └── paloma: inf
2025-08-31 12:05:38 - pico-train - INFO - Step 85000 -- 🔄 Training Metrics
2025-08-31 12:05:38 - pico-train - INFO - ├── Loss: 4.8050
2025-08-31 12:05:38 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:05:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:05:38 - pico-train - INFO - Step 85000 -- 📈 Saving Learning Dynamics
2025-08-31 12:06:52 - pico-train - INFO - Step 85100 -- 🔄 Training Metrics
2025-08-31 12:06:52 - pico-train - INFO - ├── Loss: 4.7980
2025-08-31 12:06:52 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:06:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:08:03 - pico-train - INFO - Step 85200 -- 🔄 Training Metrics
2025-08-31 12:08:03 - pico-train - INFO - ├── Loss: 4.8045
2025-08-31 12:08:03 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:08:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:09:14 - pico-train - INFO - Step 85300 -- 🔄 Training Metrics
2025-08-31 12:09:14 - pico-train - INFO - ├── Loss: 4.7894
2025-08-31 12:09:14 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:09:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:10:26 - pico-train - INFO - Step 85400 -- 🔄 Training Metrics
2025-08-31 12:10:26 - pico-train - INFO - ├── Loss: 4.7978
2025-08-31 12:10:26 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:10:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:11:37 - pico-train - INFO - Step 85500 -- 🔄 Training Metrics
2025-08-31 12:11:37 - pico-train - INFO - ├── Loss: 4.7824
2025-08-31 12:11:37 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:11:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:12:48 - pico-train - INFO - Step 85600 -- 🔄 Training Metrics
2025-08-31 12:12:48 - pico-train - INFO - ├── Loss: 4.7912
2025-08-31 12:12:48 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:12:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:13:59 - pico-train - INFO - Step 85700 -- 🔄 Training Metrics
2025-08-31 12:13:59 - pico-train - INFO - ├── Loss: 4.7705
2025-08-31 12:13:59 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:13:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:15:10 - pico-train - INFO - Step 85800 -- 🔄 Training Metrics
2025-08-31 12:15:10 - pico-train - INFO - ├── Loss: 4.7781
2025-08-31 12:15:10 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:15:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:16:22 - pico-train - INFO - Step 85900 -- 🔄 Training Metrics
2025-08-31 12:16:22 - pico-train - INFO - ├── Loss: 4.8135
2025-08-31 12:16:22 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:16:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:17:32 - pico-train - INFO - Step 86000 -- 💾 Saving Checkpoint
2025-08-31 12:19:33 - pico-train - INFO - Step 86000 -- 📊 Evaluation Results
2025-08-31 12:19:33 - pico-train - INFO - └── paloma: inf
2025-08-31 12:19:34 - pico-train - INFO - Step 86000 -- 🔄 Training Metrics
2025-08-31 12:19:34 - pico-train - INFO - ├── Loss: 4.7768
2025-08-31 12:19:34 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:19:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:19:34 - pico-train - INFO - Step 86000 -- 📈 Saving Learning Dynamics
2025-08-31 12:20:48 - pico-train - INFO - Step 86100 -- 🔄 Training Metrics
2025-08-31 12:20:48 - pico-train - INFO - ├── Loss: 4.7966
2025-08-31 12:20:48 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:20:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:21:59 - pico-train - INFO - Step 86200 -- 🔄 Training Metrics
2025-08-31 12:21:59 - pico-train - INFO - ├── Loss: 4.7856
2025-08-31 12:21:59 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:21:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:23:10 - pico-train - INFO - Step 86300 -- 🔄 Training Metrics
2025-08-31 12:23:10 - pico-train - INFO - ├── Loss: 4.7666
2025-08-31 12:23:10 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:23:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:24:21 - pico-train - INFO - Step 86400 -- 🔄 Training Metrics
2025-08-31 12:24:21 - pico-train - INFO - ├── Loss: 4.8334
2025-08-31 12:24:21 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:24:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:25:32 - pico-train - INFO - Step 86500 -- 🔄 Training Metrics
2025-08-31 12:25:32 - pico-train - INFO - ├── Loss: 4.7574
2025-08-31 12:25:32 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:25:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:26:43 - pico-train - INFO - Step 86600 -- 🔄 Training Metrics
2025-08-31 12:26:43 - pico-train - INFO - ├── Loss: 4.7646
2025-08-31 12:26:43 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:26:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:27:54 - pico-train - INFO - Step 86700 -- 🔄 Training Metrics
2025-08-31 12:27:54 - pico-train - INFO - ├── Loss: 4.7596
2025-08-31 12:27:54 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:27:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:29:06 - pico-train - INFO - Step 86800 -- 🔄 Training Metrics
2025-08-31 12:29:06 - pico-train - INFO - ├── Loss: 4.7842
2025-08-31 12:29:06 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:29:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:30:17 - pico-train - INFO - Step 86900 -- 🔄 Training Metrics
2025-08-31 12:30:17 - pico-train - INFO - ├── Loss: 4.7731
2025-08-31 12:30:17 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:30:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:31:27 - pico-train - INFO - Step 87000 -- 💾 Saving Checkpoint
2025-08-31 12:33:28 - pico-train - INFO - Step 87000 -- 📊 Evaluation Results
2025-08-31 12:33:28 - pico-train - INFO - └── paloma: inf
2025-08-31 12:33:29 - pico-train - INFO - Step 87000 -- 🔄 Training Metrics
2025-08-31 12:33:29 - pico-train - INFO - ├── Loss: 4.7844
2025-08-31 12:33:29 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:33:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:33:29 - pico-train - INFO - Step 87000 -- 📈 Saving Learning Dynamics
2025-08-31 12:34:43 - pico-train - INFO - Step 87100 -- 🔄 Training Metrics
2025-08-31 12:34:43 - pico-train - INFO - ├── Loss: 4.7705
2025-08-31 12:34:43 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:34:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:35:54 - pico-train - INFO - Step 87200 -- 🔄 Training Metrics
2025-08-31 12:35:54 - pico-train - INFO - ├── Loss: 4.7808
2025-08-31 12:35:54 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:35:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:37:06 - pico-train - INFO - Step 87300 -- 🔄 Training Metrics
2025-08-31 12:37:06 - pico-train - INFO - ├── Loss: 4.7454
2025-08-31 12:37:06 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:37:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:38:16 - pico-train - INFO - Step 87400 -- 🔄 Training Metrics
2025-08-31 12:38:16 - pico-train - INFO - ├── Loss: 4.7594
2025-08-31 12:38:16 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:38:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:39:28 - pico-train - INFO - Step 87500 -- 🔄 Training Metrics
2025-08-31 12:39:28 - pico-train - INFO - ├── Loss: 4.7863
2025-08-31 12:39:28 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:39:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:40:39 - pico-train - INFO - Step 87600 -- 🔄 Training Metrics
2025-08-31 12:40:39 - pico-train - INFO - ├── Loss: 4.7863
2025-08-31 12:40:39 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:40:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:41:50 - pico-train - INFO - Step 87700 -- 🔄 Training Metrics
2025-08-31 12:41:50 - pico-train - INFO - ├── Loss: 4.7821
2025-08-31 12:41:50 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:41:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:43:01 - pico-train - INFO - Step 87800 -- 🔄 Training Metrics
2025-08-31 12:43:01 - pico-train - INFO - ├── Loss: 4.7758
2025-08-31 12:43:01 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:43:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:44:12 - pico-train - INFO - Step 87900 -- 🔄 Training Metrics
2025-08-31 12:44:12 - pico-train - INFO - ├── Loss: 4.7932
2025-08-31 12:44:12 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:44:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:45:22 - pico-train - INFO - Step 88000 -- 💾 Saving Checkpoint
2025-08-31 12:47:24 - pico-train - INFO - Step 88000 -- 📊 Evaluation Results
2025-08-31 12:47:24 - pico-train - INFO - └── paloma: inf
2025-08-31 12:47:24 - pico-train - INFO - Step 88000 -- 🔄 Training Metrics
2025-08-31 12:47:24 - pico-train - INFO - ├── Loss: 4.7787
2025-08-31 12:47:24 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:47:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:47:24 - pico-train - INFO - Step 88000 -- 📈 Saving Learning Dynamics
2025-08-31 12:48:38 - pico-train - INFO - Step 88100 -- 🔄 Training Metrics
2025-08-31 12:48:38 - pico-train - INFO - ├── Loss: 4.8045
2025-08-31 12:48:38 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:48:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:49:49 - pico-train - INFO - Step 88200 -- 🔄 Training Metrics
2025-08-31 12:49:49 - pico-train - INFO - ├── Loss: 4.7800
2025-08-31 12:49:49 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:49:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:51:00 - pico-train - INFO - Step 88300 -- 🔄 Training Metrics
2025-08-31 12:51:00 - pico-train - INFO - ├── Loss: 4.7828
2025-08-31 12:51:00 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:51:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:52:12 - pico-train - INFO - Step 88400 -- 🔄 Training Metrics
2025-08-31 12:52:12 - pico-train - INFO - ├── Loss: 4.7886
2025-08-31 12:52:12 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:52:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:53:23 - pico-train - INFO - Step 88500 -- 🔄 Training Metrics
2025-08-31 12:53:23 - pico-train - INFO - ├── Loss: 4.7917
2025-08-31 12:53:23 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:53:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:54:34 - pico-train - INFO - Step 88600 -- 🔄 Training Metrics
2025-08-31 12:54:34 - pico-train - INFO - ├── Loss: 4.7897
2025-08-31 12:54:34 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:54:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:55:45 - pico-train - INFO - Step 88700 -- 🔄 Training Metrics
2025-08-31 12:55:45 - pico-train - INFO - ├── Loss: 4.7861
2025-08-31 12:55:45 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:55:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:56:55 - pico-train - INFO - Step 88800 -- 🔄 Training Metrics
2025-08-31 12:56:55 - pico-train - INFO - ├── Loss: 4.7767
2025-08-31 12:56:55 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:56:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:58:07 - pico-train - INFO - Step 88900 -- 🔄 Training Metrics
2025-08-31 12:58:07 - pico-train - INFO - ├── Loss: 4.7696
2025-08-31 12:58:07 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 12:58:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 12:59:17 - pico-train - INFO - Step 89000 -- 💾 Saving Checkpoint
2025-08-31 13:01:18 - pico-train - INFO - Step 89000 -- 📊 Evaluation Results
2025-08-31 13:01:18 - pico-train - INFO - └── paloma: inf
2025-08-31 13:01:19 - pico-train - INFO - Step 89000 -- 🔄 Training Metrics
2025-08-31 13:01:19 - pico-train - INFO - ├── Loss: 4.7836
2025-08-31 13:01:19 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:01:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:01:19 - pico-train - INFO - Step 89000 -- 📈 Saving Learning Dynamics
2025-08-31 13:02:33 - pico-train - INFO - Step 89100 -- 🔄 Training Metrics
2025-08-31 13:02:33 - pico-train - INFO - ├── Loss: 4.7812
2025-08-31 13:02:33 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:02:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:03:44 - pico-train - INFO - Step 89200 -- 🔄 Training Metrics
2025-08-31 13:03:44 - pico-train - INFO - ├── Loss: 4.7896
2025-08-31 13:03:44 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:03:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:04:55 - pico-train - INFO - Step 89300 -- 🔄 Training Metrics
2025-08-31 13:04:55 - pico-train - INFO - ├── Loss: 4.7898
2025-08-31 13:04:55 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:04:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:06:06 - pico-train - INFO - Step 89400 -- 🔄 Training Metrics
2025-08-31 13:06:06 - pico-train - INFO - ├── Loss: 4.7914
2025-08-31 13:06:06 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:06:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:07:17 - pico-train - INFO - Step 89500 -- 🔄 Training Metrics
2025-08-31 13:07:17 - pico-train - INFO - ├── Loss: 4.7922
2025-08-31 13:07:17 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:07:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:08:29 - pico-train - INFO - Step 89600 -- 🔄 Training Metrics
2025-08-31 13:08:29 - pico-train - INFO - ├── Loss: 4.7673
2025-08-31 13:08:29 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:08:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:09:40 - pico-train - INFO - Step 89700 -- 🔄 Training Metrics
2025-08-31 13:09:40 - pico-train - INFO - ├── Loss: 4.7810
2025-08-31 13:09:40 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:09:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:10:51 - pico-train - INFO - Step 89800 -- 🔄 Training Metrics
2025-08-31 13:10:51 - pico-train - INFO - ├── Loss: 4.7763
2025-08-31 13:10:51 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:10:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:12:02 - pico-train - INFO - Step 89900 -- 🔄 Training Metrics
2025-08-31 13:12:02 - pico-train - INFO - ├── Loss: 4.7684
2025-08-31 13:12:02 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:12:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:13:12 - pico-train - INFO - Step 90000 -- 💾 Saving Checkpoint
2025-08-31 13:15:13 - pico-train - INFO - Step 90000 -- 📊 Evaluation Results
2025-08-31 13:15:13 - pico-train - INFO - └── paloma: inf
2025-08-31 13:15:14 - pico-train - INFO - Step 90000 -- 🔄 Training Metrics
2025-08-31 13:15:14 - pico-train - INFO - ├── Loss: 4.7821
2025-08-31 13:15:14 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:15:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:15:14 - pico-train - INFO - Step 90000 -- 📈 Saving Learning Dynamics
2025-08-31 13:16:28 - pico-train - INFO - Step 90100 -- 🔄 Training Metrics
2025-08-31 13:16:28 - pico-train - INFO - ├── Loss: 4.7878
2025-08-31 13:16:28 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:16:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:17:38 - pico-train - INFO - Step 90200 -- 🔄 Training Metrics
2025-08-31 13:17:38 - pico-train - INFO - ├── Loss: 4.7773
2025-08-31 13:17:38 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:17:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:18:49 - pico-train - INFO - Step 90300 -- 🔄 Training Metrics
2025-08-31 13:18:49 - pico-train - INFO - ├── Loss: 4.7771
2025-08-31 13:18:49 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:18:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:20:00 - pico-train - INFO - Step 90400 -- 🔄 Training Metrics
2025-08-31 13:20:00 - pico-train - INFO - ├── Loss: 4.7744
2025-08-31 13:20:00 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:20:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:21:11 - pico-train - INFO - Step 90500 -- 🔄 Training Metrics
2025-08-31 13:21:11 - pico-train - INFO - ├── Loss: 4.7964
2025-08-31 13:21:11 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:21:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:22:21 - pico-train - INFO - Step 90600 -- 🔄 Training Metrics
2025-08-31 13:22:21 - pico-train - INFO - ├── Loss: 4.7864
2025-08-31 13:22:21 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:22:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:23:32 - pico-train - INFO - Step 90700 -- 🔄 Training Metrics
2025-08-31 13:23:32 - pico-train - INFO - ├── Loss: 4.7873
2025-08-31 13:23:32 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:23:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:24:43 - pico-train - INFO - Step 90800 -- 🔄 Training Metrics
2025-08-31 13:24:43 - pico-train - INFO - ├── Loss: 4.7767
2025-08-31 13:24:43 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:24:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:25:54 - pico-train - INFO - Step 90900 -- 🔄 Training Metrics
2025-08-31 13:25:54 - pico-train - INFO - ├── Loss: 4.7846
2025-08-31 13:25:54 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:25:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:27:04 - pico-train - INFO - Step 91000 -- 💾 Saving Checkpoint
2025-08-31 13:29:06 - pico-train - INFO - Step 91000 -- 📊 Evaluation Results
2025-08-31 13:29:06 - pico-train - INFO - └── paloma: inf
2025-08-31 13:29:07 - pico-train - INFO - Step 91000 -- 🔄 Training Metrics
2025-08-31 13:29:07 - pico-train - INFO - ├── Loss: 4.7906
2025-08-31 13:29:07 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:29:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:29:07 - pico-train - INFO - Step 91000 -- 📈 Saving Learning Dynamics
2025-08-31 13:30:20 - pico-train - INFO - Step 91100 -- 🔄 Training Metrics
2025-08-31 13:30:20 - pico-train - INFO - ├── Loss: 4.7907
2025-08-31 13:30:20 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:30:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:31:31 - pico-train - INFO - Step 91200 -- 🔄 Training Metrics
2025-08-31 13:31:31 - pico-train - INFO - ├── Loss: 4.7947
2025-08-31 13:31:31 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:31:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:32:43 - pico-train - INFO - Step 91300 -- 🔄 Training Metrics
2025-08-31 13:32:43 - pico-train - INFO - ├── Loss: 4.7752
2025-08-31 13:32:43 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:32:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:33:54 - pico-train - INFO - Step 91400 -- 🔄 Training Metrics
2025-08-31 13:33:54 - pico-train - INFO - ├── Loss: 4.7782
2025-08-31 13:33:54 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:33:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:35:04 - pico-train - INFO - Step 91500 -- 🔄 Training Metrics
2025-08-31 13:35:04 - pico-train - INFO - ├── Loss: 4.7775
2025-08-31 13:35:04 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:35:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:36:15 - pico-train - INFO - Step 91600 -- 🔄 Training Metrics
2025-08-31 13:36:15 - pico-train - INFO - ├── Loss: 4.7964
2025-08-31 13:36:15 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:36:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:37:26 - pico-train - INFO - Step 91700 -- 🔄 Training Metrics
2025-08-31 13:37:26 - pico-train - INFO - ├── Loss: 4.7612
2025-08-31 13:37:26 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:37:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:38:37 - pico-train - INFO - Step 91800 -- 🔄 Training Metrics
2025-08-31 13:38:37 - pico-train - INFO - ├── Loss: 4.7855
2025-08-31 13:38:37 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:38:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:39:49 - pico-train - INFO - Step 91900 -- 🔄 Training Metrics
2025-08-31 13:39:49 - pico-train - INFO - ├── Loss: 4.7587
2025-08-31 13:39:49 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:39:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:40:59 - pico-train - INFO - Step 92000 -- 💾 Saving Checkpoint
2025-08-31 13:43:01 - pico-train - INFO - Step 92000 -- 📊 Evaluation Results
2025-08-31 13:43:01 - pico-train - INFO - └── paloma: inf
2025-08-31 13:43:01 - pico-train - INFO - Step 92000 -- 🔄 Training Metrics
2025-08-31 13:43:01 - pico-train - INFO - ├── Loss: 4.7725
2025-08-31 13:43:01 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:43:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:43:01 - pico-train - INFO - Step 92000 -- 📈 Saving Learning Dynamics
2025-08-31 13:44:15 - pico-train - INFO - Step 92100 -- 🔄 Training Metrics
2025-08-31 13:44:15 - pico-train - INFO - ├── Loss: 4.7884
2025-08-31 13:44:15 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:44:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:45:27 - pico-train - INFO - Step 92200 -- 🔄 Training Metrics
2025-08-31 13:45:27 - pico-train - INFO - ├── Loss: 4.7677
2025-08-31 13:45:27 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:45:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:46:38 - pico-train - INFO - Step 92300 -- 🔄 Training Metrics
2025-08-31 13:46:38 - pico-train - INFO - ├── Loss: 4.7904
2025-08-31 13:46:38 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:46:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:47:49 - pico-train - INFO - Step 92400 -- 🔄 Training Metrics
2025-08-31 13:47:49 - pico-train - INFO - ├── Loss: 4.7862
2025-08-31 13:47:49 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:47:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:49:00 - pico-train - INFO - Step 92500 -- 🔄 Training Metrics
2025-08-31 13:49:00 - pico-train - INFO - ├── Loss: 4.8080
2025-08-31 13:49:00 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:49:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:50:11 - pico-train - INFO - Step 92600 -- 🔄 Training Metrics
2025-08-31 13:50:11 - pico-train - INFO - ├── Loss: 4.7588
2025-08-31 13:50:11 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:50:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:51:22 - pico-train - INFO - Step 92700 -- 🔄 Training Metrics
2025-08-31 13:51:22 - pico-train - INFO - ├── Loss: 4.8001
2025-08-31 13:51:22 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:51:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:52:33 - pico-train - INFO - Step 92800 -- 🔄 Training Metrics
2025-08-31 13:52:33 - pico-train - INFO - ├── Loss: 4.8003
2025-08-31 13:52:33 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:52:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:53:44 - pico-train - INFO - Step 92900 -- 🔄 Training Metrics
2025-08-31 13:53:44 - pico-train - INFO - ├── Loss: 4.7780
2025-08-31 13:53:44 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:53:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:54:54 - pico-train - INFO - Step 93000 -- 💾 Saving Checkpoint
2025-08-31 13:56:55 - pico-train - INFO - Step 93000 -- 📊 Evaluation Results
2025-08-31 13:56:55 - pico-train - INFO - └── paloma: inf
2025-08-31 13:56:56 - pico-train - INFO - Step 93000 -- 🔄 Training Metrics
2025-08-31 13:56:56 - pico-train - INFO - ├── Loss: 4.7952
2025-08-31 13:56:56 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:56:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:56:56 - pico-train - INFO - Step 93000 -- 📈 Saving Learning Dynamics
2025-08-31 13:58:10 - pico-train - INFO - Step 93100 -- 🔄 Training Metrics
2025-08-31 13:58:10 - pico-train - INFO - ├── Loss: 4.7548
2025-08-31 13:58:10 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:58:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 13:59:22 - pico-train - INFO - Step 93200 -- 🔄 Training Metrics
2025-08-31 13:59:22 - pico-train - INFO - ├── Loss: 4.7780
2025-08-31 13:59:22 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 13:59:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:00:33 - pico-train - INFO - Step 93300 -- 🔄 Training Metrics
2025-08-31 14:00:33 - pico-train - INFO - ├── Loss: 4.7779
2025-08-31 14:00:33 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:00:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:01:43 - pico-train - INFO - Step 93400 -- 🔄 Training Metrics
2025-08-31 14:01:43 - pico-train - INFO - ├── Loss: 4.7857
2025-08-31 14:01:43 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:01:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:02:55 - pico-train - INFO - Step 93500 -- 🔄 Training Metrics
2025-08-31 14:02:55 - pico-train - INFO - ├── Loss: 4.7840
2025-08-31 14:02:55 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:02:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:04:06 - pico-train - INFO - Step 93600 -- 🔄 Training Metrics
2025-08-31 14:04:06 - pico-train - INFO - ├── Loss: 4.7887
2025-08-31 14:04:06 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:04:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:05:18 - pico-train - INFO - Step 93700 -- 🔄 Training Metrics
2025-08-31 14:05:18 - pico-train - INFO - ├── Loss: 4.7694
2025-08-31 14:05:18 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:05:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:06:28 - pico-train - INFO - Step 93800 -- 🔄 Training Metrics
2025-08-31 14:06:28 - pico-train - INFO - ├── Loss: 4.7762
2025-08-31 14:06:28 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:06:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:07:39 - pico-train - INFO - Step 93900 -- 🔄 Training Metrics
2025-08-31 14:07:39 - pico-train - INFO - ├── Loss: 4.7933
2025-08-31 14:07:39 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:07:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:08:50 - pico-train - INFO - Step 94000 -- 💾 Saving Checkpoint
2025-08-31 14:10:52 - pico-train - INFO - Step 94000 -- 📊 Evaluation Results
2025-08-31 14:10:52 - pico-train - INFO - └── paloma: inf
2025-08-31 14:10:53 - pico-train - INFO - Step 94000 -- 🔄 Training Metrics
2025-08-31 14:10:53 - pico-train - INFO - ├── Loss: 4.7827
2025-08-31 14:10:53 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:10:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:10:53 - pico-train - INFO - Step 94000 -- 📈 Saving Learning Dynamics
2025-08-31 14:12:07 - pico-train - INFO - Step 94100 -- 🔄 Training Metrics
2025-08-31 14:12:07 - pico-train - INFO - ├── Loss: 4.7810
2025-08-31 14:12:07 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:12:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:13:18 - pico-train - INFO - Step 94200 -- 🔄 Training Metrics
2025-08-31 14:13:18 - pico-train - INFO - ├── Loss: 4.7890
2025-08-31 14:13:18 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:13:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:14:28 - pico-train - INFO - Step 94300 -- 🔄 Training Metrics
2025-08-31 14:14:28 - pico-train - INFO - ├── Loss: 4.7834
2025-08-31 14:14:28 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:14:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:15:39 - pico-train - INFO - Step 94400 -- 🔄 Training Metrics
2025-08-31 14:15:39 - pico-train - INFO - ├── Loss: 4.7593
2025-08-31 14:15:39 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:15:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:16:51 - pico-train - INFO - Step 94500 -- 🔄 Training Metrics
2025-08-31 14:16:51 - pico-train - INFO - ├── Loss: 4.7865
2025-08-31 14:16:51 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:16:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:18:02 - pico-train - INFO - Step 94600 -- 🔄 Training Metrics
2025-08-31 14:18:02 - pico-train - INFO - ├── Loss: 4.7815
2025-08-31 14:18:02 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:18:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:19:13 - pico-train - INFO - Step 94700 -- 🔄 Training Metrics
2025-08-31 14:19:13 - pico-train - INFO - ├── Loss: 4.7722
2025-08-31 14:19:13 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:19:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:20:24 - pico-train - INFO - Step 94800 -- 🔄 Training Metrics
2025-08-31 14:20:24 - pico-train - INFO - ├── Loss: 4.7647
2025-08-31 14:20:24 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:20:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:21:35 - pico-train - INFO - Step 94900 -- 🔄 Training Metrics
2025-08-31 14:21:35 - pico-train - INFO - ├── Loss: 4.7664
2025-08-31 14:21:35 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:21:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:22:45 - pico-train - INFO - Step 95000 -- 💾 Saving Checkpoint
2025-08-31 14:24:47 - pico-train - INFO - Step 95000 -- 📊 Evaluation Results
2025-08-31 14:24:47 - pico-train - INFO - └── paloma: inf
2025-08-31 14:24:47 - pico-train - INFO - Step 95000 -- 🔄 Training Metrics
2025-08-31 14:24:47 - pico-train - INFO - ├── Loss: 4.7738
2025-08-31 14:24:47 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:24:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:24:47 - pico-train - INFO - Step 95000 -- 📈 Saving Learning Dynamics
2025-08-31 14:26:01 - pico-train - INFO - Step 95100 -- 🔄 Training Metrics
2025-08-31 14:26:01 - pico-train - INFO - ├── Loss: 4.7868
2025-08-31 14:26:01 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:26:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:27:12 - pico-train - INFO - Step 95200 -- 🔄 Training Metrics
2025-08-31 14:27:12 - pico-train - INFO - ├── Loss: 4.7701
2025-08-31 14:27:12 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:27:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:28:23 - pico-train - INFO - Step 95300 -- 🔄 Training Metrics
2025-08-31 14:28:23 - pico-train - INFO - ├── Loss: 4.7908
2025-08-31 14:28:23 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:28:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:29:33 - pico-train - INFO - Step 95400 -- 🔄 Training Metrics
2025-08-31 14:29:33 - pico-train - INFO - ├── Loss: 4.7643
2025-08-31 14:29:33 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:29:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:30:44 - pico-train - INFO - Step 95500 -- 🔄 Training Metrics
2025-08-31 14:30:44 - pico-train - INFO - ├── Loss: 4.8021
2025-08-31 14:30:44 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:30:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:32:19 - pico-train - INFO - Step 95600 -- 🔄 Training Metrics
2025-08-31 14:32:19 - pico-train - INFO - ├── Loss: 4.7821
2025-08-31 14:32:19 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:32:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:34:06 - pico-train - INFO - Step 95700 -- 🔄 Training Metrics
2025-08-31 14:34:06 - pico-train - INFO - ├── Loss: 4.7956
2025-08-31 14:34:06 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:34:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:35:52 - pico-train - INFO - Step 95800 -- 🔄 Training Metrics
2025-08-31 14:35:52 - pico-train - INFO - ├── Loss: 4.7821
2025-08-31 14:35:52 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:35:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:37:39 - pico-train - INFO - Step 95900 -- 🔄 Training Metrics
2025-08-31 14:37:39 - pico-train - INFO - ├── Loss: 4.7720
2025-08-31 14:37:39 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:37:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:39:24 - pico-train - INFO - Step 96000 -- 💾 Saving Checkpoint
2025-08-31 14:41:46 - pico-train - INFO - Step 96000 -- 📊 Evaluation Results
2025-08-31 14:41:46 - pico-train - INFO - └── paloma: inf
2025-08-31 14:41:48 - pico-train - INFO - Step 96000 -- 🔄 Training Metrics
2025-08-31 14:41:48 - pico-train - INFO - ├── Loss: 4.7744
2025-08-31 14:41:48 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:41:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:41:48 - pico-train - INFO - Step 96000 -- 📈 Saving Learning Dynamics
2025-08-31 14:43:01 - pico-train - INFO - Step 96100 -- 🔄 Training Metrics
2025-08-31 14:43:01 - pico-train - INFO - ├── Loss: 4.7927
2025-08-31 14:43:01 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:43:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:44:12 - pico-train - INFO - Step 96200 -- 🔄 Training Metrics
2025-08-31 14:44:12 - pico-train - INFO - ├── Loss: 4.7880
2025-08-31 14:44:12 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:44:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:45:24 - pico-train - INFO - Step 96300 -- 🔄 Training Metrics
2025-08-31 14:45:24 - pico-train - INFO - ├── Loss: 4.7508
2025-08-31 14:45:24 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:45:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:46:35 - pico-train - INFO - Step 96400 -- 🔄 Training Metrics
2025-08-31 14:46:35 - pico-train - INFO - ├── Loss: 4.8135
2025-08-31 14:46:35 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:46:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:47:46 - pico-train - INFO - Step 96500 -- 🔄 Training Metrics
2025-08-31 14:47:46 - pico-train - INFO - ├── Loss: 4.7807
2025-08-31 14:47:46 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:47:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:48:56 - pico-train - INFO - Step 96600 -- 🔄 Training Metrics
2025-08-31 14:48:56 - pico-train - INFO - ├── Loss: 4.7726
2025-08-31 14:48:56 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:48:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:50:08 - pico-train - INFO - Step 96700 -- 🔄 Training Metrics
2025-08-31 14:50:08 - pico-train - INFO - ├── Loss: 4.7978
2025-08-31 14:50:08 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:50:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:51:19 - pico-train - INFO - Step 96800 -- 🔄 Training Metrics
2025-08-31 14:51:19 - pico-train - INFO - ├── Loss: 4.7685
2025-08-31 14:51:19 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:51:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:52:30 - pico-train - INFO - Step 96900 -- 🔄 Training Metrics
2025-08-31 14:52:30 - pico-train - INFO - ├── Loss: 4.7788
2025-08-31 14:52:30 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:52:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:53:40 - pico-train - INFO - Step 97000 -- 💾 Saving Checkpoint
2025-08-31 14:55:42 - pico-train - INFO - Step 97000 -- 📊 Evaluation Results
2025-08-31 14:55:42 - pico-train - INFO - └── paloma: inf
2025-08-31 14:55:43 - pico-train - INFO - Step 97000 -- 🔄 Training Metrics
2025-08-31 14:55:43 - pico-train - INFO - ├── Loss: 4.7607
2025-08-31 14:55:43 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:55:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:55:43 - pico-train - INFO - Step 97000 -- 📈 Saving Learning Dynamics
2025-08-31 14:56:57 - pico-train - INFO - Step 97100 -- 🔄 Training Metrics
2025-08-31 14:56:57 - pico-train - INFO - ├── Loss: 4.7780
2025-08-31 14:56:57 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:56:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:58:08 - pico-train - INFO - Step 97200 -- 🔄 Training Metrics
2025-08-31 14:58:08 - pico-train - INFO - ├── Loss: 4.7450
2025-08-31 14:58:08 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:58:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 14:59:20 - pico-train - INFO - Step 97300 -- 🔄 Training Metrics
2025-08-31 14:59:20 - pico-train - INFO - ├── Loss: 4.7597
2025-08-31 14:59:20 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 14:59:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:00:31 - pico-train - INFO - Step 97400 -- 🔄 Training Metrics
2025-08-31 15:00:31 - pico-train - INFO - ├── Loss: 4.7810
2025-08-31 15:00:31 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:00:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:01:42 - pico-train - INFO - Step 97500 -- 🔄 Training Metrics
2025-08-31 15:01:42 - pico-train - INFO - ├── Loss: 4.7981
2025-08-31 15:01:42 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:01:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:02:53 - pico-train - INFO - Step 97600 -- 🔄 Training Metrics
2025-08-31 15:02:53 - pico-train - INFO - ├── Loss: 4.7707
2025-08-31 15:02:53 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:02:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:04:04 - pico-train - INFO - Step 97700 -- 🔄 Training Metrics
2025-08-31 15:04:04 - pico-train - INFO - ├── Loss: 4.8093
2025-08-31 15:04:04 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:04:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:05:16 - pico-train - INFO - Step 97800 -- 🔄 Training Metrics
2025-08-31 15:05:16 - pico-train - INFO - ├── Loss: 4.7745
2025-08-31 15:05:16 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:05:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:06:26 - pico-train - INFO - Step 97900 -- 🔄 Training Metrics
2025-08-31 15:06:26 - pico-train - INFO - ├── Loss: 4.7663
2025-08-31 15:06:26 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:06:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:07:37 - pico-train - INFO - Step 98000 -- 💾 Saving Checkpoint
2025-08-31 15:09:38 - pico-train - INFO - Step 98000 -- 📊 Evaluation Results
2025-08-31 15:09:38 - pico-train - INFO - └── paloma: inf
2025-08-31 15:09:39 - pico-train - INFO - Step 98000 -- 🔄 Training Metrics
2025-08-31 15:09:39 - pico-train - INFO - ├── Loss: 4.7693
2025-08-31 15:09:39 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:09:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:09:39 - pico-train - INFO - Step 98000 -- 📈 Saving Learning Dynamics
2025-08-31 15:10:53 - pico-train - INFO - Step 98100 -- 🔄 Training Metrics
2025-08-31 15:10:53 - pico-train - INFO - ├── Loss: 4.7886
2025-08-31 15:10:53 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:10:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:12:04 - pico-train - INFO - Step 98200 -- 🔄 Training Metrics
2025-08-31 15:12:04 - pico-train - INFO - ├── Loss: 4.7912
2025-08-31 15:12:04 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:12:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:13:16 - pico-train - INFO - Step 98300 -- 🔄 Training Metrics
2025-08-31 15:13:16 - pico-train - INFO - ├── Loss: 4.7646
2025-08-31 15:13:16 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:13:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:14:26 - pico-train - INFO - Step 98400 -- 🔄 Training Metrics
2025-08-31 15:14:26 - pico-train - INFO - ├── Loss: 4.8104
2025-08-31 15:14:26 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:14:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:15:37 - pico-train - INFO - Step 98500 -- 🔄 Training Metrics
2025-08-31 15:15:37 - pico-train - INFO - ├── Loss: 4.7713
2025-08-31 15:15:37 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:15:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:16:49 - pico-train - INFO - Step 98600 -- 🔄 Training Metrics
2025-08-31 15:16:49 - pico-train - INFO - ├── Loss: 4.8066
2025-08-31 15:16:49 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:16:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:18:00 - pico-train - INFO - Step 98700 -- 🔄 Training Metrics
2025-08-31 15:18:00 - pico-train - INFO - ├── Loss: 4.7834
2025-08-31 15:18:00 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:18:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:19:11 - pico-train - INFO - Step 98800 -- 🔄 Training Metrics
2025-08-31 15:19:11 - pico-train - INFO - ├── Loss: 4.7803
2025-08-31 15:19:11 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:19:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:20:22 - pico-train - INFO - Step 98900 -- 🔄 Training Metrics
2025-08-31 15:20:22 - pico-train - INFO - ├── Loss: 4.7487
2025-08-31 15:20:22 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:20:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:21:33 - pico-train - INFO - Step 99000 -- 💾 Saving Checkpoint
2025-08-31 15:23:34 - pico-train - INFO - Step 99000 -- 📊 Evaluation Results
2025-08-31 15:23:34 - pico-train - INFO - └── paloma: inf
2025-08-31 15:23:35 - pico-train - INFO - Step 99000 -- 🔄 Training Metrics
2025-08-31 15:23:35 - pico-train - INFO - ├── Loss: 4.7896
2025-08-31 15:23:35 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:23:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:23:35 - pico-train - INFO - Step 99000 -- 📈 Saving Learning Dynamics
2025-08-31 15:24:49 - pico-train - INFO - Step 99100 -- 🔄 Training Metrics
2025-08-31 15:24:49 - pico-train - INFO - ├── Loss: 4.7683
2025-08-31 15:24:49 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:24:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:26:00 - pico-train - INFO - Step 99200 -- 🔄 Training Metrics
2025-08-31 15:26:00 - pico-train - INFO - ├── Loss: 4.7706
2025-08-31 15:26:00 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:26:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:27:11 - pico-train - INFO - Step 99300 -- 🔄 Training Metrics
2025-08-31 15:27:11 - pico-train - INFO - ├── Loss: 4.7859
2025-08-31 15:27:11 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:27:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:28:22 - pico-train - INFO - Step 99400 -- 🔄 Training Metrics
2025-08-31 15:28:22 - pico-train - INFO - ├── Loss: 4.7737
2025-08-31 15:28:22 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:28:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:29:34 - pico-train - INFO - Step 99500 -- 🔄 Training Metrics
2025-08-31 15:29:34 - pico-train - INFO - ├── Loss: 4.7550
2025-08-31 15:29:34 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:29:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:30:45 - pico-train - INFO - Step 99600 -- 🔄 Training Metrics
2025-08-31 15:30:45 - pico-train - INFO - ├── Loss: 4.7528
2025-08-31 15:30:45 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:30:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:31:57 - pico-train - INFO - Step 99700 -- 🔄 Training Metrics
2025-08-31 15:31:57 - pico-train - INFO - ├── Loss: 4.7695
2025-08-31 15:31:57 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:31:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:33:08 - pico-train - INFO - Step 99800 -- 🔄 Training Metrics
2025-08-31 15:33:08 - pico-train - INFO - ├── Loss: 4.7800
2025-08-31 15:33:08 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:33:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:34:20 - pico-train - INFO - Step 99900 -- 🔄 Training Metrics
2025-08-31 15:34:20 - pico-train - INFO - ├── Loss: 4.7969
2025-08-31 15:34:20 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-31 15:34:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-31 15:35:30 - pico-train - INFO - Step 100000 -- 💾 Saving Checkpoint
2025-08-31 15:37:32 - pico-train - INFO - Step 100000 -- 📊 Evaluation Results
2025-08-31 15:37:32 - pico-train - INFO - └── paloma: inf
2025-08-31 15:37:32 - pico-train - INFO - 🎉 Training complete! Final step: 100000