2025-08-30 04:44:44 - pico-train - INFO - Step 40000 -- 📊 Evaluation Results
2025-08-30 04:44:44 - pico-train - INFO - └── paloma: 7.314096757540847e+26
2025-08-30 04:44:44 - pico-train - INFO - ==================================================
2025-08-30 04:44:44 - pico-train - INFO - ✨ Training Configuration
2025-08-30 04:44:44 - pico-train - INFO - ==================================================
2025-08-30 04:44:44 - pico-train - INFO - ╭─────────────────────────────────────────────────────╮
2025-08-30 04:44:44 - pico-train - INFO - │ checkpointing:                                      │
2025-08-30 04:44:44 - pico-train - INFO - │   checkpoints_dir: checkpoints                      │
2025-08-30 04:44:44 - pico-train - INFO - │   evaluation:                                       │
2025-08-30 04:44:44 - pico-train - INFO - │     eval_results_dir: eval_results                  │
2025-08-30 04:44:44 - pico-train - INFO - │   fabric_checkpoint_dir: fabric_state               │
2025-08-30 04:44:44 - pico-train - INFO - │   fabric_checkpoint_filename: checkpoint.pt         │
2025-08-30 04:44:44 - pico-train - INFO - │   hf_checkpoint:                                    │
2025-08-30 04:44:44 - pico-train - INFO - │     collection_slug: null                           │
2025-08-30 04:44:44 - pico-train - INFO - │     repo_id: ThomasTheMaker/pico-decoder-tiny       │
2025-08-30 04:44:44 - pico-train - INFO - │   learning_dynamics:                                │
2025-08-30 04:44:44 - pico-train - INFO - │     batch_size: 1                                   │
2025-08-30 04:44:44 - pico-train - INFO - │     eval_data: null                                 │
2025-08-30 04:44:44 - pico-train - INFO - │     layer_suffixes:                                 │
2025-08-30 04:44:44 - pico-train - INFO - │     - attention.v_proj                              │
2025-08-30 04:44:44 - pico-train - INFO - │     - attention.o_proj                              │
2025-08-30 04:44:44 - pico-train - INFO - │     - swiglu.w_2                                    │
2025-08-30 04:44:44 - pico-train - INFO - │     sequence_idx: -1                                │
2025-08-30 04:44:44 - pico-train - INFO - │   learning_dynamics_dir: learning_dynamics          │
2025-08-30 04:44:44 - pico-train - INFO - │   logs_dir: logs                                    │
2025-08-30 04:44:44 - pico-train - INFO - │   run_name: pico-decoder-tiny-dolma5M-v1            │
2025-08-30 04:44:44 - pico-train - INFO - │   runs_dir: runs                                    │
2025-08-30 04:44:44 - pico-train - INFO - │   save_every_n_steps: 500                           │
2025-08-30 04:44:44 - pico-train - INFO - │   save_to_hf: true                                  │
2025-08-30 04:44:44 - pico-train - INFO - │   training:                                         │
2025-08-30 04:44:44 - pico-train - INFO - │     auto_resume: true                               │
2025-08-30 04:44:44 - pico-train - INFO - │ data:                                               │
2025-08-30 04:44:44 - pico-train - INFO - │   dataloader:                                       │
2025-08-30 04:44:44 - pico-train - INFO - │     batch_size: 4                                   │
2025-08-30 04:44:44 - pico-train - INFO - │   dataset:                                          │
2025-08-30 04:44:44 - pico-train - INFO - │     name: ThomasTheMaker/pretokenized-dolma-5M      │
2025-08-30 04:44:44 - pico-train - INFO - │   tokenizer:                                        │
2025-08-30 04:44:44 - pico-train - INFO - │     name: allenai/OLMo-7B-0724-hf                   │
2025-08-30 04:44:44 - pico-train - INFO - │     vocab_size: 50304                               │
2025-08-30 04:44:44 - pico-train - INFO - │ evaluation:                                         │
2025-08-30 04:44:44 - pico-train - INFO - │   metrics:                                          │
2025-08-30 04:44:44 - pico-train - INFO - │   - paloma                                          │
2025-08-30 04:44:44 - pico-train - INFO - │   paloma:                                           │
2025-08-30 04:44:44 - pico-train - INFO - │     batch_size: 1                                   │
2025-08-30 04:44:44 - pico-train - INFO - │     dataset_name: pico-lm/pretokenized-paloma-tinsy │
2025-08-30 04:44:44 - pico-train - INFO - │     dataset_split: val                              │
2025-08-30 04:44:44 - pico-train - INFO - │     max_length: 2048                                │
2025-08-30 04:44:44 - pico-train - INFO - │ model:                                              │
2025-08-30 04:44:44 - pico-train - INFO - │   activation_hidden_dim: 384                        │
2025-08-30 04:44:44 - pico-train - INFO - │   attention_n_heads: 12                             │
2025-08-30 04:44:44 - pico-train - INFO - │   attention_n_kv_heads: 4                           │
2025-08-30 04:44:44 - pico-train - INFO - │   batch_size: 1024                                  │
2025-08-30 04:44:44 - pico-train - INFO - │   d_model: 96                                       │
2025-08-30 04:44:44 - pico-train - INFO - │   max_seq_len: 2048                                 │
2025-08-30 04:44:44 - pico-train - INFO - │   model_type: pico_decoder                          │
2025-08-30 04:44:44 - pico-train - INFO - │   n_layers: 12                                      │
2025-08-30 04:44:44 - pico-train - INFO - │   norm_eps: 1.0e-06                                 │
2025-08-30 04:44:44 - pico-train - INFO - │   position_emb_theta: 10000.0                       │
2025-08-30 04:44:44 - pico-train - INFO - │   vocab_size: 50304                                 │
2025-08-30 04:44:44 - pico-train - INFO - │ monitoring:                                         │
2025-08-30 04:44:44 - pico-train - INFO - │   logging:                                          │
2025-08-30 04:44:44 - pico-train - INFO - │     log_every_n_steps: 25                           │
2025-08-30 04:44:44 - pico-train - INFO - │     log_level: INFO                                 │
2025-08-30 04:44:44 - pico-train - INFO - │   save_to_wandb: false                              │
2025-08-30 04:44:44 - pico-train - INFO - │   wandb:                                            │
2025-08-30 04:44:44 - pico-train - INFO - │     entity: boymyc                                  │
2025-08-30 04:44:44 - pico-train - INFO - │     project: pico-decoder-tiny                      │
2025-08-30 04:44:44 - pico-train - INFO - │ training:                                           │
2025-08-30 04:44:44 - pico-train - INFO - │   fabric:                                           │
2025-08-30 04:44:44 - pico-train - INFO - │     accelerator: cuda                               │
2025-08-30 04:44:44 - pico-train - INFO - │     num_devices: 1                                  │
2025-08-30 04:44:44 - pico-train - INFO - │     num_nodes: 1                                    │
2025-08-30 04:44:44 - pico-train - INFO - │     precision: bf16-mixed                           │
2025-08-30 04:44:44 - pico-train - INFO - │   max_steps: 20000                                  │
2025-08-30 04:44:44 - pico-train - INFO - │   optimization:                                     │
2025-08-30 04:44:44 - pico-train - INFO - │     gradient_accumulation_steps: 4                  │
2025-08-30 04:44:44 - pico-train - INFO - │     lr: 5.0e-05                                     │
2025-08-30 04:44:44 - pico-train - INFO - │     lr_scheduler: cosine                            │
2025-08-30 04:44:44 - pico-train - INFO - │     lr_warmup_steps: 8000                           │
2025-08-30 04:44:44 - pico-train - INFO - │     optimizer: adamw                                │
2025-08-30 04:44:44 - pico-train - INFO - │                                                     │
2025-08-30 04:44:44 - pico-train - INFO - ╰─────────────────────────────────────────────────────╯
2025-08-30 04:44:44 - pico-train - INFO - ==================================================
2025-08-30 04:44:44 - pico-train - INFO - ⛭ Runtime Summary:
2025-08-30 04:44:44 - pico-train - INFO - ==================================================
2025-08-30 04:44:44 - pico-train - INFO - Starting from step: 40000
2025-08-30 04:44:44 - pico-train - INFO - Model Setup:
2025-08-30 04:44:44 - pico-train - INFO - └─ Total Parameters: 11,282,784
2025-08-30 04:44:44 - pico-train - INFO - └─ Trainable Parameters: 11,282,784
2025-08-30 04:44:44 - pico-train - INFO - Distributed Setup:
2025-08-30 04:44:44 - pico-train - INFO - └─ Number of Devices: 1
2025-08-30 04:44:44 - pico-train - INFO - └─ Device Type: NVIDIA GeForce RTX 5090
2025-08-30 04:44:44 - pico-train - INFO - └─ Available Memory: 33.68 GB
2025-08-30 04:44:44 - pico-train - INFO - Software Setup:
2025-08-30 04:44:44 - pico-train - INFO - └─ Python Version: 3.10.12
2025-08-30 04:44:44 - pico-train - INFO - └─ PyTorch Version: 2.8.0+cu128
2025-08-30 04:44:44 - pico-train - INFO - └─ CUDA Version: 12.8
2025-08-30 04:44:44 - pico-train - INFO - └─ Operating System: Linux 6.8.0-63-generic
2025-08-30 04:44:44 - pico-train - INFO - Batch Size Configuration:
2025-08-30 04:44:44 - pico-train - INFO - └─ Global Batch Size: 4
2025-08-30 04:44:44 - pico-train - INFO - └─ Per Device Batch Size: 1
2025-08-30 04:44:44 - pico-train - INFO - └─ Gradient Accumulation Steps: 4
2025-08-30 04:44:44 - pico-train - INFO - ==================================================
2025-08-30 04:44:45 - pico-train - INFO - Step 40000 -- 🔄 Training Metrics
2025-08-30 04:44:45 - pico-train - INFO - ├── Loss: 6.3052
2025-08-30 04:44:45 - pico-train - INFO - ├── Learning Rate: 5.00e-06
2025-08-30 04:44:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:44:45 - pico-train - INFO - Step 40000 -- 📈 Saving Learning Dynamics
2025-08-30 04:45:06 - pico-train - INFO - Step 40025 -- 🔄 Training Metrics
2025-08-30 04:45:06 - pico-train - INFO - ├── Loss: 6.1689
2025-08-30 04:45:06 - pico-train - INFO - ├── Learning Rate: 3.65e-05
2025-08-30 04:45:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:45:22 - pico-train - INFO - Step 40050 -- 🔄 Training Metrics
2025-08-30 04:45:22 - pico-train - INFO - ├── Loss: 6.1212
2025-08-30 04:45:22 - pico-train - INFO - ├── Learning Rate: 3.65e-05
2025-08-30 04:45:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:45:39 - pico-train - INFO - Step 40075 -- 🔄 Training Metrics
2025-08-30 04:45:39 - pico-train - INFO - ├── Loss: 6.0189
2025-08-30 04:45:39 - pico-train - INFO - ├── Learning Rate: 3.64e-05
2025-08-30 04:45:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:45:55 - pico-train - INFO - Step 40100 -- 🔄 Training Metrics
2025-08-30 04:45:55 - pico-train - INFO - ├── Loss: 6.1347
2025-08-30 04:45:55 - pico-train - INFO - ├── Learning Rate: 3.64e-05
2025-08-30 04:45:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:46:12 - pico-train - INFO - Step 40125 -- 🔄 Training Metrics
2025-08-30 04:46:12 - pico-train - INFO - ├── Loss: 6.1791
2025-08-30 04:46:12 - pico-train - INFO - ├── Learning Rate: 3.64e-05
2025-08-30 04:46:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:46:28 - pico-train - INFO - Step 40150 -- 🔄 Training Metrics
2025-08-30 04:46:28 - pico-train - INFO - ├── Loss: 6.1368
2025-08-30 04:46:28 - pico-train - INFO - ├── Learning Rate: 3.64e-05
2025-08-30 04:46:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:46:44 - pico-train - INFO - Step 40175 -- 🔄 Training Metrics
2025-08-30 04:46:44 - pico-train - INFO - ├── Loss: 6.1443
2025-08-30 04:46:44 - pico-train - INFO - ├── Learning Rate: 3.64e-05
2025-08-30 04:46:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:47:01 - pico-train - INFO - Step 40200 -- 🔄 Training Metrics
2025-08-30 04:47:01 - pico-train - INFO - ├── Loss: 6.1815
2025-08-30 04:47:01 - pico-train - INFO - ├── Learning Rate: 3.63e-05
2025-08-30 04:47:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:47:17 - pico-train - INFO - Step 40225 -- 🔄 Training Metrics
2025-08-30 04:47:17 - pico-train - INFO - ├── Loss: 6.1685
2025-08-30 04:47:17 - pico-train - INFO - ├── Learning Rate: 3.63e-05
2025-08-30 04:47:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:47:34 - pico-train - INFO - Step 40250 -- 🔄 Training Metrics
2025-08-30 04:47:34 - pico-train - INFO - ├── Loss: 6.0835
2025-08-30 04:47:34 - pico-train - INFO - ├── Learning Rate: 3.63e-05
2025-08-30 04:47:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:47:50 - pico-train - INFO - Step 40275 -- 🔄 Training Metrics
2025-08-30 04:47:50 - pico-train - INFO - ├── Loss: 6.0785
2025-08-30 04:47:50 - pico-train - INFO - ├── Learning Rate: 3.63e-05
2025-08-30 04:47:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:48:07 - pico-train - INFO - Step 40300 -- 🔄 Training Metrics
2025-08-30 04:48:07 - pico-train - INFO - ├── Loss: 6.0537
2025-08-30 04:48:07 - pico-train - INFO - ├── Learning Rate: 3.63e-05
2025-08-30 04:48:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:48:23 - pico-train - INFO - Step 40325 -- 🔄 Training Metrics
2025-08-30 04:48:23 - pico-train - INFO - ├── Loss: 6.0608
2025-08-30 04:48:23 - pico-train - INFO - ├── Learning Rate: 3.63e-05
2025-08-30 04:48:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:48:36 - pico-train - INFO - Step 40350 -- 🔄 Training Metrics
2025-08-30 04:48:36 - pico-train - INFO - ├── Loss: 6.1696
2025-08-30 04:48:36 - pico-train - INFO - ├── Learning Rate: 3.62e-05
2025-08-30 04:48:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:48:49 - pico-train - INFO - Step 40375 -- 🔄 Training Metrics
2025-08-30 04:48:49 - pico-train - INFO - ├── Loss: 6.1070
2025-08-30 04:48:49 - pico-train - INFO - ├── Learning Rate: 3.62e-05
2025-08-30 04:48:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:49:02 - pico-train - INFO - Step 40400 -- 🔄 Training Metrics
2025-08-30 04:49:02 - pico-train - INFO - ├── Loss: 6.0783
2025-08-30 04:49:02 - pico-train - INFO - ├── Learning Rate: 3.62e-05
2025-08-30 04:49:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:49:14 - pico-train - INFO - Step 40425 -- 🔄 Training Metrics
2025-08-30 04:49:14 - pico-train - INFO - ├── Loss: 6.2326
2025-08-30 04:49:14 - pico-train - INFO - ├── Learning Rate: 3.62e-05
2025-08-30 04:49:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:49:27 - pico-train - INFO - Step 40450 -- 🔄 Training Metrics
2025-08-30 04:49:27 - pico-train - INFO - ├── Loss: 6.0715
2025-08-30 04:49:27 - pico-train - INFO - ├── Learning Rate: 3.62e-05
2025-08-30 04:49:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:49:39 - pico-train - INFO - Step 40475 -- 🔄 Training Metrics
2025-08-30 04:49:39 - pico-train - INFO - ├── Loss: 6.1857
2025-08-30 04:49:39 - pico-train - INFO - ├── Learning Rate: 3.61e-05
2025-08-30 04:49:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:49:51 - pico-train - INFO - Step 40500 -- 💾 Saving Checkpoint
2025-08-30 04:51:47 - pico-train - INFO - Step 40500 -- 📊 Evaluation Results
2025-08-30 04:51:47 - pico-train - INFO - └── paloma: 1.2201991301470252e+27
2025-08-30 04:51:50 - pico-train - INFO - Step 40500 -- 🔄 Training Metrics
2025-08-30 04:51:50 - pico-train - INFO - ├── Loss: 6.1294
2025-08-30 04:51:50 - pico-train - INFO - ├── Learning Rate: 3.61e-05
2025-08-30 04:51:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:51:50 - pico-train - INFO - Step 40500 -- 📈 Saving Learning Dynamics
2025-08-30 04:52:05 - pico-train - INFO - Step 40525 -- 🔄 Training Metrics
2025-08-30 04:52:05 - pico-train - INFO - ├── Loss: 6.1508
2025-08-30 04:52:05 - pico-train - INFO - ├── Learning Rate: 3.61e-05
2025-08-30 04:52:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:52:18 - pico-train - INFO - Step 40550 -- 🔄 Training Metrics
2025-08-30 04:52:18 - pico-train - INFO - ├── Loss: 6.1130
2025-08-30 04:52:18 - pico-train - INFO - ├── Learning Rate: 3.61e-05
2025-08-30 04:52:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:52:30 - pico-train - INFO - Step 40575 -- 🔄 Training Metrics
2025-08-30 04:52:30 - pico-train - INFO - ├── Loss: 6.1631
2025-08-30 04:52:30 - pico-train - INFO - ├── Learning Rate: 3.61e-05
2025-08-30 04:52:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:52:43 - pico-train - INFO - Step 40600 -- 🔄 Training Metrics
2025-08-30 04:52:43 - pico-train - INFO - ├── Loss: 6.2337
2025-08-30 04:52:43 - pico-train - INFO - ├── Learning Rate: 3.60e-05
2025-08-30 04:52:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:52:55 - pico-train - INFO - Step 40625 -- 🔄 Training Metrics
2025-08-30 04:52:55 - pico-train - INFO - ├── Loss: 6.0858
2025-08-30 04:52:55 - pico-train - INFO - ├── Learning Rate: 3.60e-05
2025-08-30 04:52:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:53:08 - pico-train - INFO - Step 40650 -- 🔄 Training Metrics
2025-08-30 04:53:08 - pico-train - INFO - ├── Loss: 6.1727
2025-08-30 04:53:08 - pico-train - INFO - ├── Learning Rate: 3.60e-05
2025-08-30 04:53:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:53:21 - pico-train - INFO - Step 40675 -- 🔄 Training Metrics
2025-08-30 04:53:21 - pico-train - INFO - ├── Loss: 6.1629
2025-08-30 04:53:21 - pico-train - INFO - ├── Learning Rate: 3.60e-05
2025-08-30 04:53:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:53:33 - pico-train - INFO - Step 40700 -- 🔄 Training Metrics
2025-08-30 04:53:33 - pico-train - INFO - ├── Loss: 6.1451
2025-08-30 04:53:33 - pico-train - INFO - ├── Learning Rate: 3.60e-05
2025-08-30 04:53:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:53:46 - pico-train - INFO - Step 40725 -- 🔄 Training Metrics
2025-08-30 04:53:46 - pico-train - INFO - ├── Loss: 6.1482
2025-08-30 04:53:46 - pico-train - INFO - ├── Learning Rate: 3.59e-05
2025-08-30 04:53:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:53:58 - pico-train - INFO - Step 40750 -- 🔄 Training Metrics
2025-08-30 04:53:58 - pico-train - INFO - ├── Loss: 6.0939
2025-08-30 04:53:58 - pico-train - INFO - ├── Learning Rate: 3.59e-05
2025-08-30 04:53:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:54:11 - pico-train - INFO - Step 40775 -- 🔄 Training Metrics
2025-08-30 04:54:11 - pico-train - INFO - ├── Loss: 6.1594
2025-08-30 04:54:11 - pico-train - INFO - ├── Learning Rate: 3.59e-05
2025-08-30 04:54:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:54:23 - pico-train - INFO - Step 40800 -- 🔄 Training Metrics
2025-08-30 04:54:23 - pico-train - INFO - ├── Loss: 6.1450
2025-08-30 04:54:23 - pico-train - INFO - ├── Learning Rate: 3.59e-05
2025-08-30 04:54:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:54:36 - pico-train - INFO - Step 40825 -- 🔄 Training Metrics
2025-08-30 04:54:36 - pico-train - INFO - ├── Loss: 6.0952
2025-08-30 04:54:36 - pico-train - INFO - ├── Learning Rate: 3.59e-05
2025-08-30 04:54:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:54:48 - pico-train - INFO - Step 40850 -- 🔄 Training Metrics
2025-08-30 04:54:48 - pico-train - INFO - ├── Loss: 6.1180
2025-08-30 04:54:48 - pico-train - INFO - ├── Learning Rate: 3.59e-05
2025-08-30 04:54:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:55:01 - pico-train - INFO - Step 40875 -- 🔄 Training Metrics
2025-08-30 04:55:01 - pico-train - INFO - ├── Loss: 6.0993
2025-08-30 04:55:01 - pico-train - INFO - ├── Learning Rate: 3.58e-05
2025-08-30 04:55:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:55:13 - pico-train - INFO - Step 40900 -- 🔄 Training Metrics
2025-08-30 04:55:13 - pico-train - INFO - ├── Loss: 6.0885
2025-08-30 04:55:13 - pico-train - INFO - ├── Learning Rate: 3.58e-05
2025-08-30 04:55:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:55:26 - pico-train - INFO - Step 40925 -- 🔄 Training Metrics
2025-08-30 04:55:26 - pico-train - INFO - ├── Loss: 6.0793
2025-08-30 04:55:26 - pico-train - INFO - ├── Learning Rate: 3.58e-05
2025-08-30 04:55:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:55:39 - pico-train - INFO - Step 40950 -- 🔄 Training Metrics
2025-08-30 04:55:39 - pico-train - INFO - ├── Loss: 6.1996
2025-08-30 04:55:39 - pico-train - INFO - ├── Learning Rate: 3.58e-05
2025-08-30 04:55:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:55:51 - pico-train - INFO - Step 40975 -- 🔄 Training Metrics
2025-08-30 04:55:51 - pico-train - INFO - ├── Loss: 6.1833
2025-08-30 04:55:51 - pico-train - INFO - ├── Learning Rate: 3.58e-05
2025-08-30 04:55:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:56:03 - pico-train - INFO - Step 41000 -- 💾 Saving Checkpoint
2025-08-30 04:58:02 - pico-train - INFO - Step 41000 -- 📊 Evaluation Results
2025-08-30 04:58:02 - pico-train - INFO - └── paloma: 1.2786105287360795e+27
2025-08-30 04:58:05 - pico-train - INFO - Step 41000 -- 🔄 Training Metrics
2025-08-30 04:58:05 - pico-train - INFO - ├── Loss: 6.0609
2025-08-30 04:58:05 - pico-train - INFO - ├── Learning Rate: 3.57e-05
2025-08-30 04:58:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:58:05 - pico-train - INFO - Step 41000 -- 📈 Saving Learning Dynamics
2025-08-30 04:58:20 - pico-train - INFO - Step 41025 -- 🔄 Training Metrics
2025-08-30 04:58:20 - pico-train - INFO - ├── Loss: 6.0776
2025-08-30 04:58:20 - pico-train - INFO - ├── Learning Rate: 3.57e-05
2025-08-30 04:58:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:58:32 - pico-train - INFO - Step 41050 -- 🔄 Training Metrics
2025-08-30 04:58:32 - pico-train - INFO - ├── Loss: 6.0842
2025-08-30 04:58:32 - pico-train - INFO - ├── Learning Rate: 3.57e-05
2025-08-30 04:58:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:58:45 - pico-train - INFO - Step 41075 -- 🔄 Training Metrics
2025-08-30 04:58:45 - pico-train - INFO - ├── Loss: 6.0750
2025-08-30 04:58:45 - pico-train - INFO - ├── Learning Rate: 3.57e-05
2025-08-30 04:58:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:58:57 - pico-train - INFO - Step 41100 -- 🔄 Training Metrics
2025-08-30 04:58:57 - pico-train - INFO - ├── Loss: 6.1881
2025-08-30 04:58:57 - pico-train - INFO - ├── Learning Rate: 3.57e-05
2025-08-30 04:58:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:59:10 - pico-train - INFO - Step 41125 -- 🔄 Training Metrics
2025-08-30 04:59:10 - pico-train - INFO - ├── Loss: 6.1206
2025-08-30 04:59:10 - pico-train - INFO - ├── Learning Rate: 3.56e-05
2025-08-30 04:59:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:59:23 - pico-train - INFO - Step 41150 -- 🔄 Training Metrics
2025-08-30 04:59:23 - pico-train - INFO - ├── Loss: 6.0181
2025-08-30 04:59:23 - pico-train - INFO - ├── Learning Rate: 3.56e-05
2025-08-30 04:59:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:59:35 - pico-train - INFO - Step 41175 -- 🔄 Training Metrics
2025-08-30 04:59:35 - pico-train - INFO - ├── Loss: 6.2113
2025-08-30 04:59:35 - pico-train - INFO - ├── Learning Rate: 3.56e-05
2025-08-30 04:59:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 04:59:48 - pico-train - INFO - Step 41200 -- 🔄 Training Metrics
2025-08-30 04:59:48 - pico-train - INFO - ├── Loss: 6.1853
2025-08-30 04:59:48 - pico-train - INFO - ├── Learning Rate: 3.56e-05
2025-08-30 04:59:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:00:00 - pico-train - INFO - Step 41225 -- 🔄 Training Metrics
2025-08-30 05:00:00 - pico-train - INFO - ├── Loss: 6.0819
2025-08-30 05:00:00 - pico-train - INFO - ├── Learning Rate: 3.56e-05
2025-08-30 05:00:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:00:13 - pico-train - INFO - Step 41250 -- 🔄 Training Metrics
2025-08-30 05:00:13 - pico-train - INFO - ├── Loss: 6.0575
2025-08-30 05:00:13 - pico-train - INFO - ├── Learning Rate: 3.55e-05
2025-08-30 05:00:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:00:25 - pico-train - INFO - Step 41275 -- 🔄 Training Metrics
2025-08-30 05:00:25 - pico-train - INFO - ├── Loss: 6.0731
2025-08-30 05:00:25 - pico-train - INFO - ├── Learning Rate: 3.55e-05
2025-08-30 05:00:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:00:38 - pico-train - INFO - Step 41300 -- 🔄 Training Metrics
2025-08-30 05:00:38 - pico-train - INFO - ├── Loss: 6.0200
2025-08-30 05:00:38 - pico-train - INFO - ├── Learning Rate: 3.55e-05
2025-08-30 05:00:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:00:50 - pico-train - INFO - Step 41325 -- 🔄 Training Metrics
2025-08-30 05:00:50 - pico-train - INFO - ├── Loss: 6.0379
2025-08-30 05:00:50 - pico-train - INFO - ├── Learning Rate: 3.55e-05
2025-08-30 05:00:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:01:03 - pico-train - INFO - Step 41350 -- 🔄 Training Metrics
2025-08-30 05:01:03 - pico-train - INFO - ├── Loss: 6.0660
2025-08-30 05:01:03 - pico-train - INFO - ├── Learning Rate: 3.55e-05
2025-08-30 05:01:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:01:15 - pico-train - INFO - Step 41375 -- 🔄 Training Metrics
2025-08-30 05:01:15 - pico-train - INFO - ├── Loss: 6.1597
2025-08-30 05:01:15 - pico-train - INFO - ├── Learning Rate: 3.54e-05
2025-08-30 05:01:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:01:28 - pico-train - INFO - Step 41400 -- 🔄 Training Metrics
2025-08-30 05:01:28 - pico-train - INFO - ├── Loss: 6.0449
2025-08-30 05:01:28 - pico-train - INFO - ├── Learning Rate: 3.54e-05
2025-08-30 05:01:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:01:41 - pico-train - INFO - Step 41425 -- 🔄 Training Metrics
2025-08-30 05:01:41 - pico-train - INFO - ├── Loss: 6.1370
2025-08-30 05:01:41 - pico-train - INFO - ├── Learning Rate: 3.54e-05
2025-08-30 05:01:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:01:53 - pico-train - INFO - Step 41450 -- 🔄 Training Metrics
2025-08-30 05:01:53 - pico-train - INFO - ├── Loss: 6.1647
2025-08-30 05:01:53 - pico-train - INFO - ├── Learning Rate: 3.54e-05
2025-08-30 05:01:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:02:06 - pico-train - INFO - Step 41475 -- 🔄 Training Metrics
2025-08-30 05:02:06 - pico-train - INFO - ├── Loss: 6.0793
2025-08-30 05:02:06 - pico-train - INFO - ├── Learning Rate: 3.54e-05
2025-08-30 05:02:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:02:18 - pico-train - INFO - Step 41500 -- 💾 Saving Checkpoint
2025-08-30 05:04:19 - pico-train - INFO - Step 41500 -- 📊 Evaluation Results
2025-08-30 05:04:19 - pico-train - INFO - └── paloma: 2.062057669347938e+27
2025-08-30 05:04:23 - pico-train - INFO - Step 41500 -- 🔄 Training Metrics
2025-08-30 05:04:23 - pico-train - INFO - ├── Loss: 6.0860
2025-08-30 05:04:23 - pico-train - INFO - ├── Learning Rate: 3.54e-05
2025-08-30 05:04:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:04:23 - pico-train - INFO - Step 41500 -- 📈 Saving Learning Dynamics
2025-08-30 05:04:39 - pico-train - INFO - Step 41525 -- 🔄 Training Metrics
2025-08-30 05:04:39 - pico-train - INFO - ├── Loss: 6.0604
2025-08-30 05:04:39 - pico-train - INFO - ├── Learning Rate: 3.53e-05
2025-08-30 05:04:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:04:51 - pico-train - INFO - Step 41550 -- 🔄 Training Metrics
2025-08-30 05:04:51 - pico-train - INFO - ├── Loss: 6.0622
2025-08-30 05:04:51 - pico-train - INFO - ├── Learning Rate: 3.53e-05
2025-08-30 05:04:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:05:04 - pico-train - INFO - Step 41575 -- 🔄 Training Metrics
2025-08-30 05:05:04 - pico-train - INFO - ├── Loss: 6.0831
2025-08-30 05:05:04 - pico-train - INFO - ├── Learning Rate: 3.53e-05
2025-08-30 05:05:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:05:16 - pico-train - INFO - Step 41600 -- 🔄 Training Metrics
2025-08-30 05:05:16 - pico-train - INFO - ├── Loss: 6.0853
2025-08-30 05:05:16 - pico-train - INFO - ├── Learning Rate: 3.53e-05
2025-08-30 05:05:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:05:29 - pico-train - INFO - Step 41625 -- 🔄 Training Metrics
2025-08-30 05:05:29 - pico-train - INFO - ├── Loss: 6.0860
2025-08-30 05:05:29 - pico-train - INFO - ├── Learning Rate: 3.53e-05
2025-08-30 05:05:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:05:41 - pico-train - INFO - Step 41650 -- 🔄 Training Metrics
2025-08-30 05:05:41 - pico-train - INFO - ├── Loss: 6.0905
2025-08-30 05:05:41 - pico-train - INFO - ├── Learning Rate: 3.52e-05
2025-08-30 05:05:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:05:54 - pico-train - INFO - Step 41675 -- 🔄 Training Metrics
2025-08-30 05:05:54 - pico-train - INFO - ├── Loss: 6.0475
2025-08-30 05:05:54 - pico-train - INFO - ├── Learning Rate: 3.52e-05
2025-08-30 05:05:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:06:07 - pico-train - INFO - Step 41700 -- 🔄 Training Metrics
2025-08-30 05:06:07 - pico-train - INFO - ├── Loss: 6.1168
2025-08-30 05:06:07 - pico-train - INFO - ├── Learning Rate: 3.52e-05
2025-08-30 05:06:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:06:19 - pico-train - INFO - Step 41725 -- 🔄 Training Metrics
2025-08-30 05:06:19 - pico-train - INFO - ├── Loss: 6.1310
2025-08-30 05:06:19 - pico-train - INFO - ├── Learning Rate: 3.52e-05
2025-08-30 05:06:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:06:32 - pico-train - INFO - Step 41750 -- 🔄 Training Metrics
2025-08-30 05:06:32 - pico-train - INFO - ├── Loss: 6.0966
2025-08-30 05:06:32 - pico-train - INFO - ├── Learning Rate: 3.52e-05
2025-08-30 05:06:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:06:44 - pico-train - INFO - Step 41775 -- 🔄 Training Metrics
2025-08-30 05:06:44 - pico-train - INFO - ├── Loss: 6.1002
2025-08-30 05:06:44 - pico-train - INFO - ├── Learning Rate: 3.51e-05
2025-08-30 05:06:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:06:57 - pico-train - INFO - Step 41800 -- 🔄 Training Metrics
2025-08-30 05:06:57 - pico-train - INFO - ├── Loss: 6.1383
2025-08-30 05:06:57 - pico-train - INFO - ├── Learning Rate: 3.51e-05
2025-08-30 05:06:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:07:09 - pico-train - INFO - Step 41825 -- 🔄 Training Metrics
2025-08-30 05:07:09 - pico-train - INFO - ├── Loss: 6.0973
2025-08-30 05:07:09 - pico-train - INFO - ├── Learning Rate: 3.51e-05
2025-08-30 05:07:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:07:22 - pico-train - INFO - Step 41850 -- 🔄 Training Metrics
2025-08-30 05:07:22 - pico-train - INFO - ├── Loss: 6.0864
2025-08-30 05:07:22 - pico-train - INFO - ├── Learning Rate: 3.51e-05
2025-08-30 05:07:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:07:34 - pico-train - INFO - Step 41875 -- 🔄 Training Metrics
2025-08-30 05:07:34 - pico-train - INFO - ├── Loss: 6.1542
2025-08-30 05:07:34 - pico-train - INFO - ├── Learning Rate: 3.51e-05
2025-08-30 05:07:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:07:47 - pico-train - INFO - Step 41900 -- 🔄 Training Metrics
2025-08-30 05:07:47 - pico-train - INFO - ├── Loss: 6.1191
2025-08-30 05:07:47 - pico-train - INFO - ├── Learning Rate: 3.50e-05
2025-08-30 05:07:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:07:59 - pico-train - INFO - Step 41925 -- 🔄 Training Metrics
2025-08-30 05:07:59 - pico-train - INFO - ├── Loss: 6.1827
2025-08-30 05:07:59 - pico-train - INFO - ├── Learning Rate: 3.50e-05
2025-08-30 05:07:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:08:12 - pico-train - INFO - Step 41950 -- 🔄 Training Metrics
2025-08-30 05:08:12 - pico-train - INFO - ├── Loss: 6.1001
2025-08-30 05:08:12 - pico-train - INFO - ├── Learning Rate: 3.50e-05
2025-08-30 05:08:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:08:24 - pico-train - INFO - Step 41975 -- 🔄 Training Metrics
2025-08-30 05:08:24 - pico-train - INFO - ├── Loss: 6.1700
2025-08-30 05:08:24 - pico-train - INFO - ├── Learning Rate: 3.50e-05
2025-08-30 05:08:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:08:36 - pico-train - INFO - Step 42000 -- 💾 Saving Checkpoint
2025-08-30 05:10:36 - pico-train - INFO - Step 42000 -- 📊 Evaluation Results
2025-08-30 05:10:36 - pico-train - INFO - └── paloma: 2.5987478678619155e+27
2025-08-30 05:10:39 - pico-train - INFO - Step 42000 -- 🔄 Training Metrics
2025-08-30 05:10:39 - pico-train - INFO - ├── Loss: 6.1167
2025-08-30 05:10:39 - pico-train - INFO - ├── Learning Rate: 3.50e-05
2025-08-30 05:10:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:10:39 - pico-train - INFO - Step 42000 -- 📈 Saving Learning Dynamics
2025-08-30 05:10:57 - pico-train - INFO - Step 42025 -- 🔄 Training Metrics
2025-08-30 05:10:57 - pico-train - INFO - ├── Loss: 6.1833
2025-08-30 05:10:57 - pico-train - INFO - ├── Learning Rate: 3.49e-05
2025-08-30 05:10:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:11:09 - pico-train - INFO - Step 42050 -- 🔄 Training Metrics
2025-08-30 05:11:09 - pico-train - INFO - ├── Loss: 6.0939
2025-08-30 05:11:09 - pico-train - INFO - ├── Learning Rate: 3.49e-05
2025-08-30 05:11:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:11:22 - pico-train - INFO - Step 42075 -- 🔄 Training Metrics
2025-08-30 05:11:22 - pico-train - INFO - ├── Loss: 6.0309
2025-08-30 05:11:22 - pico-train - INFO - ├── Learning Rate: 3.49e-05
2025-08-30 05:11:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:11:34 - pico-train - INFO - Step 42100 -- 🔄 Training Metrics
2025-08-30 05:11:34 - pico-train - INFO - ├── Loss: 6.0340
2025-08-30 05:11:34 - pico-train - INFO - ├── Learning Rate: 3.49e-05
2025-08-30 05:11:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:11:47 - pico-train - INFO - Step 42125 -- 🔄 Training Metrics
2025-08-30 05:11:47 - pico-train - INFO - ├── Loss: 6.0556
2025-08-30 05:11:47 - pico-train - INFO - ├── Learning Rate: 3.49e-05
2025-08-30 05:11:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:11:59 - pico-train - INFO - Step 42150 -- 🔄 Training Metrics
2025-08-30 05:11:59 - pico-train - INFO - ├── Loss: 6.1500
2025-08-30 05:11:59 - pico-train - INFO - ├── Learning Rate: 3.48e-05
2025-08-30 05:11:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:12:12 - pico-train - INFO - Step 42175 -- 🔄 Training Metrics
2025-08-30 05:12:12 - pico-train - INFO - ├── Loss: 6.1793
2025-08-30 05:12:12 - pico-train - INFO - ├── Learning Rate: 3.48e-05
2025-08-30 05:12:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:12:25 - pico-train - INFO - Step 42200 -- 🔄 Training Metrics
2025-08-30 05:12:25 - pico-train - INFO - ├── Loss: 6.0804
2025-08-30 05:12:25 - pico-train - INFO - ├── Learning Rate: 3.48e-05
2025-08-30 05:12:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:12:37 - pico-train - INFO - Step 42225 -- 🔄 Training Metrics
2025-08-30 05:12:37 - pico-train - INFO - ├── Loss: 6.1646
2025-08-30 05:12:37 - pico-train - INFO - ├── Learning Rate: 3.48e-05
2025-08-30 05:12:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:12:50 - pico-train - INFO - Step 42250 -- 🔄 Training Metrics
2025-08-30 05:12:50 - pico-train - INFO - ├── Loss: 6.1414
2025-08-30 05:12:50 - pico-train - INFO - ├── Learning Rate: 3.48e-05
2025-08-30 05:12:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:13:02 - pico-train - INFO - Step 42275 -- 🔄 Training Metrics
2025-08-30 05:13:02 - pico-train - INFO - ├── Loss: 6.0790
2025-08-30 05:13:02 - pico-train - INFO - ├── Learning Rate: 3.47e-05
2025-08-30 05:13:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:13:15 - pico-train - INFO - Step 42300 -- 🔄 Training Metrics
2025-08-30 05:13:15 - pico-train - INFO - ├── Loss: 6.0907
2025-08-30 05:13:15 - pico-train - INFO - ├── Learning Rate: 3.47e-05
2025-08-30 05:13:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:13:27 - pico-train - INFO - Step 42325 -- 🔄 Training Metrics
2025-08-30 05:13:27 - pico-train - INFO - ├── Loss: 6.1426
2025-08-30 05:13:27 - pico-train - INFO - ├── Learning Rate: 3.47e-05
2025-08-30 05:13:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:13:40 - pico-train - INFO - Step 42350 -- 🔄 Training Metrics
2025-08-30 05:13:40 - pico-train - INFO - ├── Loss: 6.1071
2025-08-30 05:13:40 - pico-train - INFO - ├── Learning Rate: 3.47e-05
2025-08-30 05:13:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:13:52 - pico-train - INFO - Step 42375 -- 🔄 Training Metrics
2025-08-30 05:13:52 - pico-train - INFO - ├── Loss: 6.0071
2025-08-30 05:13:52 - pico-train - INFO - ├── Learning Rate: 3.47e-05
2025-08-30 05:13:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:14:05 - pico-train - INFO - Step 42400 -- 🔄 Training Metrics
2025-08-30 05:14:05 - pico-train - INFO - ├── Loss: 6.1562
2025-08-30 05:14:05 - pico-train - INFO - ├── Learning Rate: 3.46e-05
2025-08-30 05:14:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:14:18 - pico-train - INFO - Step 42425 -- 🔄 Training Metrics
2025-08-30 05:14:18 - pico-train - INFO - ├── Loss: 6.1296
2025-08-30 05:14:18 - pico-train - INFO - ├── Learning Rate: 3.46e-05
2025-08-30 05:14:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:14:30 - pico-train - INFO - Step 42450 -- 🔄 Training Metrics
2025-08-30 05:14:30 - pico-train - INFO - ├── Loss: 6.1257
2025-08-30 05:14:30 - pico-train - INFO - ├── Learning Rate: 3.46e-05
2025-08-30 05:14:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:14:43 - pico-train - INFO - Step 42475 -- 🔄 Training Metrics
2025-08-30 05:14:43 - pico-train - INFO - ├── Loss: 6.1398
2025-08-30 05:14:43 - pico-train - INFO - ├── Learning Rate: 3.46e-05
2025-08-30 05:14:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:14:55 - pico-train - INFO - Step 42500 -- 💾 Saving Checkpoint
2025-08-30 05:16:58 - pico-train - INFO - Step 42500 -- 📊 Evaluation Results
2025-08-30 05:16:58 - pico-train - INFO - └── paloma: 3.0154563482458477e+27
2025-08-30 05:17:01 - pico-train - INFO - Step 42500 -- 🔄 Training Metrics
2025-08-30 05:17:01 - pico-train - INFO - ├── Loss: 6.0496
2025-08-30 05:17:01 - pico-train - INFO - ├── Learning Rate: 3.46e-05
2025-08-30 05:17:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:17:01 - pico-train - INFO - Step 42500 -- 📈 Saving Learning Dynamics
2025-08-30 05:17:16 - pico-train - INFO - Step 42525 -- 🔄 Training Metrics
2025-08-30 05:17:16 - pico-train - INFO - ├── Loss: 6.0819
2025-08-30 05:17:16 - pico-train - INFO - ├── Learning Rate: 3.45e-05
2025-08-30 05:17:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:17:29 - pico-train - INFO - Step 42550 -- 🔄 Training Metrics
2025-08-30 05:17:29 - pico-train - INFO - ├── Loss: 6.0871
2025-08-30 05:17:29 - pico-train - INFO - ├── Learning Rate: 3.45e-05
2025-08-30 05:17:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:17:42 - pico-train - INFO - Step 42575 -- 🔄 Training Metrics
2025-08-30 05:17:42 - pico-train - INFO - ├── Loss: 6.0924
2025-08-30 05:17:42 - pico-train - INFO - ├── Learning Rate: 3.45e-05
2025-08-30 05:17:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:17:54 - pico-train - INFO - Step 42600 -- 🔄 Training Metrics
2025-08-30 05:17:54 - pico-train - INFO - ├── Loss: 6.0553
2025-08-30 05:17:54 - pico-train - INFO - ├── Learning Rate: 3.45e-05
2025-08-30 05:17:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:18:07 - pico-train - INFO - Step 42625 -- 🔄 Training Metrics
2025-08-30 05:18:07 - pico-train - INFO - ├── Loss: 6.1371
2025-08-30 05:18:07 - pico-train - INFO - ├── Learning Rate: 3.45e-05
2025-08-30 05:18:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:18:19 - pico-train - INFO - Step 42650 -- 🔄 Training Metrics
2025-08-30 05:18:19 - pico-train - INFO - ├── Loss: 6.0776
2025-08-30 05:18:19 - pico-train - INFO - ├── Learning Rate: 3.44e-05
2025-08-30 05:18:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:18:32 - pico-train - INFO - Step 42675 -- 🔄 Training Metrics
2025-08-30 05:18:32 - pico-train - INFO - ├── Loss: 6.1134
2025-08-30 05:18:32 - pico-train - INFO - ├── Learning Rate: 3.44e-05
2025-08-30 05:18:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:18:45 - pico-train - INFO - Step 42700 -- 🔄 Training Metrics
2025-08-30 05:18:45 - pico-train - INFO - ├── Loss: 5.9718
2025-08-30 05:18:45 - pico-train - INFO - ├── Learning Rate: 3.44e-05
2025-08-30 05:18:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:18:57 - pico-train - INFO - Step 42725 -- 🔄 Training Metrics
2025-08-30 05:18:57 - pico-train - INFO - ├── Loss: 6.0381
2025-08-30 05:18:57 - pico-train - INFO - ├── Learning Rate: 3.44e-05
2025-08-30 05:18:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:19:10 - pico-train - INFO - Step 42750 -- 🔄 Training Metrics
2025-08-30 05:19:10 - pico-train - INFO - ├── Loss: 6.1626
2025-08-30 05:19:10 - pico-train - INFO - ├── Learning Rate: 3.44e-05
2025-08-30 05:19:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:19:22 - pico-train - INFO - Step 42775 -- 🔄 Training Metrics
2025-08-30 05:19:22 - pico-train - INFO - ├── Loss: 6.0909
2025-08-30 05:19:22 - pico-train - INFO - ├── Learning Rate: 3.43e-05
2025-08-30 05:19:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:19:35 - pico-train - INFO - Step 42800 -- 🔄 Training Metrics
2025-08-30 05:19:35 - pico-train - INFO - ├── Loss: 6.1275
2025-08-30 05:19:35 - pico-train - INFO - ├── Learning Rate: 3.43e-05
2025-08-30 05:19:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:19:47 - pico-train - INFO - Step 42825 -- 🔄 Training Metrics
2025-08-30 05:19:47 - pico-train - INFO - ├── Loss: 6.0942
2025-08-30 05:19:47 - pico-train - INFO - ├── Learning Rate: 3.43e-05
2025-08-30 05:19:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:20:00 - pico-train - INFO - Step 42850 -- 🔄 Training Metrics
2025-08-30 05:20:00 - pico-train - INFO - ├── Loss: 6.0309
2025-08-30 05:20:00 - pico-train - INFO - ├── Learning Rate: 3.43e-05
2025-08-30 05:20:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:20:12 - pico-train - INFO - Step 42875 -- 🔄 Training Metrics
2025-08-30 05:20:12 - pico-train - INFO - ├── Loss: 6.1312
2025-08-30 05:20:12 - pico-train - INFO - ├── Learning Rate: 3.43e-05
2025-08-30 05:20:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:20:25 - pico-train - INFO - Step 42900 -- 🔄 Training Metrics
2025-08-30 05:20:25 - pico-train - INFO - ├── Loss: 6.1728
2025-08-30 05:20:25 - pico-train - INFO - ├── Learning Rate: 3.43e-05
2025-08-30 05:20:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:20:38 - pico-train - INFO - Step 42925 -- 🔄 Training Metrics
2025-08-30 05:20:38 - pico-train - INFO - ├── Loss: 5.9740
2025-08-30 05:20:38 - pico-train - INFO - ├── Learning Rate: 3.42e-05
2025-08-30 05:20:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:20:50 - pico-train - INFO - Step 42950 -- 🔄 Training Metrics
2025-08-30 05:20:50 - pico-train - INFO - ├── Loss: 6.0812
2025-08-30 05:20:50 - pico-train - INFO - ├── Learning Rate: 3.42e-05
2025-08-30 05:20:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:21:03 - pico-train - INFO - Step 42975 -- 🔄 Training Metrics
2025-08-30 05:21:03 - pico-train - INFO - ├── Loss: 6.0484
2025-08-30 05:21:03 - pico-train - INFO - ├── Learning Rate: 3.42e-05
2025-08-30 05:21:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:21:15 - pico-train - INFO - Step 43000 -- 💾 Saving Checkpoint
2025-08-30 05:23:15 - pico-train - INFO - Step 43000 -- 📊 Evaluation Results
2025-08-30 05:23:15 - pico-train - INFO - └── paloma: 4.4972099298583296e+27
2025-08-30 05:23:19 - pico-train - INFO - Step 43000 -- 🔄 Training Metrics
2025-08-30 05:23:19 - pico-train - INFO - ├── Loss: 6.2475
2025-08-30 05:23:19 - pico-train - INFO - ├── Learning Rate: 3.42e-05
2025-08-30 05:23:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:23:19 - pico-train - INFO - Step 43000 -- 📈 Saving Learning Dynamics
2025-08-30 05:23:36 - pico-train - INFO - Step 43025 -- 🔄 Training Metrics
2025-08-30 05:23:36 - pico-train - INFO - ├── Loss: 6.0959
2025-08-30 05:23:36 - pico-train - INFO - ├── Learning Rate: 3.42e-05
2025-08-30 05:23:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:23:48 - pico-train - INFO - Step 43050 -- 🔄 Training Metrics
2025-08-30 05:23:48 - pico-train - INFO - ├── Loss: 6.0753
2025-08-30 05:23:48 - pico-train - INFO - ├── Learning Rate: 3.41e-05
2025-08-30 05:23:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:24:01 - pico-train - INFO - Step 43075 -- 🔄 Training Metrics
2025-08-30 05:24:01 - pico-train - INFO - ├── Loss: 6.1130
2025-08-30 05:24:01 - pico-train - INFO - ├── Learning Rate: 3.41e-05
2025-08-30 05:24:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:24:13 - pico-train - INFO - Step 43100 -- 🔄 Training Metrics
2025-08-30 05:24:13 - pico-train - INFO - ├── Loss: 6.0777
2025-08-30 05:24:13 - pico-train - INFO - ├── Learning Rate: 3.41e-05
2025-08-30 05:24:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:24:26 - pico-train - INFO - Step 43125 -- 🔄 Training Metrics
2025-08-30 05:24:26 - pico-train - INFO - ├── Loss: 6.1311
2025-08-30 05:24:26 - pico-train - INFO - ├── Learning Rate: 3.41e-05
2025-08-30 05:24:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:24:39 - pico-train - INFO - Step 43150 -- 🔄 Training Metrics
2025-08-30 05:24:39 - pico-train - INFO - ├── Loss: 6.0421
2025-08-30 05:24:39 - pico-train - INFO - ├── Learning Rate: 3.41e-05
2025-08-30 05:24:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:24:52 - pico-train - INFO - Step 43175 -- 🔄 Training Metrics
2025-08-30 05:24:52 - pico-train - INFO - ├── Loss: 6.0355
2025-08-30 05:24:52 - pico-train - INFO - ├── Learning Rate: 3.40e-05
2025-08-30 05:24:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:25:04 - pico-train - INFO - Step 43200 -- 🔄 Training Metrics
2025-08-30 05:25:04 - pico-train - INFO - ├── Loss: 6.0889
2025-08-30 05:25:04 - pico-train - INFO - ├── Learning Rate: 3.40e-05
2025-08-30 05:25:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:25:17 - pico-train - INFO - Step 43225 -- 🔄 Training Metrics
2025-08-30 05:25:17 - pico-train - INFO - ├── Loss: 6.0605
2025-08-30 05:25:17 - pico-train - INFO - ├── Learning Rate: 3.40e-05
2025-08-30 05:25:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:25:30 - pico-train - INFO - Step 43250 -- 🔄 Training Metrics
2025-08-30 05:25:30 - pico-train - INFO - ├── Loss: 6.1064
2025-08-30 05:25:30 - pico-train - INFO - ├── Learning Rate: 3.40e-05
2025-08-30 05:25:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:25:42 - pico-train - INFO - Step 43275 -- 🔄 Training Metrics
2025-08-30 05:25:42 - pico-train - INFO - ├── Loss: 6.1053
2025-08-30 05:25:42 - pico-train - INFO - ├── Learning Rate: 3.40e-05
2025-08-30 05:25:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:25:55 - pico-train - INFO - Step 43300 -- 🔄 Training Metrics
2025-08-30 05:25:55 - pico-train - INFO - ├── Loss: 6.1399
2025-08-30 05:25:55 - pico-train - INFO - ├── Learning Rate: 3.39e-05
2025-08-30 05:25:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:26:07 - pico-train - INFO - Step 43325 -- 🔄 Training Metrics
2025-08-30 05:26:07 - pico-train - INFO - ├── Loss: 6.1271
2025-08-30 05:26:07 - pico-train - INFO - ├── Learning Rate: 3.39e-05
2025-08-30 05:26:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:26:20 - pico-train - INFO - Step 43350 -- 🔄 Training Metrics
2025-08-30 05:26:20 - pico-train - INFO - ├── Loss: 6.0790
2025-08-30 05:26:20 - pico-train - INFO - ├── Learning Rate: 3.39e-05
2025-08-30 05:26:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:26:33 - pico-train - INFO - Step 43375 -- 🔄 Training Metrics
2025-08-30 05:26:33 - pico-train - INFO - ├── Loss: 6.0567
2025-08-30 05:26:33 - pico-train - INFO - ├── Learning Rate: 3.39e-05
2025-08-30 05:26:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:26:45 - pico-train - INFO - Step 43400 -- 🔄 Training Metrics
2025-08-30 05:26:45 - pico-train - INFO - ├── Loss: 6.0771
2025-08-30 05:26:45 - pico-train - INFO - ├── Learning Rate: 3.39e-05
2025-08-30 05:26:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:26:58 - pico-train - INFO - Step 43425 -- 🔄 Training Metrics
2025-08-30 05:26:58 - pico-train - INFO - ├── Loss: 6.1399
2025-08-30 05:26:58 - pico-train - INFO - ├── Learning Rate: 3.38e-05
2025-08-30 05:26:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:27:10 - pico-train - INFO - Step 43450 -- 🔄 Training Metrics
2025-08-30 05:27:10 - pico-train - INFO - ├── Loss: 6.1330
2025-08-30 05:27:10 - pico-train - INFO - ├── Learning Rate: 3.38e-05
2025-08-30 05:27:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:27:23 - pico-train - INFO - Step 43475 -- 🔄 Training Metrics
2025-08-30 05:27:23 - pico-train - INFO - ├── Loss: 6.0139
2025-08-30 05:27:23 - pico-train - INFO - ├── Learning Rate: 3.38e-05
2025-08-30 05:27:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:27:35 - pico-train - INFO - Step 43500 -- 💾 Saving Checkpoint
2025-08-30 05:29:43 - pico-train - INFO - Step 43500 -- 📊 Evaluation Results
2025-08-30 05:29:43 - pico-train - INFO - └── paloma: 5.326210528222522e+27
2025-08-30 05:29:45 - pico-train - INFO - Step 43500 -- 🔄 Training Metrics
2025-08-30 05:29:45 - pico-train - INFO - ├── Loss: 6.1439
2025-08-30 05:29:45 - pico-train - INFO - ├── Learning Rate: 3.38e-05
2025-08-30 05:29:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:29:45 - pico-train - INFO - Step 43500 -- 📈 Saving Learning Dynamics
2025-08-30 05:30:00 - pico-train - INFO - Step 43525 -- 🔄 Training Metrics
2025-08-30 05:30:00 - pico-train - INFO - ├── Loss: 6.0445
2025-08-30 05:30:00 - pico-train - INFO - ├── Learning Rate: 3.38e-05
2025-08-30 05:30:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:30:12 - pico-train - INFO - Step 43550 -- 🔄 Training Metrics
2025-08-30 05:30:12 - pico-train - INFO - ├── Loss: 6.0780
2025-08-30 05:30:12 - pico-train - INFO - ├── Learning Rate: 3.37e-05
2025-08-30 05:30:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:30:25 - pico-train - INFO - Step 43575 -- 🔄 Training Metrics
2025-08-30 05:30:25 - pico-train - INFO - ├── Loss: 6.0044
2025-08-30 05:30:25 - pico-train - INFO - ├── Learning Rate: 3.37e-05
2025-08-30 05:30:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:30:38 - pico-train - INFO - Step 43600 -- 🔄 Training Metrics
2025-08-30 05:30:38 - pico-train - INFO - ├── Loss: 6.0087
2025-08-30 05:30:38 - pico-train - INFO - ├── Learning Rate: 3.37e-05
2025-08-30 05:30:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:30:50 - pico-train - INFO - Step 43625 -- 🔄 Training Metrics
2025-08-30 05:30:50 - pico-train - INFO - ├── Loss: 6.1263
2025-08-30 05:30:50 - pico-train - INFO - ├── Learning Rate: 3.37e-05
2025-08-30 05:30:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:31:03 - pico-train - INFO - Step 43650 -- 🔄 Training Metrics
2025-08-30 05:31:03 - pico-train - INFO - ├── Loss: 6.0459
2025-08-30 05:31:03 - pico-train - INFO - ├── Learning Rate: 3.37e-05
2025-08-30 05:31:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:31:15 - pico-train - INFO - Step 43675 -- 🔄 Training Metrics
2025-08-30 05:31:15 - pico-train - INFO - ├── Loss: 6.0390
2025-08-30 05:31:15 - pico-train - INFO - ├── Learning Rate: 3.36e-05
2025-08-30 05:31:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:31:28 - pico-train - INFO - Step 43700 -- 🔄 Training Metrics
2025-08-30 05:31:28 - pico-train - INFO - ├── Loss: 6.0918
2025-08-30 05:31:28 - pico-train - INFO - ├── Learning Rate: 3.36e-05
2025-08-30 05:31:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:31:41 - pico-train - INFO - Step 43725 -- 🔄 Training Metrics
2025-08-30 05:31:41 - pico-train - INFO - ├── Loss: 6.0426
2025-08-30 05:31:41 - pico-train - INFO - ├── Learning Rate: 3.36e-05
2025-08-30 05:31:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:31:53 - pico-train - INFO - Step 43750 -- 🔄 Training Metrics
2025-08-30 05:31:53 - pico-train - INFO - ├── Loss: 6.0634
2025-08-30 05:31:53 - pico-train - INFO - ├── Learning Rate: 3.36e-05
2025-08-30 05:31:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:32:06 - pico-train - INFO - Step 43775 -- 🔄 Training Metrics
2025-08-30 05:32:06 - pico-train - INFO - ├── Loss: 6.1042
2025-08-30 05:32:06 - pico-train - INFO - ├── Learning Rate: 3.36e-05
2025-08-30 05:32:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:32:18 - pico-train - INFO - Step 43800 -- 🔄 Training Metrics
2025-08-30 05:32:18 - pico-train - INFO - ├── Loss: 6.0510
2025-08-30 05:32:18 - pico-train - INFO - ├── Learning Rate: 3.35e-05
2025-08-30 05:32:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:32:31 - pico-train - INFO - Step 43825 -- 🔄 Training Metrics
2025-08-30 05:32:31 - pico-train - INFO - ├── Loss: 6.0403
2025-08-30 05:32:31 - pico-train - INFO - ├── Learning Rate: 3.35e-05
2025-08-30 05:32:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:32:43 - pico-train - INFO - Step 43850 -- 🔄 Training Metrics
2025-08-30 05:32:43 - pico-train - INFO - ├── Loss: 6.0537
2025-08-30 05:32:43 - pico-train - INFO - ├── Learning Rate: 3.35e-05
2025-08-30 05:32:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:32:56 - pico-train - INFO - Step 43875 -- 🔄 Training Metrics
2025-08-30 05:32:56 - pico-train - INFO - ├── Loss: 6.1244
2025-08-30 05:32:56 - pico-train - INFO - ├── Learning Rate: 3.35e-05
2025-08-30 05:32:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:33:09 - pico-train - INFO - Step 43900 -- 🔄 Training Metrics
2025-08-30 05:33:09 - pico-train - INFO - ├── Loss: 6.1294
2025-08-30 05:33:09 - pico-train - INFO - ├── Learning Rate: 3.35e-05
2025-08-30 05:33:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:33:21 - pico-train - INFO - Step 43925 -- 🔄 Training Metrics
2025-08-30 05:33:21 - pico-train - INFO - ├── Loss: 6.0845
2025-08-30 05:33:21 - pico-train - INFO - ├── Learning Rate: 3.34e-05
2025-08-30 05:33:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:33:34 - pico-train - INFO - Step 43950 -- 🔄 Training Metrics
2025-08-30 05:33:34 - pico-train - INFO - ├── Loss: 6.0365
2025-08-30 05:33:34 - pico-train - INFO - ├── Learning Rate: 3.34e-05
2025-08-30 05:33:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:33:46 - pico-train - INFO - Step 43975 -- 🔄 Training Metrics
2025-08-30 05:33:46 - pico-train - INFO - ├── Loss: 6.0507
2025-08-30 05:33:46 - pico-train - INFO - ├── Learning Rate: 3.34e-05
2025-08-30 05:33:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:33:58 - pico-train - INFO - Step 44000 -- 💾 Saving Checkpoint
2025-08-30 05:35:55 - pico-train - INFO - Step 44000 -- 📊 Evaluation Results
2025-08-30 05:35:55 - pico-train - INFO - └── paloma: 1.0515089806395209e+28
2025-08-30 05:35:57 - pico-train - INFO - Step 44000 -- 🔄 Training Metrics
2025-08-30 05:35:57 - pico-train - INFO - ├── Loss: 5.9669
2025-08-30 05:35:57 - pico-train - INFO - ├── Learning Rate: 3.34e-05
2025-08-30 05:35:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:35:57 - pico-train - INFO - Step 44000 -- 📈 Saving Learning Dynamics
2025-08-30 05:36:12 - pico-train - INFO - Step 44025 -- 🔄 Training Metrics
2025-08-30 05:36:12 - pico-train - INFO - ├── Loss: 6.0454
2025-08-30 05:36:12 - pico-train - INFO - ├── Learning Rate: 3.34e-05
2025-08-30 05:36:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:36:25 - pico-train - INFO - Step 44050 -- 🔄 Training Metrics
2025-08-30 05:36:25 - pico-train - INFO - ├── Loss: 6.0395
2025-08-30 05:36:25 - pico-train - INFO - ├── Learning Rate: 3.33e-05
2025-08-30 05:36:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:36:38 - pico-train - INFO - Step 44075 -- 🔄 Training Metrics
2025-08-30 05:36:38 - pico-train - INFO - ├── Loss: 5.9733
2025-08-30 05:36:38 - pico-train - INFO - ├── Learning Rate: 3.33e-05
2025-08-30 05:36:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:36:50 - pico-train - INFO - Step 44100 -- 🔄 Training Metrics
2025-08-30 05:36:50 - pico-train - INFO - ├── Loss: 6.1172
2025-08-30 05:36:50 - pico-train - INFO - ├── Learning Rate: 3.33e-05
2025-08-30 05:36:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:37:03 - pico-train - INFO - Step 44125 -- 🔄 Training Metrics
2025-08-30 05:37:03 - pico-train - INFO - ├── Loss: 6.0527
2025-08-30 05:37:03 - pico-train - INFO - ├── Learning Rate: 3.33e-05
2025-08-30 05:37:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:37:15 - pico-train - INFO - Step 44150 -- 🔄 Training Metrics
2025-08-30 05:37:15 - pico-train - INFO - ├── Loss: 6.0853
2025-08-30 05:37:15 - pico-train - INFO - ├── Learning Rate: 3.33e-05
2025-08-30 05:37:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:37:28 - pico-train - INFO - Step 44175 -- 🔄 Training Metrics
2025-08-30 05:37:28 - pico-train - INFO - ├── Loss: 6.0303
2025-08-30 05:37:28 - pico-train - INFO - ├── Learning Rate: 3.32e-05
2025-08-30 05:37:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:37:40 - pico-train - INFO - Step 44200 -- 🔄 Training Metrics
2025-08-30 05:37:40 - pico-train - INFO - ├── Loss: 5.9986
2025-08-30 05:37:40 - pico-train - INFO - ├── Learning Rate: 3.32e-05
2025-08-30 05:37:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:37:53 - pico-train - INFO - Step 44225 -- 🔄 Training Metrics
2025-08-30 05:37:53 - pico-train - INFO - ├── Loss: 6.0450
2025-08-30 05:37:53 - pico-train - INFO - ├── Learning Rate: 3.32e-05
2025-08-30 05:37:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:38:06 - pico-train - INFO - Step 44250 -- 🔄 Training Metrics
2025-08-30 05:38:06 - pico-train - INFO - ├── Loss: 6.0449
2025-08-30 05:38:06 - pico-train - INFO - ├── Learning Rate: 3.32e-05
2025-08-30 05:38:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:38:18 - pico-train - INFO - Step 44275 -- 🔄 Training Metrics
2025-08-30 05:38:18 - pico-train - INFO - ├── Loss: 6.0811
2025-08-30 05:38:18 - pico-train - INFO - ├── Learning Rate: 3.32e-05
2025-08-30 05:38:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:38:31 - pico-train - INFO - Step 44300 -- 🔄 Training Metrics
2025-08-30 05:38:31 - pico-train - INFO - ├── Loss: 6.0524
2025-08-30 05:38:31 - pico-train - INFO - ├── Learning Rate: 3.31e-05
2025-08-30 05:38:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:38:43 - pico-train - INFO - Step 44325 -- 🔄 Training Metrics
2025-08-30 05:38:43 - pico-train - INFO - ├── Loss: 6.0148
2025-08-30 05:38:43 - pico-train - INFO - ├── Learning Rate: 3.31e-05
2025-08-30 05:38:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:38:56 - pico-train - INFO - Step 44350 -- 🔄 Training Metrics
2025-08-30 05:38:56 - pico-train - INFO - ├── Loss: 6.0216
2025-08-30 05:38:56 - pico-train - INFO - ├── Learning Rate: 3.31e-05
2025-08-30 05:38:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:39:08 - pico-train - INFO - Step 44375 -- 🔄 Training Metrics
2025-08-30 05:39:08 - pico-train - INFO - ├── Loss: 5.9966
2025-08-30 05:39:08 - pico-train - INFO - ├── Learning Rate: 3.31e-05
2025-08-30 05:39:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:39:21 - pico-train - INFO - Step 44400 -- 🔄 Training Metrics
2025-08-30 05:39:21 - pico-train - INFO - ├── Loss: 6.0301
2025-08-30 05:39:21 - pico-train - INFO - ├── Learning Rate: 3.30e-05
2025-08-30 05:39:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:39:34 - pico-train - INFO - Step 44425 -- 🔄 Training Metrics
2025-08-30 05:39:34 - pico-train - INFO - ├── Loss: 6.1473
2025-08-30 05:39:34 - pico-train - INFO - ├── Learning Rate: 3.30e-05
2025-08-30 05:39:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:39:46 - pico-train - INFO - Step 44450 -- 🔄 Training Metrics
2025-08-30 05:39:46 - pico-train - INFO - ├── Loss: 6.0092
2025-08-30 05:39:46 - pico-train - INFO - ├── Learning Rate: 3.30e-05
2025-08-30 05:39:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:39:59 - pico-train - INFO - Step 44475 -- 🔄 Training Metrics
2025-08-30 05:39:59 - pico-train - INFO - ├── Loss: 6.0807
2025-08-30 05:39:59 - pico-train - INFO - ├── Learning Rate: 3.30e-05
2025-08-30 05:39:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:40:11 - pico-train - INFO - Step 44500 -- 💾 Saving Checkpoint
2025-08-30 05:42:28 - pico-train - INFO - Step 44500 -- 📊 Evaluation Results
2025-08-30 05:42:28 - pico-train - INFO - └── paloma: 9.953158679071717e+27
2025-08-30 05:42:31 - pico-train - INFO - Step 44500 -- 🔄 Training Metrics
2025-08-30 05:42:31 - pico-train - INFO - ├── Loss: 6.0974
2025-08-30 05:42:31 - pico-train - INFO - ├── Learning Rate: 3.30e-05
2025-08-30 05:42:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:42:31 - pico-train - INFO - Step 44500 -- 📈 Saving Learning Dynamics
2025-08-30 05:42:46 - pico-train - INFO - Step 44525 -- 🔄 Training Metrics
2025-08-30 05:42:46 - pico-train - INFO - ├── Loss: 6.0606
2025-08-30 05:42:46 - pico-train - INFO - ├── Learning Rate: 3.29e-05
2025-08-30 05:42:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:42:59 - pico-train - INFO - Step 44550 -- 🔄 Training Metrics
2025-08-30 05:42:59 - pico-train - INFO - ├── Loss: 6.0374
2025-08-30 05:42:59 - pico-train - INFO - ├── Learning Rate: 3.29e-05
2025-08-30 05:42:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:43:11 - pico-train - INFO - Step 44575 -- 🔄 Training Metrics
2025-08-30 05:43:11 - pico-train - INFO - ├── Loss: 5.9995
2025-08-30 05:43:11 - pico-train - INFO - ├── Learning Rate: 3.29e-05
2025-08-30 05:43:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:43:24 - pico-train - INFO - Step 44600 -- 🔄 Training Metrics
2025-08-30 05:43:24 - pico-train - INFO - ├── Loss: 6.0354
2025-08-30 05:43:24 - pico-train - INFO - ├── Learning Rate: 3.29e-05
2025-08-30 05:43:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:43:36 - pico-train - INFO - Step 44625 -- 🔄 Training Metrics
2025-08-30 05:43:36 - pico-train - INFO - ├── Loss: 6.0512
2025-08-30 05:43:36 - pico-train - INFO - ├── Learning Rate: 3.29e-05
2025-08-30 05:43:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:43:49 - pico-train - INFO - Step 44650 -- 🔄 Training Metrics
2025-08-30 05:43:49 - pico-train - INFO - ├── Loss: 5.9998
2025-08-30 05:43:49 - pico-train - INFO - ├── Learning Rate: 3.28e-05
2025-08-30 05:43:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:44:01 - pico-train - INFO - Step 44675 -- 🔄 Training Metrics
2025-08-30 05:44:01 - pico-train - INFO - ├── Loss: 6.0010
2025-08-30 05:44:01 - pico-train - INFO - ├── Learning Rate: 3.28e-05
2025-08-30 05:44:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:44:14 - pico-train - INFO - Step 44700 -- 🔄 Training Metrics
2025-08-30 05:44:14 - pico-train - INFO - ├── Loss: 6.0795
2025-08-30 05:44:14 - pico-train - INFO - ├── Learning Rate: 3.28e-05
2025-08-30 05:44:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:44:26 - pico-train - INFO - Step 44725 -- 🔄 Training Metrics
2025-08-30 05:44:26 - pico-train - INFO - ├── Loss: 6.0255
2025-08-30 05:44:26 - pico-train - INFO - ├── Learning Rate: 3.28e-05
2025-08-30 05:44:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:44:39 - pico-train - INFO - Step 44750 -- 🔄 Training Metrics
2025-08-30 05:44:39 - pico-train - INFO - ├── Loss: 6.0648
2025-08-30 05:44:39 - pico-train - INFO - ├── Learning Rate: 3.28e-05
2025-08-30 05:44:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:44:52 - pico-train - INFO - Step 44775 -- 🔄 Training Metrics
2025-08-30 05:44:52 - pico-train - INFO - ├── Loss: 6.0873
2025-08-30 05:44:52 - pico-train - INFO - ├── Learning Rate: 3.27e-05
2025-08-30 05:44:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:45:04 - pico-train - INFO - Step 44800 -- 🔄 Training Metrics
2025-08-30 05:45:04 - pico-train - INFO - ├── Loss: 6.0366
2025-08-30 05:45:04 - pico-train - INFO - ├── Learning Rate: 3.27e-05
2025-08-30 05:45:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:45:17 - pico-train - INFO - Step 44825 -- 🔄 Training Metrics
2025-08-30 05:45:17 - pico-train - INFO - ├── Loss: 6.0182
2025-08-30 05:45:17 - pico-train - INFO - ├── Learning Rate: 3.27e-05
2025-08-30 05:45:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:45:29 - pico-train - INFO - Step 44850 -- 🔄 Training Metrics
2025-08-30 05:45:29 - pico-train - INFO - ├── Loss: 6.0006
2025-08-30 05:45:29 - pico-train - INFO - ├── Learning Rate: 3.27e-05
2025-08-30 05:45:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:45:42 - pico-train - INFO - Step 44875 -- 🔄 Training Metrics
2025-08-30 05:45:42 - pico-train - INFO - ├── Loss: 6.0773
2025-08-30 05:45:42 - pico-train - INFO - ├── Learning Rate: 3.27e-05
2025-08-30 05:45:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:45:54 - pico-train - INFO - Step 44900 -- 🔄 Training Metrics
2025-08-30 05:45:54 - pico-train - INFO - ├── Loss: 6.0644
2025-08-30 05:45:54 - pico-train - INFO - ├── Learning Rate: 3.26e-05
2025-08-30 05:45:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:46:07 - pico-train - INFO - Step 44925 -- 🔄 Training Metrics
2025-08-30 05:46:07 - pico-train - INFO - ├── Loss: 6.0927
2025-08-30 05:46:07 - pico-train - INFO - ├── Learning Rate: 3.26e-05
2025-08-30 05:46:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:46:20 - pico-train - INFO - Step 44950 -- 🔄 Training Metrics
2025-08-30 05:46:20 - pico-train - INFO - ├── Loss: 6.0458
2025-08-30 05:46:20 - pico-train - INFO - ├── Learning Rate: 3.26e-05
2025-08-30 05:46:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:46:32 - pico-train - INFO - Step 44975 -- 🔄 Training Metrics
2025-08-30 05:46:32 - pico-train - INFO - ├── Loss: 6.0466
2025-08-30 05:46:32 - pico-train - INFO - ├── Learning Rate: 3.26e-05
2025-08-30 05:46:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:46:44 - pico-train - INFO - Step 45000 -- 💾 Saving Checkpoint
2025-08-30 05:48:48 - pico-train - INFO - Step 45000 -- 📊 Evaluation Results
2025-08-30 05:48:48 - pico-train - INFO - └── paloma: 1.3981708485109732e+28
2025-08-30 05:48:50 - pico-train - INFO - Step 45000 -- 🔄 Training Metrics
2025-08-30 05:48:50 - pico-train - INFO - ├── Loss: 6.0790
2025-08-30 05:48:50 - pico-train - INFO - ├── Learning Rate: 3.26e-05
2025-08-30 05:48:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:48:50 - pico-train - INFO - Step 45000 -- 📈 Saving Learning Dynamics
2025-08-30 05:49:05 - pico-train - INFO - Step 45025 -- 🔄 Training Metrics
2025-08-30 05:49:05 - pico-train - INFO - ├── Loss: 6.0231
2025-08-30 05:49:05 - pico-train - INFO - ├── Learning Rate: 3.25e-05
2025-08-30 05:49:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:49:17 - pico-train - INFO - Step 45050 -- 🔄 Training Metrics
2025-08-30 05:49:17 - pico-train - INFO - ├── Loss: 6.0257
2025-08-30 05:49:17 - pico-train - INFO - ├── Learning Rate: 3.25e-05
2025-08-30 05:49:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:49:30 - pico-train - INFO - Step 45075 -- 🔄 Training Metrics
2025-08-30 05:49:30 - pico-train - INFO - ├── Loss: 6.0401
2025-08-30 05:49:30 - pico-train - INFO - ├── Learning Rate: 3.25e-05
2025-08-30 05:49:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:49:43 - pico-train - INFO - Step 45100 -- 🔄 Training Metrics
2025-08-30 05:49:43 - pico-train - INFO - ├── Loss: 6.0050
2025-08-30 05:49:43 - pico-train - INFO - ├── Learning Rate: 3.25e-05
2025-08-30 05:49:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:49:56 - pico-train - INFO - Step 45125 -- 🔄 Training Metrics
2025-08-30 05:49:56 - pico-train - INFO - ├── Loss: 6.0666
2025-08-30 05:49:56 - pico-train - INFO - ├── Learning Rate: 3.25e-05
2025-08-30 05:49:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:50:08 - pico-train - INFO - Step 45150 -- 🔄 Training Metrics
2025-08-30 05:50:08 - pico-train - INFO - ├── Loss: 6.0214
2025-08-30 05:50:08 - pico-train - INFO - ├── Learning Rate: 3.24e-05
2025-08-30 05:50:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:50:21 - pico-train - INFO - Step 45175 -- 🔄 Training Metrics
2025-08-30 05:50:21 - pico-train - INFO - ├── Loss: 6.1788
2025-08-30 05:50:21 - pico-train - INFO - ├── Learning Rate: 3.24e-05
2025-08-30 05:50:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:50:33 - pico-train - INFO - Step 45200 -- 🔄 Training Metrics
2025-08-30 05:50:33 - pico-train - INFO - ├── Loss: 6.0156
2025-08-30 05:50:33 - pico-train - INFO - ├── Learning Rate: 3.24e-05
2025-08-30 05:50:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:50:46 - pico-train - INFO - Step 45225 -- 🔄 Training Metrics
2025-08-30 05:50:46 - pico-train - INFO - ├── Loss: 6.0201
2025-08-30 05:50:46 - pico-train - INFO - ├── Learning Rate: 3.24e-05
2025-08-30 05:50:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:50:59 - pico-train - INFO - Step 45250 -- 🔄 Training Metrics
2025-08-30 05:50:59 - pico-train - INFO - ├── Loss: 6.0011
2025-08-30 05:50:59 - pico-train - INFO - ├── Learning Rate: 3.24e-05
2025-08-30 05:50:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:51:11 - pico-train - INFO - Step 45275 -- 🔄 Training Metrics
2025-08-30 05:51:11 - pico-train - INFO - ├── Loss: 6.1612
2025-08-30 05:51:11 - pico-train - INFO - ├── Learning Rate: 3.23e-05
2025-08-30 05:51:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:51:24 - pico-train - INFO - Step 45300 -- 🔄 Training Metrics
2025-08-30 05:51:24 - pico-train - INFO - ├── Loss: 6.0480
2025-08-30 05:51:24 - pico-train - INFO - ├── Learning Rate: 3.23e-05
2025-08-30 05:51:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:51:36 - pico-train - INFO - Step 45325 -- 🔄 Training Metrics
2025-08-30 05:51:36 - pico-train - INFO - ├── Loss: 5.9685
2025-08-30 05:51:36 - pico-train - INFO - ├── Learning Rate: 3.23e-05
2025-08-30 05:51:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:51:49 - pico-train - INFO - Step 45350 -- 🔄 Training Metrics
2025-08-30 05:51:49 - pico-train - INFO - ├── Loss: 6.0803
2025-08-30 05:51:49 - pico-train - INFO - ├── Learning Rate: 3.23e-05
2025-08-30 05:51:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:52:02 - pico-train - INFO - Step 45375 -- 🔄 Training Metrics
2025-08-30 05:52:02 - pico-train - INFO - ├── Loss: 6.0258
2025-08-30 05:52:02 - pico-train - INFO - ├── Learning Rate: 3.23e-05
2025-08-30 05:52:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:52:14 - pico-train - INFO - Step 45400 -- 🔄 Training Metrics
2025-08-30 05:52:14 - pico-train - INFO - ├── Loss: 6.0367
2025-08-30 05:52:14 - pico-train - INFO - ├── Learning Rate: 3.22e-05
2025-08-30 05:52:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:52:27 - pico-train - INFO - Step 45425 -- 🔄 Training Metrics
2025-08-30 05:52:27 - pico-train - INFO - ├── Loss: 5.9915
2025-08-30 05:52:27 - pico-train - INFO - ├── Learning Rate: 3.22e-05
2025-08-30 05:52:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:52:39 - pico-train - INFO - Step 45450 -- 🔄 Training Metrics
2025-08-30 05:52:39 - pico-train - INFO - ├── Loss: 5.9926
2025-08-30 05:52:39 - pico-train - INFO - ├── Learning Rate: 3.22e-05
2025-08-30 05:52:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:52:52 - pico-train - INFO - Step 45475 -- 🔄 Training Metrics
2025-08-30 05:52:52 - pico-train - INFO - ├── Loss: 5.9767
2025-08-30 05:52:52 - pico-train - INFO - ├── Learning Rate: 3.22e-05
2025-08-30 05:52:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:53:04 - pico-train - INFO - Step 45500 -- 💾 Saving Checkpoint
2025-08-30 05:55:00 - pico-train - INFO - Step 45500 -- 📊 Evaluation Results
2025-08-30 05:55:00 - pico-train - INFO - └── paloma: 2.1286507820171466e+28
2025-08-30 05:55:02 - pico-train - INFO - Step 45500 -- 🔄 Training Metrics
2025-08-30 05:55:02 - pico-train - INFO - ├── Loss: 6.0752
2025-08-30 05:55:02 - pico-train - INFO - ├── Learning Rate: 3.22e-05
2025-08-30 05:55:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:55:02 - pico-train - INFO - Step 45500 -- 📈 Saving Learning Dynamics
2025-08-30 05:55:17 - pico-train - INFO - Step 45525 -- 🔄 Training Metrics
2025-08-30 05:55:17 - pico-train - INFO - ├── Loss: 6.0444
2025-08-30 05:55:17 - pico-train - INFO - ├── Learning Rate: 3.21e-05
2025-08-30 05:55:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:55:29 - pico-train - INFO - Step 45550 -- 🔄 Training Metrics
2025-08-30 05:55:29 - pico-train - INFO - ├── Loss: 6.0119
2025-08-30 05:55:29 - pico-train - INFO - ├── Learning Rate: 3.21e-05
2025-08-30 05:55:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:55:42 - pico-train - INFO - Step 45575 -- 🔄 Training Metrics
2025-08-30 05:55:42 - pico-train - INFO - ├── Loss: 6.0627
2025-08-30 05:55:42 - pico-train - INFO - ├── Learning Rate: 3.21e-05
2025-08-30 05:55:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:55:55 - pico-train - INFO - Step 45600 -- 🔄 Training Metrics
2025-08-30 05:55:55 - pico-train - INFO - ├── Loss: 5.9389
2025-08-30 05:55:55 - pico-train - INFO - ├── Learning Rate: 3.21e-05
2025-08-30 05:55:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:56:07 - pico-train - INFO - Step 45625 -- 🔄 Training Metrics
2025-08-30 05:56:07 - pico-train - INFO - ├── Loss: 6.1041
2025-08-30 05:56:07 - pico-train - INFO - ├── Learning Rate: 3.21e-05
2025-08-30 05:56:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:56:20 - pico-train - INFO - Step 45650 -- 🔄 Training Metrics
2025-08-30 05:56:20 - pico-train - INFO - ├── Loss: 6.0837
2025-08-30 05:56:20 - pico-train - INFO - ├── Learning Rate: 3.20e-05
2025-08-30 05:56:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:56:33 - pico-train - INFO - Step 45675 -- 🔄 Training Metrics
2025-08-30 05:56:33 - pico-train - INFO - ├── Loss: 6.0495
2025-08-30 05:56:33 - pico-train - INFO - ├── Learning Rate: 3.20e-05
2025-08-30 05:56:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:56:45 - pico-train - INFO - Step 45700 -- 🔄 Training Metrics
2025-08-30 05:56:45 - pico-train - INFO - ├── Loss: 6.0507
2025-08-30 05:56:45 - pico-train - INFO - ├── Learning Rate: 3.20e-05
2025-08-30 05:56:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:56:58 - pico-train - INFO - Step 45725 -- 🔄 Training Metrics
2025-08-30 05:56:58 - pico-train - INFO - ├── Loss: 6.0594
2025-08-30 05:56:58 - pico-train - INFO - ├── Learning Rate: 3.20e-05
2025-08-30 05:56:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:57:11 - pico-train - INFO - Step 45750 -- 🔄 Training Metrics
2025-08-30 05:57:11 - pico-train - INFO - ├── Loss: 6.0685
2025-08-30 05:57:11 - pico-train - INFO - ├── Learning Rate: 3.20e-05
2025-08-30 05:57:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:57:23 - pico-train - INFO - Step 45775 -- 🔄 Training Metrics
2025-08-30 05:57:23 - pico-train - INFO - ├── Loss: 6.0040
2025-08-30 05:57:23 - pico-train - INFO - ├── Learning Rate: 3.19e-05
2025-08-30 05:57:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:57:36 - pico-train - INFO - Step 45800 -- 🔄 Training Metrics
2025-08-30 05:57:36 - pico-train - INFO - ├── Loss: 6.0630
2025-08-30 05:57:36 - pico-train - INFO - ├── Learning Rate: 3.19e-05
2025-08-30 05:57:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:57:48 - pico-train - INFO - Step 45825 -- 🔄 Training Metrics
2025-08-30 05:57:48 - pico-train - INFO - ├── Loss: 6.0334
2025-08-30 05:57:48 - pico-train - INFO - ├── Learning Rate: 3.19e-05
2025-08-30 05:57:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:58:01 - pico-train - INFO - Step 45850 -- 🔄 Training Metrics
2025-08-30 05:58:01 - pico-train - INFO - ├── Loss: 6.0141
2025-08-30 05:58:01 - pico-train - INFO - ├── Learning Rate: 3.19e-05
2025-08-30 05:58:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:58:14 - pico-train - INFO - Step 45875 -- 🔄 Training Metrics
2025-08-30 05:58:14 - pico-train - INFO - ├── Loss: 6.0175
2025-08-30 05:58:14 - pico-train - INFO - ├── Learning Rate: 3.18e-05
2025-08-30 05:58:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:58:27 - pico-train - INFO - Step 45900 -- 🔄 Training Metrics
2025-08-30 05:58:27 - pico-train - INFO - ├── Loss: 6.0745
2025-08-30 05:58:27 - pico-train - INFO - ├── Learning Rate: 3.18e-05
2025-08-30 05:58:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:58:39 - pico-train - INFO - Step 45925 -- 🔄 Training Metrics
2025-08-30 05:58:39 - pico-train - INFO - ├── Loss: 6.0172
2025-08-30 05:58:39 - pico-train - INFO - ├── Learning Rate: 3.18e-05
2025-08-30 05:58:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:58:52 - pico-train - INFO - Step 45950 -- 🔄 Training Metrics
2025-08-30 05:58:52 - pico-train - INFO - ├── Loss: 5.9627
2025-08-30 05:58:52 - pico-train - INFO - ├── Learning Rate: 3.18e-05
2025-08-30 05:58:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:59:04 - pico-train - INFO - Step 45975 -- 🔄 Training Metrics
2025-08-30 05:59:04 - pico-train - INFO - ├── Loss: 5.9906
2025-08-30 05:59:04 - pico-train - INFO - ├── Learning Rate: 3.18e-05
2025-08-30 05:59:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 05:59:17 - pico-train - INFO - Step 46000 -- 💾 Saving Checkpoint
2025-08-30 06:01:12 - pico-train - INFO - Step 46000 -- 📊 Evaluation Results
2025-08-30 06:01:12 - pico-train - INFO - └── paloma: 2.287805203128674e+28
2025-08-30 06:01:14 - pico-train - INFO - Step 46000 -- 🔄 Training Metrics
2025-08-30 06:01:14 - pico-train - INFO - ├── Loss: 6.0973
2025-08-30 06:01:14 - pico-train - INFO - ├── Learning Rate: 3.17e-05
2025-08-30 06:01:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:01:14 - pico-train - INFO - Step 46000 -- 📈 Saving Learning Dynamics
2025-08-30 06:01:29 - pico-train - INFO - Step 46025 -- 🔄 Training Metrics
2025-08-30 06:01:29 - pico-train - INFO - ├── Loss: 5.9999
2025-08-30 06:01:29 - pico-train - INFO - ├── Learning Rate: 3.17e-05
2025-08-30 06:01:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:01:41 - pico-train - INFO - Step 46050 -- 🔄 Training Metrics
2025-08-30 06:01:41 - pico-train - INFO - ├── Loss: 5.9786
2025-08-30 06:01:41 - pico-train - INFO - ├── Learning Rate: 3.17e-05
2025-08-30 06:01:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:01:54 - pico-train - INFO - Step 46075 -- 🔄 Training Metrics
2025-08-30 06:01:54 - pico-train - INFO - ├── Loss: 6.0511
2025-08-30 06:01:54 - pico-train - INFO - ├── Learning Rate: 3.17e-05
2025-08-30 06:01:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:02:06 - pico-train - INFO - Step 46100 -- 🔄 Training Metrics
2025-08-30 06:02:06 - pico-train - INFO - ├── Loss: 5.9915
2025-08-30 06:02:06 - pico-train - INFO - ├── Learning Rate: 3.17e-05
2025-08-30 06:02:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:02:19 - pico-train - INFO - Step 46125 -- 🔄 Training Metrics
2025-08-30 06:02:19 - pico-train - INFO - ├── Loss: 6.0164
2025-08-30 06:02:19 - pico-train - INFO - ├── Learning Rate: 3.16e-05
2025-08-30 06:02:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:02:32 - pico-train - INFO - Step 46150 -- 🔄 Training Metrics
2025-08-30 06:02:32 - pico-train - INFO - ├── Loss: 6.0278
2025-08-30 06:02:32 - pico-train - INFO - ├── Learning Rate: 3.16e-05
2025-08-30 06:02:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:02:45 - pico-train - INFO - Step 46175 -- 🔄 Training Metrics
2025-08-30 06:02:45 - pico-train - INFO - ├── Loss: 5.9636
2025-08-30 06:02:45 - pico-train - INFO - ├── Learning Rate: 3.16e-05
2025-08-30 06:02:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:02:57 - pico-train - INFO - Step 46200 -- 🔄 Training Metrics
2025-08-30 06:02:57 - pico-train - INFO - ├── Loss: 5.9233
2025-08-30 06:02:57 - pico-train - INFO - ├── Learning Rate: 3.16e-05
2025-08-30 06:02:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:03:10 - pico-train - INFO - Step 46225 -- 🔄 Training Metrics
2025-08-30 06:03:10 - pico-train - INFO - ├── Loss: 6.1381
2025-08-30 06:03:10 - pico-train - INFO - ├── Learning Rate: 3.16e-05
2025-08-30 06:03:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:03:23 - pico-train - INFO - Step 46250 -- 🔄 Training Metrics
2025-08-30 06:03:23 - pico-train - INFO - ├── Loss: 5.9423
2025-08-30 06:03:23 - pico-train - INFO - ├── Learning Rate: 3.15e-05
2025-08-30 06:03:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:03:36 - pico-train - INFO - Step 46275 -- 🔄 Training Metrics
2025-08-30 06:03:36 - pico-train - INFO - ├── Loss: 5.9885
2025-08-30 06:03:36 - pico-train - INFO - ├── Learning Rate: 3.15e-05
2025-08-30 06:03:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:03:48 - pico-train - INFO - Step 46300 -- 🔄 Training Metrics
2025-08-30 06:03:48 - pico-train - INFO - ├── Loss: 6.0572
2025-08-30 06:03:48 - pico-train - INFO - ├── Learning Rate: 3.15e-05
2025-08-30 06:03:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:04:01 - pico-train - INFO - Step 46325 -- 🔄 Training Metrics
2025-08-30 06:04:01 - pico-train - INFO - ├── Loss: 6.0765
2025-08-30 06:04:01 - pico-train - INFO - ├── Learning Rate: 3.15e-05
2025-08-30 06:04:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:04:13 - pico-train - INFO - Step 46350 -- 🔄 Training Metrics
2025-08-30 06:04:13 - pico-train - INFO - ├── Loss: 6.0594
2025-08-30 06:04:13 - pico-train - INFO - ├── Learning Rate: 3.15e-05
2025-08-30 06:04:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:04:26 - pico-train - INFO - Step 46375 -- 🔄 Training Metrics
2025-08-30 06:04:26 - pico-train - INFO - ├── Loss: 6.0579
2025-08-30 06:04:26 - pico-train - INFO - ├── Learning Rate: 3.14e-05
2025-08-30 06:04:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:04:39 - pico-train - INFO - Step 46400 -- 🔄 Training Metrics
2025-08-30 06:04:39 - pico-train - INFO - ├── Loss: 5.9964
2025-08-30 06:04:39 - pico-train - INFO - ├── Learning Rate: 3.14e-05
2025-08-30 06:04:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:04:51 - pico-train - INFO - Step 46425 -- 🔄 Training Metrics
2025-08-30 06:04:51 - pico-train - INFO - ├── Loss: 6.0002
2025-08-30 06:04:51 - pico-train - INFO - ├── Learning Rate: 3.14e-05
2025-08-30 06:04:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:05:04 - pico-train - INFO - Step 46450 -- 🔄 Training Metrics
2025-08-30 06:05:04 - pico-train - INFO - ├── Loss: 6.0970
2025-08-30 06:05:04 - pico-train - INFO - ├── Learning Rate: 3.14e-05
2025-08-30 06:05:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:05:17 - pico-train - INFO - Step 46475 -- 🔄 Training Metrics
2025-08-30 06:05:17 - pico-train - INFO - ├── Loss: 5.9791
2025-08-30 06:05:17 - pico-train - INFO - ├── Learning Rate: 3.14e-05
2025-08-30 06:05:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:05:29 - pico-train - INFO - Step 46500 -- 💾 Saving Checkpoint
2025-08-30 06:07:22 - pico-train - INFO - Step 46500 -- 📊 Evaluation Results
2025-08-30 06:07:22 - pico-train - INFO - └── paloma: 2.5264771857772e+28
2025-08-30 06:07:35 - pico-train - INFO - Step 46500 -- 🔄 Training Metrics
2025-08-30 06:07:35 - pico-train - INFO - ├── Loss: 5.9970
2025-08-30 06:07:35 - pico-train - INFO - ├── Learning Rate: 3.13e-05
2025-08-30 06:07:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:07:35 - pico-train - INFO - Step 46500 -- 📈 Saving Learning Dynamics
2025-08-30 06:07:50 - pico-train - INFO - Step 46525 -- 🔄 Training Metrics
2025-08-30 06:07:50 - pico-train - INFO - ├── Loss: 5.9723
2025-08-30 06:07:50 - pico-train - INFO - ├── Learning Rate: 3.13e-05
2025-08-30 06:07:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:08:02 - pico-train - INFO - Step 46550 -- 🔄 Training Metrics
2025-08-30 06:08:02 - pico-train - INFO - ├── Loss: 5.9671
2025-08-30 06:08:02 - pico-train - INFO - ├── Learning Rate: 3.13e-05
2025-08-30 06:08:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:08:15 - pico-train - INFO - Step 46575 -- 🔄 Training Metrics
2025-08-30 06:08:15 - pico-train - INFO - ├── Loss: 5.9461
2025-08-30 06:08:15 - pico-train - INFO - ├── Learning Rate: 3.13e-05
2025-08-30 06:08:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:08:27 - pico-train - INFO - Step 46600 -- 🔄 Training Metrics
2025-08-30 06:08:27 - pico-train - INFO - ├── Loss: 6.0239
2025-08-30 06:08:27 - pico-train - INFO - ├── Learning Rate: 3.13e-05
2025-08-30 06:08:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:08:40 - pico-train - INFO - Step 46625 -- 🔄 Training Metrics
2025-08-30 06:08:40 - pico-train - INFO - ├── Loss: 6.0496
2025-08-30 06:08:40 - pico-train - INFO - ├── Learning Rate: 3.12e-05
2025-08-30 06:08:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:08:53 - pico-train - INFO - Step 46650 -- 🔄 Training Metrics
2025-08-30 06:08:53 - pico-train - INFO - ├── Loss: 5.9859
2025-08-30 06:08:53 - pico-train - INFO - ├── Learning Rate: 3.12e-05
2025-08-30 06:08:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:09:05 - pico-train - INFO - Step 46675 -- 🔄 Training Metrics
2025-08-30 06:09:05 - pico-train - INFO - ├── Loss: 6.0529
2025-08-30 06:09:05 - pico-train - INFO - ├── Learning Rate: 3.12e-05
2025-08-30 06:09:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:09:18 - pico-train - INFO - Step 46700 -- 🔄 Training Metrics
2025-08-30 06:09:18 - pico-train - INFO - ├── Loss: 6.0469
2025-08-30 06:09:18 - pico-train - INFO - ├── Learning Rate: 3.12e-05
2025-08-30 06:09:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:09:31 - pico-train - INFO - Step 46725 -- 🔄 Training Metrics
2025-08-30 06:09:31 - pico-train - INFO - ├── Loss: 6.0152
2025-08-30 06:09:31 - pico-train - INFO - ├── Learning Rate: 3.11e-05
2025-08-30 06:09:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:09:43 - pico-train - INFO - Step 46750 -- 🔄 Training Metrics
2025-08-30 06:09:43 - pico-train - INFO - ├── Loss: 6.0636
2025-08-30 06:09:43 - pico-train - INFO - ├── Learning Rate: 3.11e-05
2025-08-30 06:09:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:09:56 - pico-train - INFO - Step 46775 -- 🔄 Training Metrics
2025-08-30 06:09:56 - pico-train - INFO - ├── Loss: 6.0503
2025-08-30 06:09:56 - pico-train - INFO - ├── Learning Rate: 3.11e-05
2025-08-30 06:09:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:10:08 - pico-train - INFO - Step 46800 -- 🔄 Training Metrics
2025-08-30 06:10:08 - pico-train - INFO - ├── Loss: 6.0151
2025-08-30 06:10:08 - pico-train - INFO - ├── Learning Rate: 3.11e-05
2025-08-30 06:10:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:10:21 - pico-train - INFO - Step 46825 -- 🔄 Training Metrics
2025-08-30 06:10:21 - pico-train - INFO - ├── Loss: 5.9617
2025-08-30 06:10:21 - pico-train - INFO - ├── Learning Rate: 3.11e-05
2025-08-30 06:10:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:10:33 - pico-train - INFO - Step 46850 -- 🔄 Training Metrics
2025-08-30 06:10:33 - pico-train - INFO - ├── Loss: 5.9888
2025-08-30 06:10:33 - pico-train - INFO - ├── Learning Rate: 3.10e-05
2025-08-30 06:10:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:10:46 - pico-train - INFO - Step 46875 -- 🔄 Training Metrics
2025-08-30 06:10:46 - pico-train - INFO - ├── Loss: 5.9116
2025-08-30 06:10:46 - pico-train - INFO - ├── Learning Rate: 3.10e-05
2025-08-30 06:10:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:10:59 - pico-train - INFO - Step 46900 -- 🔄 Training Metrics
2025-08-30 06:10:59 - pico-train - INFO - ├── Loss: 6.0299
2025-08-30 06:10:59 - pico-train - INFO - ├── Learning Rate: 3.10e-05
2025-08-30 06:10:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:11:11 - pico-train - INFO - Step 46925 -- 🔄 Training Metrics
2025-08-30 06:11:11 - pico-train - INFO - ├── Loss: 5.9876
2025-08-30 06:11:11 - pico-train - INFO - ├── Learning Rate: 3.10e-05
2025-08-30 06:11:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:11:24 - pico-train - INFO - Step 46950 -- 🔄 Training Metrics
2025-08-30 06:11:24 - pico-train - INFO - ├── Loss: 6.0462
2025-08-30 06:11:24 - pico-train - INFO - ├── Learning Rate: 3.10e-05
2025-08-30 06:11:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:11:36 - pico-train - INFO - Step 46975 -- 🔄 Training Metrics
2025-08-30 06:11:36 - pico-train - INFO - ├── Loss: 6.0083
2025-08-30 06:11:36 - pico-train - INFO - ├── Learning Rate: 3.09e-05
2025-08-30 06:11:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:11:48 - pico-train - INFO - Step 47000 -- 💾 Saving Checkpoint
2025-08-30 06:13:47 - pico-train - INFO - Step 47000 -- 📊 Evaluation Results
2025-08-30 06:13:47 - pico-train - INFO - └── paloma: 3.374744437022473e+28
2025-08-30 06:13:49 - pico-train - INFO - Step 47000 -- 🔄 Training Metrics
2025-08-30 06:13:49 - pico-train - INFO - ├── Loss: 6.0269
2025-08-30 06:13:49 - pico-train - INFO - ├── Learning Rate: 3.09e-05
2025-08-30 06:13:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:13:49 - pico-train - INFO - Step 47000 -- 📈 Saving Learning Dynamics
2025-08-30 06:14:04 - pico-train - INFO - Step 47025 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:14:04 - pico-train - INFO - โ”œโ”€โ”€ Loss: 6.0510
2025-08-30 06:14:04 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 3.09e-05
2025-08-30 06:14:04 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:14:17 - pico-train - INFO - Step 47050 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:14:17 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.9631
2025-08-30 06:14:17 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 3.09e-05
2025-08-30 06:14:17 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 06:14:29 - pico-train - INFO - Step 47075 -- ๐Ÿ”„ Training Metrics
2025-08-30 06:14:29 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.9767
2025-08-30 06:14:29 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 3.09e-05
2025-08-30 06:14:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:14:42 - pico-train - INFO - Step 47100 -- 🔄 Training Metrics
2025-08-30 06:14:42 - pico-train - INFO - ├── Loss: 6.0403
2025-08-30 06:14:42 - pico-train - INFO - ├── Learning Rate: 3.08e-05
2025-08-30 06:14:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:14:55 - pico-train - INFO - Step 47125 -- 🔄 Training Metrics
2025-08-30 06:14:55 - pico-train - INFO - ├── Loss: 6.0179
2025-08-30 06:14:55 - pico-train - INFO - ├── Learning Rate: 3.08e-05
2025-08-30 06:14:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:15:08 - pico-train - INFO - Step 47150 -- 🔄 Training Metrics
2025-08-30 06:15:08 - pico-train - INFO - ├── Loss: 6.0036
2025-08-30 06:15:08 - pico-train - INFO - ├── Learning Rate: 3.08e-05
2025-08-30 06:15:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:15:21 - pico-train - INFO - Step 47175 -- 🔄 Training Metrics
2025-08-30 06:15:21 - pico-train - INFO - ├── Loss: 6.0186
2025-08-30 06:15:21 - pico-train - INFO - ├── Learning Rate: 3.08e-05
2025-08-30 06:15:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:15:33 - pico-train - INFO - Step 47200 -- 🔄 Training Metrics
2025-08-30 06:15:33 - pico-train - INFO - ├── Loss: 5.9299
2025-08-30 06:15:33 - pico-train - INFO - ├── Learning Rate: 3.08e-05
2025-08-30 06:15:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:15:46 - pico-train - INFO - Step 47225 -- 🔄 Training Metrics
2025-08-30 06:15:46 - pico-train - INFO - ├── Loss: 6.1006
2025-08-30 06:15:46 - pico-train - INFO - ├── Learning Rate: 3.07e-05
2025-08-30 06:15:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:15:58 - pico-train - INFO - Step 47250 -- 🔄 Training Metrics
2025-08-30 06:15:58 - pico-train - INFO - ├── Loss: 5.9586
2025-08-30 06:15:58 - pico-train - INFO - ├── Learning Rate: 3.07e-05
2025-08-30 06:15:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:16:11 - pico-train - INFO - Step 47275 -- 🔄 Training Metrics
2025-08-30 06:16:11 - pico-train - INFO - ├── Loss: 6.0152
2025-08-30 06:16:11 - pico-train - INFO - ├── Learning Rate: 3.07e-05
2025-08-30 06:16:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:16:24 - pico-train - INFO - Step 47300 -- 🔄 Training Metrics
2025-08-30 06:16:24 - pico-train - INFO - ├── Loss: 5.9418
2025-08-30 06:16:24 - pico-train - INFO - ├── Learning Rate: 3.07e-05
2025-08-30 06:16:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:16:36 - pico-train - INFO - Step 47325 -- 🔄 Training Metrics
2025-08-30 06:16:36 - pico-train - INFO - ├── Loss: 5.9040
2025-08-30 06:16:36 - pico-train - INFO - ├── Learning Rate: 3.06e-05
2025-08-30 06:16:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:16:49 - pico-train - INFO - Step 47350 -- 🔄 Training Metrics
2025-08-30 06:16:49 - pico-train - INFO - ├── Loss: 6.0085
2025-08-30 06:16:49 - pico-train - INFO - ├── Learning Rate: 3.06e-05
2025-08-30 06:16:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:17:01 - pico-train - INFO - Step 47375 -- 🔄 Training Metrics
2025-08-30 06:17:01 - pico-train - INFO - ├── Loss: 5.9546
2025-08-30 06:17:01 - pico-train - INFO - ├── Learning Rate: 3.06e-05
2025-08-30 06:17:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:17:14 - pico-train - INFO - Step 47400 -- 🔄 Training Metrics
2025-08-30 06:17:14 - pico-train - INFO - ├── Loss: 6.0002
2025-08-30 06:17:14 - pico-train - INFO - ├── Learning Rate: 3.06e-05
2025-08-30 06:17:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:17:26 - pico-train - INFO - Step 47425 -- 🔄 Training Metrics
2025-08-30 06:17:26 - pico-train - INFO - ├── Loss: 5.9671
2025-08-30 06:17:26 - pico-train - INFO - ├── Learning Rate: 3.06e-05
2025-08-30 06:17:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:17:39 - pico-train - INFO - Step 47450 -- 🔄 Training Metrics
2025-08-30 06:17:39 - pico-train - INFO - ├── Loss: 5.9857
2025-08-30 06:17:39 - pico-train - INFO - ├── Learning Rate: 3.05e-05
2025-08-30 06:17:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:17:52 - pico-train - INFO - Step 47475 -- 🔄 Training Metrics
2025-08-30 06:17:52 - pico-train - INFO - ├── Loss: 6.0252
2025-08-30 06:17:52 - pico-train - INFO - ├── Learning Rate: 3.05e-05
2025-08-30 06:17:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:18:04 - pico-train - INFO - Step 47500 -- 💾 Saving Checkpoint
2025-08-30 06:19:56 - pico-train - INFO - Step 47500 -- 📊 Evaluation Results
2025-08-30 06:19:56 - pico-train - INFO - └── paloma: 6.3085366283161405e+28
2025-08-30 06:19:57 - pico-train - INFO - Step 47500 -- 🔄 Training Metrics
2025-08-30 06:19:57 - pico-train - INFO - ├── Loss: 6.0560
2025-08-30 06:19:57 - pico-train - INFO - ├── Learning Rate: 3.05e-05
2025-08-30 06:19:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:19:57 - pico-train - INFO - Step 47500 -- 📈 Saving Learning Dynamics
2025-08-30 06:20:12 - pico-train - INFO - Step 47525 -- 🔄 Training Metrics
2025-08-30 06:20:12 - pico-train - INFO - ├── Loss: 5.9855
2025-08-30 06:20:12 - pico-train - INFO - ├── Learning Rate: 3.05e-05
2025-08-30 06:20:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:20:25 - pico-train - INFO - Step 47550 -- 🔄 Training Metrics
2025-08-30 06:20:25 - pico-train - INFO - ├── Loss: 5.9577
2025-08-30 06:20:25 - pico-train - INFO - ├── Learning Rate: 3.05e-05
2025-08-30 06:20:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:20:37 - pico-train - INFO - Step 47575 -- 🔄 Training Metrics
2025-08-30 06:20:37 - pico-train - INFO - ├── Loss: 6.0061
2025-08-30 06:20:37 - pico-train - INFO - ├── Learning Rate: 3.04e-05
2025-08-30 06:20:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:20:50 - pico-train - INFO - Step 47600 -- 🔄 Training Metrics
2025-08-30 06:20:50 - pico-train - INFO - ├── Loss: 5.9977
2025-08-30 06:20:50 - pico-train - INFO - ├── Learning Rate: 3.04e-05
2025-08-30 06:20:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:21:03 - pico-train - INFO - Step 47625 -- 🔄 Training Metrics
2025-08-30 06:21:03 - pico-train - INFO - ├── Loss: 5.9507
2025-08-30 06:21:03 - pico-train - INFO - ├── Learning Rate: 3.04e-05
2025-08-30 06:21:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:21:15 - pico-train - INFO - Step 47650 -- 🔄 Training Metrics
2025-08-30 06:21:15 - pico-train - INFO - ├── Loss: 5.9363
2025-08-30 06:21:15 - pico-train - INFO - ├── Learning Rate: 3.04e-05
2025-08-30 06:21:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:21:28 - pico-train - INFO - Step 47675 -- 🔄 Training Metrics
2025-08-30 06:21:28 - pico-train - INFO - ├── Loss: 6.0677
2025-08-30 06:21:28 - pico-train - INFO - ├── Learning Rate: 3.04e-05
2025-08-30 06:21:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:21:41 - pico-train - INFO - Step 47700 -- 🔄 Training Metrics
2025-08-30 06:21:41 - pico-train - INFO - ├── Loss: 6.0777
2025-08-30 06:21:41 - pico-train - INFO - ├── Learning Rate: 3.03e-05
2025-08-30 06:21:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:21:53 - pico-train - INFO - Step 47725 -- 🔄 Training Metrics
2025-08-30 06:21:53 - pico-train - INFO - ├── Loss: 5.9203
2025-08-30 06:21:53 - pico-train - INFO - ├── Learning Rate: 3.03e-05
2025-08-30 06:21:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:22:06 - pico-train - INFO - Step 47750 -- 🔄 Training Metrics
2025-08-30 06:22:06 - pico-train - INFO - ├── Loss: 6.0014
2025-08-30 06:22:06 - pico-train - INFO - ├── Learning Rate: 3.03e-05
2025-08-30 06:22:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:22:19 - pico-train - INFO - Step 47775 -- 🔄 Training Metrics
2025-08-30 06:22:19 - pico-train - INFO - ├── Loss: 5.9680
2025-08-30 06:22:19 - pico-train - INFO - ├── Learning Rate: 3.03e-05
2025-08-30 06:22:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:22:31 - pico-train - INFO - Step 47800 -- 🔄 Training Metrics
2025-08-30 06:22:31 - pico-train - INFO - ├── Loss: 6.0516
2025-08-30 06:22:31 - pico-train - INFO - ├── Learning Rate: 3.03e-05
2025-08-30 06:22:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:22:44 - pico-train - INFO - Step 47825 -- 🔄 Training Metrics
2025-08-30 06:22:44 - pico-train - INFO - ├── Loss: 6.0163
2025-08-30 06:22:44 - pico-train - INFO - ├── Learning Rate: 3.02e-05
2025-08-30 06:22:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:22:56 - pico-train - INFO - Step 47850 -- 🔄 Training Metrics
2025-08-30 06:22:56 - pico-train - INFO - ├── Loss: 6.0132
2025-08-30 06:22:56 - pico-train - INFO - ├── Learning Rate: 3.02e-05
2025-08-30 06:22:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:23:09 - pico-train - INFO - Step 47875 -- 🔄 Training Metrics
2025-08-30 06:23:09 - pico-train - INFO - ├── Loss: 5.9571
2025-08-30 06:23:09 - pico-train - INFO - ├── Learning Rate: 3.02e-05
2025-08-30 06:23:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:23:22 - pico-train - INFO - Step 47900 -- 🔄 Training Metrics
2025-08-30 06:23:22 - pico-train - INFO - ├── Loss: 5.9390
2025-08-30 06:23:22 - pico-train - INFO - ├── Learning Rate: 3.02e-05
2025-08-30 06:23:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:23:34 - pico-train - INFO - Step 47925 -- 🔄 Training Metrics
2025-08-30 06:23:34 - pico-train - INFO - ├── Loss: 5.9870
2025-08-30 06:23:34 - pico-train - INFO - ├── Learning Rate: 3.01e-05
2025-08-30 06:23:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:23:47 - pico-train - INFO - Step 47950 -- 🔄 Training Metrics
2025-08-30 06:23:47 - pico-train - INFO - ├── Loss: 5.9717
2025-08-30 06:23:47 - pico-train - INFO - ├── Learning Rate: 3.01e-05
2025-08-30 06:23:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:23:59 - pico-train - INFO - Step 47975 -- 🔄 Training Metrics
2025-08-30 06:23:59 - pico-train - INFO - ├── Loss: 6.0558
2025-08-30 06:23:59 - pico-train - INFO - ├── Learning Rate: 3.01e-05
2025-08-30 06:23:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:24:12 - pico-train - INFO - Step 48000 -- 💾 Saving Checkpoint
2025-08-30 06:26:04 - pico-train - INFO - Step 48000 -- 📊 Evaluation Results
2025-08-30 06:26:04 - pico-train - INFO - └── paloma: 6.49975478431273e+28
2025-08-30 06:26:06 - pico-train - INFO - Step 48000 -- 🔄 Training Metrics
2025-08-30 06:26:06 - pico-train - INFO - ├── Loss: 6.0808
2025-08-30 06:26:06 - pico-train - INFO - ├── Learning Rate: 3.01e-05
2025-08-30 06:26:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:26:06 - pico-train - INFO - Step 48000 -- 📈 Saving Learning Dynamics
2025-08-30 06:26:20 - pico-train - INFO - Step 48025 -- 🔄 Training Metrics
2025-08-30 06:26:20 - pico-train - INFO - ├── Loss: 6.0001
2025-08-30 06:26:20 - pico-train - INFO - ├── Learning Rate: 3.01e-05
2025-08-30 06:26:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:26:33 - pico-train - INFO - Step 48050 -- 🔄 Training Metrics
2025-08-30 06:26:33 - pico-train - INFO - ├── Loss: 6.0349
2025-08-30 06:26:33 - pico-train - INFO - ├── Learning Rate: 3.00e-05
2025-08-30 06:26:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:26:46 - pico-train - INFO - Step 48075 -- 🔄 Training Metrics
2025-08-30 06:26:46 - pico-train - INFO - ├── Loss: 5.9524
2025-08-30 06:26:46 - pico-train - INFO - ├── Learning Rate: 3.00e-05
2025-08-30 06:26:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:26:58 - pico-train - INFO - Step 48100 -- 🔄 Training Metrics
2025-08-30 06:26:58 - pico-train - INFO - ├── Loss: 5.9626
2025-08-30 06:26:58 - pico-train - INFO - ├── Learning Rate: 3.00e-05
2025-08-30 06:26:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:27:11 - pico-train - INFO - Step 48125 -- 🔄 Training Metrics
2025-08-30 06:27:11 - pico-train - INFO - ├── Loss: 6.0514
2025-08-30 06:27:11 - pico-train - INFO - ├── Learning Rate: 3.00e-05
2025-08-30 06:27:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:27:24 - pico-train - INFO - Step 48150 -- 🔄 Training Metrics
2025-08-30 06:27:24 - pico-train - INFO - ├── Loss: 6.0687
2025-08-30 06:27:24 - pico-train - INFO - ├── Learning Rate: 3.00e-05
2025-08-30 06:27:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:27:36 - pico-train - INFO - Step 48175 -- 🔄 Training Metrics
2025-08-30 06:27:36 - pico-train - INFO - ├── Loss: 6.0928
2025-08-30 06:27:36 - pico-train - INFO - ├── Learning Rate: 2.99e-05
2025-08-30 06:27:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:27:49 - pico-train - INFO - Step 48200 -- 🔄 Training Metrics
2025-08-30 06:27:49 - pico-train - INFO - ├── Loss: 5.9182
2025-08-30 06:27:49 - pico-train - INFO - ├── Learning Rate: 2.99e-05
2025-08-30 06:27:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:28:02 - pico-train - INFO - Step 48225 -- 🔄 Training Metrics
2025-08-30 06:28:02 - pico-train - INFO - ├── Loss: 5.9677
2025-08-30 06:28:02 - pico-train - INFO - ├── Learning Rate: 2.99e-05
2025-08-30 06:28:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:28:14 - pico-train - INFO - Step 48250 -- 🔄 Training Metrics
2025-08-30 06:28:14 - pico-train - INFO - ├── Loss: 6.0330
2025-08-30 06:28:14 - pico-train - INFO - ├── Learning Rate: 2.99e-05
2025-08-30 06:28:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:28:27 - pico-train - INFO - Step 48275 -- 🔄 Training Metrics
2025-08-30 06:28:27 - pico-train - INFO - ├── Loss: 6.0136
2025-08-30 06:28:27 - pico-train - INFO - ├── Learning Rate: 2.99e-05
2025-08-30 06:28:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:28:39 - pico-train - INFO - Step 48300 -- 🔄 Training Metrics
2025-08-30 06:28:39 - pico-train - INFO - ├── Loss: 6.0606
2025-08-30 06:28:39 - pico-train - INFO - ├── Learning Rate: 2.98e-05
2025-08-30 06:28:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:28:52 - pico-train - INFO - Step 48325 -- 🔄 Training Metrics
2025-08-30 06:28:52 - pico-train - INFO - ├── Loss: 5.9799
2025-08-30 06:28:52 - pico-train - INFO - ├── Learning Rate: 2.98e-05
2025-08-30 06:28:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:29:04 - pico-train - INFO - Step 48350 -- 🔄 Training Metrics
2025-08-30 06:29:04 - pico-train - INFO - ├── Loss: 5.9201
2025-08-30 06:29:04 - pico-train - INFO - ├── Learning Rate: 2.98e-05
2025-08-30 06:29:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:29:17 - pico-train - INFO - Step 48375 -- 🔄 Training Metrics
2025-08-30 06:29:17 - pico-train - INFO - ├── Loss: 6.0589
2025-08-30 06:29:17 - pico-train - INFO - ├── Learning Rate: 2.98e-05
2025-08-30 06:29:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:29:30 - pico-train - INFO - Step 48400 -- 🔄 Training Metrics
2025-08-30 06:29:30 - pico-train - INFO - ├── Loss: 5.9895
2025-08-30 06:29:30 - pico-train - INFO - ├── Learning Rate: 2.98e-05
2025-08-30 06:29:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:29:42 - pico-train - INFO - Step 48425 -- 🔄 Training Metrics
2025-08-30 06:29:42 - pico-train - INFO - ├── Loss: 6.0389
2025-08-30 06:29:42 - pico-train - INFO - ├── Learning Rate: 2.97e-05
2025-08-30 06:29:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:29:55 - pico-train - INFO - Step 48450 -- 🔄 Training Metrics
2025-08-30 06:29:55 - pico-train - INFO - ├── Loss: 5.9910
2025-08-30 06:29:55 - pico-train - INFO - ├── Learning Rate: 2.97e-05
2025-08-30 06:29:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:30:07 - pico-train - INFO - Step 48475 -- 🔄 Training Metrics
2025-08-30 06:30:07 - pico-train - INFO - ├── Loss: 6.0012
2025-08-30 06:30:07 - pico-train - INFO - ├── Learning Rate: 2.97e-05
2025-08-30 06:30:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:30:20 - pico-train - INFO - Step 48500 -- 💾 Saving Checkpoint
2025-08-30 06:32:12 - pico-train - INFO - Step 48500 -- 📊 Evaluation Results
2025-08-30 06:32:12 - pico-train - INFO - └── paloma: 7.468048914747141e+28
2025-08-30 06:32:15 - pico-train - INFO - Step 48500 -- 🔄 Training Metrics
2025-08-30 06:32:15 - pico-train - INFO - ├── Loss: 6.0047
2025-08-30 06:32:15 - pico-train - INFO - ├── Learning Rate: 2.97e-05
2025-08-30 06:32:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:32:15 - pico-train - INFO - Step 48500 -- 📈 Saving Learning Dynamics
2025-08-30 06:32:31 - pico-train - INFO - Step 48525 -- 🔄 Training Metrics
2025-08-30 06:32:31 - pico-train - INFO - ├── Loss: 5.9447
2025-08-30 06:32:31 - pico-train - INFO - ├── Learning Rate: 2.96e-05
2025-08-30 06:32:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:32:43 - pico-train - INFO - Step 48550 -- 🔄 Training Metrics
2025-08-30 06:32:43 - pico-train - INFO - ├── Loss: 5.9573
2025-08-30 06:32:43 - pico-train - INFO - ├── Learning Rate: 2.96e-05
2025-08-30 06:32:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:32:56 - pico-train - INFO - Step 48575 -- 🔄 Training Metrics
2025-08-30 06:32:56 - pico-train - INFO - ├── Loss: 5.9279
2025-08-30 06:32:56 - pico-train - INFO - ├── Learning Rate: 2.96e-05
2025-08-30 06:32:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:33:08 - pico-train - INFO - Step 48600 -- 🔄 Training Metrics
2025-08-30 06:33:08 - pico-train - INFO - ├── Loss: 6.0511
2025-08-30 06:33:08 - pico-train - INFO - ├── Learning Rate: 2.96e-05
2025-08-30 06:33:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:33:21 - pico-train - INFO - Step 48625 -- 🔄 Training Metrics
2025-08-30 06:33:21 - pico-train - INFO - ├── Loss: 5.9875
2025-08-30 06:33:21 - pico-train - INFO - ├── Learning Rate: 2.96e-05
2025-08-30 06:33:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:33:34 - pico-train - INFO - Step 48650 -- 🔄 Training Metrics
2025-08-30 06:33:34 - pico-train - INFO - ├── Loss: 5.9392
2025-08-30 06:33:34 - pico-train - INFO - ├── Learning Rate: 2.95e-05
2025-08-30 06:33:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:33:46 - pico-train - INFO - Step 48675 -- 🔄 Training Metrics
2025-08-30 06:33:46 - pico-train - INFO - ├── Loss: 5.9466
2025-08-30 06:33:46 - pico-train - INFO - ├── Learning Rate: 2.95e-05
2025-08-30 06:33:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:33:59 - pico-train - INFO - Step 48700 -- 🔄 Training Metrics
2025-08-30 06:33:59 - pico-train - INFO - ├── Loss: 6.0769
2025-08-30 06:33:59 - pico-train - INFO - ├── Learning Rate: 2.95e-05
2025-08-30 06:33:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:34:11 - pico-train - INFO - Step 48725 -- 🔄 Training Metrics
2025-08-30 06:34:11 - pico-train - INFO - ├── Loss: 5.8933
2025-08-30 06:34:11 - pico-train - INFO - ├── Learning Rate: 2.95e-05
2025-08-30 06:34:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:34:24 - pico-train - INFO - Step 48750 -- 🔄 Training Metrics
2025-08-30 06:34:24 - pico-train - INFO - ├── Loss: 5.9891
2025-08-30 06:34:24 - pico-train - INFO - ├── Learning Rate: 2.95e-05
2025-08-30 06:34:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:34:37 - pico-train - INFO - Step 48775 -- 🔄 Training Metrics
2025-08-30 06:34:37 - pico-train - INFO - ├── Loss: 5.9740
2025-08-30 06:34:37 - pico-train - INFO - ├── Learning Rate: 2.94e-05
2025-08-30 06:34:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:34:49 - pico-train - INFO - Step 48800 -- 🔄 Training Metrics
2025-08-30 06:34:49 - pico-train - INFO - ├── Loss: 5.9417
2025-08-30 06:34:49 - pico-train - INFO - ├── Learning Rate: 2.94e-05
2025-08-30 06:34:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:35:02 - pico-train - INFO - Step 48825 -- 🔄 Training Metrics
2025-08-30 06:35:02 - pico-train - INFO - ├── Loss: 5.9812
2025-08-30 06:35:02 - pico-train - INFO - ├── Learning Rate: 2.94e-05
2025-08-30 06:35:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:35:14 - pico-train - INFO - Step 48850 -- 🔄 Training Metrics
2025-08-30 06:35:14 - pico-train - INFO - ├── Loss: 5.9183
2025-08-30 06:35:14 - pico-train - INFO - ├── Learning Rate: 2.94e-05
2025-08-30 06:35:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:35:27 - pico-train - INFO - Step 48875 -- 🔄 Training Metrics
2025-08-30 06:35:27 - pico-train - INFO - ├── Loss: 5.8828
2025-08-30 06:35:27 - pico-train - INFO - ├── Learning Rate: 2.94e-05
2025-08-30 06:35:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:35:40 - pico-train - INFO - Step 48900 -- 🔄 Training Metrics
2025-08-30 06:35:40 - pico-train - INFO - ├── Loss: 6.0054
2025-08-30 06:35:40 - pico-train - INFO - ├── Learning Rate: 2.93e-05
2025-08-30 06:35:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:35:52 - pico-train - INFO - Step 48925 -- 🔄 Training Metrics
2025-08-30 06:35:52 - pico-train - INFO - ├── Loss: 5.9383
2025-08-30 06:35:52 - pico-train - INFO - ├── Learning Rate: 2.93e-05
2025-08-30 06:35:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:36:05 - pico-train - INFO - Step 48950 -- 🔄 Training Metrics
2025-08-30 06:36:05 - pico-train - INFO - ├── Loss: 5.9938
2025-08-30 06:36:05 - pico-train - INFO - ├── Learning Rate: 2.93e-05
2025-08-30 06:36:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:36:17 - pico-train - INFO - Step 48975 -- 🔄 Training Metrics
2025-08-30 06:36:17 - pico-train - INFO - ├── Loss: 6.0000
2025-08-30 06:36:17 - pico-train - INFO - ├── Learning Rate: 2.93e-05
2025-08-30 06:36:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:36:30 - pico-train - INFO - Step 49000 -- 💾 Saving Checkpoint
2025-08-30 06:38:29 - pico-train - INFO - Step 49000 -- 📊 Evaluation Results
2025-08-30 06:38:29 - pico-train - INFO - └── paloma: 1.0055567105609192e+29
2025-08-30 06:38:32 - pico-train - INFO - Step 49000 -- 🔄 Training Metrics
2025-08-30 06:38:32 - pico-train - INFO - ├── Loss: 5.9422
2025-08-30 06:38:32 - pico-train - INFO - ├── Learning Rate: 2.92e-05
2025-08-30 06:38:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:38:32 - pico-train - INFO - Step 49000 -- 📈 Saving Learning Dynamics
2025-08-30 06:38:47 - pico-train - INFO - Step 49025 -- 🔄 Training Metrics
2025-08-30 06:38:47 - pico-train - INFO - ├── Loss: 6.0063
2025-08-30 06:38:47 - pico-train - INFO - ├── Learning Rate: 2.92e-05
2025-08-30 06:38:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:39:00 - pico-train - INFO - Step 49050 -- 🔄 Training Metrics
2025-08-30 06:39:00 - pico-train - INFO - ├── Loss: 5.9355
2025-08-30 06:39:00 - pico-train - INFO - ├── Learning Rate: 2.92e-05
2025-08-30 06:39:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:39:12 - pico-train - INFO - Step 49075 -- 🔄 Training Metrics
2025-08-30 06:39:12 - pico-train - INFO - ├── Loss: 5.9666
2025-08-30 06:39:12 - pico-train - INFO - ├── Learning Rate: 2.92e-05
2025-08-30 06:39:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:39:25 - pico-train - INFO - Step 49100 -- 🔄 Training Metrics
2025-08-30 06:39:25 - pico-train - INFO - ├── Loss: 5.9422
2025-08-30 06:39:25 - pico-train - INFO - ├── Learning Rate: 2.92e-05
2025-08-30 06:39:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:39:38 - pico-train - INFO - Step 49125 -- 🔄 Training Metrics
2025-08-30 06:39:38 - pico-train - INFO - ├── Loss: 6.0065
2025-08-30 06:39:38 - pico-train - INFO - ├── Learning Rate: 2.91e-05
2025-08-30 06:39:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:39:51 - pico-train - INFO - Step 49150 -- 🔄 Training Metrics
2025-08-30 06:39:51 - pico-train - INFO - ├── Loss: 5.8978
2025-08-30 06:39:51 - pico-train - INFO - ├── Learning Rate: 2.91e-05
2025-08-30 06:39:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:40:04 - pico-train - INFO - Step 49175 -- 🔄 Training Metrics
2025-08-30 06:40:04 - pico-train - INFO - ├── Loss: 5.9054
2025-08-30 06:40:04 - pico-train - INFO - ├── Learning Rate: 2.91e-05
2025-08-30 06:40:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:40:16 - pico-train - INFO - Step 49200 -- 🔄 Training Metrics
2025-08-30 06:40:16 - pico-train - INFO - ├── Loss: 5.9853
2025-08-30 06:40:16 - pico-train - INFO - ├── Learning Rate: 2.91e-05
2025-08-30 06:40:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:40:29 - pico-train - INFO - Step 49225 -- 🔄 Training Metrics
2025-08-30 06:40:29 - pico-train - INFO - ├── Loss: 6.1100
2025-08-30 06:40:29 - pico-train - INFO - ├── Learning Rate: 2.91e-05
2025-08-30 06:40:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:40:42 - pico-train - INFO - Step 49250 -- 🔄 Training Metrics
2025-08-30 06:40:42 - pico-train - INFO - ├── Loss: 5.9674
2025-08-30 06:40:42 - pico-train - INFO - ├── Learning Rate: 2.90e-05
2025-08-30 06:40:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:40:54 - pico-train - INFO - Step 49275 -- 🔄 Training Metrics
2025-08-30 06:40:54 - pico-train - INFO - ├── Loss: 5.9658
2025-08-30 06:40:54 - pico-train - INFO - ├── Learning Rate: 2.90e-05
2025-08-30 06:40:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:41:07 - pico-train - INFO - Step 49300 -- 🔄 Training Metrics
2025-08-30 06:41:07 - pico-train - INFO - ├── Loss: 5.8764
2025-08-30 06:41:07 - pico-train - INFO - ├── Learning Rate: 2.90e-05
2025-08-30 06:41:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:41:19 - pico-train - INFO - Step 49325 -- 🔄 Training Metrics
2025-08-30 06:41:19 - pico-train - INFO - ├── Loss: 6.0664
2025-08-30 06:41:19 - pico-train - INFO - ├── Learning Rate: 2.90e-05
2025-08-30 06:41:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:41:32 - pico-train - INFO - Step 49350 -- 🔄 Training Metrics
2025-08-30 06:41:32 - pico-train - INFO - ├── Loss: 6.0158
2025-08-30 06:41:32 - pico-train - INFO - ├── Learning Rate: 2.90e-05
2025-08-30 06:41:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:41:44 - pico-train - INFO - Step 49375 -- 🔄 Training Metrics
2025-08-30 06:41:44 - pico-train - INFO - ├── Loss: 5.8884
2025-08-30 06:41:44 - pico-train - INFO - ├── Learning Rate: 2.89e-05
2025-08-30 06:41:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:41:57 - pico-train - INFO - Step 49400 -- 🔄 Training Metrics
2025-08-30 06:41:57 - pico-train - INFO - ├── Loss: 5.9176
2025-08-30 06:41:57 - pico-train - INFO - ├── Learning Rate: 2.89e-05
2025-08-30 06:41:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:42:10 - pico-train - INFO - Step 49425 -- 🔄 Training Metrics
2025-08-30 06:42:10 - pico-train - INFO - ├── Loss: 6.0363
2025-08-30 06:42:10 - pico-train - INFO - ├── Learning Rate: 2.89e-05
2025-08-30 06:42:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:42:22 - pico-train - INFO - Step 49450 -- 🔄 Training Metrics
2025-08-30 06:42:22 - pico-train - INFO - ├── Loss: 5.9492
2025-08-30 06:42:22 - pico-train - INFO - ├── Learning Rate: 2.89e-05
2025-08-30 06:42:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:42:35 - pico-train - INFO - Step 49475 -- 🔄 Training Metrics
2025-08-30 06:42:35 - pico-train - INFO - ├── Loss: 5.9670
2025-08-30 06:42:35 - pico-train - INFO - ├── Learning Rate: 2.88e-05
2025-08-30 06:42:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:42:47 - pico-train - INFO - Step 49500 -- 💾 Saving Checkpoint
2025-08-30 06:44:54 - pico-train - INFO - Step 49500 -- 📊 Evaluation Results
2025-08-30 06:44:54 - pico-train - INFO - └── paloma: 1.1754196849862509e+29
2025-08-30 06:44:57 - pico-train - INFO - Step 49500 -- 🔄 Training Metrics
2025-08-30 06:44:57 - pico-train - INFO - ├── Loss: 5.9626
2025-08-30 06:44:57 - pico-train - INFO - ├── Learning Rate: 2.88e-05
2025-08-30 06:44:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:44:57 - pico-train - INFO - Step 49500 -- 📈 Saving Learning Dynamics
2025-08-30 06:45:12 - pico-train - INFO - Step 49525 -- 🔄 Training Metrics
2025-08-30 06:45:12 - pico-train - INFO - ├── Loss: 5.9286
2025-08-30 06:45:12 - pico-train - INFO - ├── Learning Rate: 2.88e-05
2025-08-30 06:45:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:45:24 - pico-train - INFO - Step 49550 -- 🔄 Training Metrics
2025-08-30 06:45:24 - pico-train - INFO - ├── Loss: 5.8746
2025-08-30 06:45:24 - pico-train - INFO - ├── Learning Rate: 2.88e-05
2025-08-30 06:45:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:45:37 - pico-train - INFO - Step 49575 -- 🔄 Training Metrics
2025-08-30 06:45:37 - pico-train - INFO - ├── Loss: 5.9669
2025-08-30 06:45:37 - pico-train - INFO - ├── Learning Rate: 2.88e-05
2025-08-30 06:45:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:45:49 - pico-train - INFO - Step 49600 -- 🔄 Training Metrics
2025-08-30 06:45:49 - pico-train - INFO - ├── Loss: 6.0786
2025-08-30 06:45:49 - pico-train - INFO - ├── Learning Rate: 2.87e-05
2025-08-30 06:45:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:46:03 - pico-train - INFO - Step 49625 -- 🔄 Training Metrics
2025-08-30 06:46:03 - pico-train - INFO - ├── Loss: 6.0407
2025-08-30 06:46:03 - pico-train - INFO - ├── Learning Rate: 2.87e-05
2025-08-30 06:46:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:46:15 - pico-train - INFO - Step 49650 -- 🔄 Training Metrics
2025-08-30 06:46:15 - pico-train - INFO - ├── Loss: 5.9219
2025-08-30 06:46:15 - pico-train - INFO - ├── Learning Rate: 2.87e-05
2025-08-30 06:46:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:46:28 - pico-train - INFO - Step 49675 -- 🔄 Training Metrics
2025-08-30 06:46:28 - pico-train - INFO - ├── Loss: 5.8997
2025-08-30 06:46:28 - pico-train - INFO - ├── Learning Rate: 2.87e-05
2025-08-30 06:46:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:46:41 - pico-train - INFO - Step 49700 -- 🔄 Training Metrics
2025-08-30 06:46:41 - pico-train - INFO - ├── Loss: 5.9761
2025-08-30 06:46:41 - pico-train - INFO - ├── Learning Rate: 2.87e-05
2025-08-30 06:46:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:46:53 - pico-train - INFO - Step 49725 -- 🔄 Training Metrics
2025-08-30 06:46:53 - pico-train - INFO - ├── Loss: 5.8749
2025-08-30 06:46:53 - pico-train - INFO - ├── Learning Rate: 2.86e-05
2025-08-30 06:46:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:47:06 - pico-train - INFO - Step 49750 -- 🔄 Training Metrics
2025-08-30 06:47:06 - pico-train - INFO - ├── Loss: 5.9646
2025-08-30 06:47:06 - pico-train - INFO - ├── Learning Rate: 2.86e-05
2025-08-30 06:47:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:47:18 - pico-train - INFO - Step 49775 -- 🔄 Training Metrics
2025-08-30 06:47:18 - pico-train - INFO - ├── Loss: 5.9586
2025-08-30 06:47:18 - pico-train - INFO - ├── Learning Rate: 2.86e-05
2025-08-30 06:47:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:47:31 - pico-train - INFO - Step 49800 -- 🔄 Training Metrics
2025-08-30 06:47:31 - pico-train - INFO - ├── Loss: 5.9586
2025-08-30 06:47:31 - pico-train - INFO - ├── Learning Rate: 2.86e-05
2025-08-30 06:47:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:47:44 - pico-train - INFO - Step 49825 -- 🔄 Training Metrics
2025-08-30 06:47:44 - pico-train - INFO - ├── Loss: 5.9795
2025-08-30 06:47:44 - pico-train - INFO - ├── Learning Rate: 2.86e-05
2025-08-30 06:47:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:47:56 - pico-train - INFO - Step 49850 -- 🔄 Training Metrics
2025-08-30 06:47:56 - pico-train - INFO - ├── Loss: 5.9432
2025-08-30 06:47:56 - pico-train - INFO - ├── Learning Rate: 2.85e-05
2025-08-30 06:47:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:48:09 - pico-train - INFO - Step 49875 -- 🔄 Training Metrics
2025-08-30 06:48:09 - pico-train - INFO - ├── Loss: 6.0729
2025-08-30 06:48:09 - pico-train - INFO - ├── Learning Rate: 2.85e-05
2025-08-30 06:48:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:48:22 - pico-train - INFO - Step 49900 -- 🔄 Training Metrics
2025-08-30 06:48:22 - pico-train - INFO - ├── Loss: 6.0377
2025-08-30 06:48:22 - pico-train - INFO - ├── Learning Rate: 2.85e-05
2025-08-30 06:48:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:48:34 - pico-train - INFO - Step 49925 -- 🔄 Training Metrics
2025-08-30 06:48:34 - pico-train - INFO - ├── Loss: 5.9408
2025-08-30 06:48:34 - pico-train - INFO - ├── Learning Rate: 2.85e-05
2025-08-30 06:48:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:48:47 - pico-train - INFO - Step 49950 -- 🔄 Training Metrics
2025-08-30 06:48:47 - pico-train - INFO - ├── Loss: 5.9974
2025-08-30 06:48:47 - pico-train - INFO - ├── Learning Rate: 2.84e-05
2025-08-30 06:48:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:48:59 - pico-train - INFO - Step 49975 -- 🔄 Training Metrics
2025-08-30 06:48:59 - pico-train - INFO - ├── Loss: 5.8671
2025-08-30 06:48:59 - pico-train - INFO - ├── Learning Rate: 2.84e-05
2025-08-30 06:48:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:49:12 - pico-train - INFO - Step 50000 -- 💾 Saving Checkpoint
2025-08-30 06:51:08 - pico-train - INFO - Step 50000 -- 📊 Evaluation Results
2025-08-30 06:51:08 - pico-train - INFO - └── paloma: 1.5844205816962802e+29
2025-08-30 06:51:11 - pico-train - INFO - Step 50000 -- 🔄 Training Metrics
2025-08-30 06:51:11 - pico-train - INFO - ├── Loss: 5.9467
2025-08-30 06:51:11 - pico-train - INFO - ├── Learning Rate: 2.84e-05
2025-08-30 06:51:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:51:11 - pico-train - INFO - Step 50000 -- 📈 Saving Learning Dynamics
2025-08-30 06:51:26 - pico-train - INFO - Step 50025 -- 🔄 Training Metrics
2025-08-30 06:51:26 - pico-train - INFO - ├── Loss: 5.9961
2025-08-30 06:51:26 - pico-train - INFO - ├── Learning Rate: 2.84e-05
2025-08-30 06:51:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:51:39 - pico-train - INFO - Step 50050 -- 🔄 Training Metrics
2025-08-30 06:51:39 - pico-train - INFO - ├── Loss: 5.9269
2025-08-30 06:51:39 - pico-train - INFO - ├── Learning Rate: 2.84e-05
2025-08-30 06:51:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:51:52 - pico-train - INFO - Step 50075 -- 🔄 Training Metrics
2025-08-30 06:51:52 - pico-train - INFO - ├── Loss: 5.9394
2025-08-30 06:51:52 - pico-train - INFO - ├── Learning Rate: 2.83e-05
2025-08-30 06:51:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:52:04 - pico-train - INFO - Step 50100 -- 🔄 Training Metrics
2025-08-30 06:52:04 - pico-train - INFO - ├── Loss: 5.9330
2025-08-30 06:52:04 - pico-train - INFO - ├── Learning Rate: 2.83e-05
2025-08-30 06:52:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:52:18 - pico-train - INFO - Step 50125 -- 🔄 Training Metrics
2025-08-30 06:52:18 - pico-train - INFO - ├── Loss: 5.9620
2025-08-30 06:52:18 - pico-train - INFO - ├── Learning Rate: 2.83e-05
2025-08-30 06:52:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:52:30 - pico-train - INFO - Step 50150 -- 🔄 Training Metrics
2025-08-30 06:52:30 - pico-train - INFO - ├── Loss: 6.0199
2025-08-30 06:52:30 - pico-train - INFO - ├── Learning Rate: 2.83e-05
2025-08-30 06:52:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:52:43 - pico-train - INFO - Step 50175 -- 🔄 Training Metrics
2025-08-30 06:52:43 - pico-train - INFO - ├── Loss: 6.0399
2025-08-30 06:52:43 - pico-train - INFO - ├── Learning Rate: 2.83e-05
2025-08-30 06:52:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:52:55 - pico-train - INFO - Step 50200 -- 🔄 Training Metrics
2025-08-30 06:52:55 - pico-train - INFO - ├── Loss: 6.0137
2025-08-30 06:52:55 - pico-train - INFO - ├── Learning Rate: 2.82e-05
2025-08-30 06:52:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:53:08 - pico-train - INFO - Step 50225 -- 🔄 Training Metrics
2025-08-30 06:53:08 - pico-train - INFO - ├── Loss: 5.9405
2025-08-30 06:53:08 - pico-train - INFO - ├── Learning Rate: 2.82e-05
2025-08-30 06:53:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:53:21 - pico-train - INFO - Step 50250 -- 🔄 Training Metrics
2025-08-30 06:53:21 - pico-train - INFO - ├── Loss: 5.9045
2025-08-30 06:53:21 - pico-train - INFO - ├── Learning Rate: 2.82e-05
2025-08-30 06:53:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:53:33 - pico-train - INFO - Step 50275 -- 🔄 Training Metrics
2025-08-30 06:53:33 - pico-train - INFO - ├── Loss: 6.0237
2025-08-30 06:53:33 - pico-train - INFO - ├── Learning Rate: 2.82e-05
2025-08-30 06:53:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:53:46 - pico-train - INFO - Step 50300 -- 🔄 Training Metrics
2025-08-30 06:53:46 - pico-train - INFO - ├── Loss: 5.9869
2025-08-30 06:53:46 - pico-train - INFO - ├── Learning Rate: 2.82e-05
2025-08-30 06:53:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:53:59 - pico-train - INFO - Step 50325 -- 🔄 Training Metrics
2025-08-30 06:53:59 - pico-train - INFO - ├── Loss: 5.9344
2025-08-30 06:53:59 - pico-train - INFO - ├── Learning Rate: 2.81e-05
2025-08-30 06:53:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:54:11 - pico-train - INFO - Step 50350 -- 🔄 Training Metrics
2025-08-30 06:54:11 - pico-train - INFO - ├── Loss: 6.0131
2025-08-30 06:54:11 - pico-train - INFO - ├── Learning Rate: 2.81e-05
2025-08-30 06:54:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:54:24 - pico-train - INFO - Step 50375 -- 🔄 Training Metrics
2025-08-30 06:54:24 - pico-train - INFO - ├── Loss: 5.9916
2025-08-30 06:54:24 - pico-train - INFO - ├── Learning Rate: 2.81e-05
2025-08-30 06:54:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:54:36 - pico-train - INFO - Step 50400 -- 🔄 Training Metrics
2025-08-30 06:54:36 - pico-train - INFO - ├── Loss: 6.0289
2025-08-30 06:54:36 - pico-train - INFO - ├── Learning Rate: 2.81e-05
2025-08-30 06:54:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:54:49 - pico-train - INFO - Step 50425 -- 🔄 Training Metrics
2025-08-30 06:54:49 - pico-train - INFO - ├── Loss: 6.0051
2025-08-30 06:54:49 - pico-train - INFO - ├── Learning Rate: 2.80e-05
2025-08-30 06:54:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:55:01 - pico-train - INFO - Step 50450 -- 🔄 Training Metrics
2025-08-30 06:55:01 - pico-train - INFO - ├── Loss: 5.9803
2025-08-30 06:55:01 - pico-train - INFO - ├── Learning Rate: 2.80e-05
2025-08-30 06:55:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:55:14 - pico-train - INFO - Step 50475 -- 🔄 Training Metrics
2025-08-30 06:55:14 - pico-train - INFO - ├── Loss: 5.9222
2025-08-30 06:55:14 - pico-train - INFO - ├── Learning Rate: 2.80e-05
2025-08-30 06:55:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:55:26 - pico-train - INFO - Step 50500 -- 💾 Saving Checkpoint
2025-08-30 06:57:22 - pico-train - INFO - Step 50500 -- 📊 Evaluation Results
2025-08-30 06:57:22 - pico-train - INFO - └── paloma: 2.307126408238767e+29
2025-08-30 06:57:25 - pico-train - INFO - Step 50500 -- 🔄 Training Metrics
2025-08-30 06:57:25 - pico-train - INFO - ├── Loss: 5.9666
2025-08-30 06:57:25 - pico-train - INFO - ├── Learning Rate: 2.80e-05
2025-08-30 06:57:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:57:25 - pico-train - INFO - Step 50500 -- 📈 Saving Learning Dynamics
2025-08-30 06:57:41 - pico-train - INFO - Step 50525 -- 🔄 Training Metrics
2025-08-30 06:57:41 - pico-train - INFO - ├── Loss: 5.9108
2025-08-30 06:57:41 - pico-train - INFO - ├── Learning Rate: 2.80e-05
2025-08-30 06:57:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:57:54 - pico-train - INFO - Step 50550 -- 🔄 Training Metrics
2025-08-30 06:57:54 - pico-train - INFO - ├── Loss: 5.9066
2025-08-30 06:57:54 - pico-train - INFO - ├── Learning Rate: 2.79e-05
2025-08-30 06:57:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:58:06 - pico-train - INFO - Step 50575 -- 🔄 Training Metrics
2025-08-30 06:58:06 - pico-train - INFO - ├── Loss: 5.9447
2025-08-30 06:58:06 - pico-train - INFO - ├── Learning Rate: 2.79e-05
2025-08-30 06:58:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:58:19 - pico-train - INFO - Step 50600 -- 🔄 Training Metrics
2025-08-30 06:58:19 - pico-train - INFO - ├── Loss: 5.9920
2025-08-30 06:58:19 - pico-train - INFO - ├── Learning Rate: 2.79e-05
2025-08-30 06:58:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:58:32 - pico-train - INFO - Step 50625 -- 🔄 Training Metrics
2025-08-30 06:58:32 - pico-train - INFO - ├── Loss: 5.8969
2025-08-30 06:58:32 - pico-train - INFO - ├── Learning Rate: 2.79e-05
2025-08-30 06:58:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:58:44 - pico-train - INFO - Step 50650 -- 🔄 Training Metrics
2025-08-30 06:58:44 - pico-train - INFO - ├── Loss: 5.9352
2025-08-30 06:58:44 - pico-train - INFO - ├── Learning Rate: 2.79e-05
2025-08-30 06:58:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:58:57 - pico-train - INFO - Step 50675 -- 🔄 Training Metrics
2025-08-30 06:58:57 - pico-train - INFO - ├── Loss: 5.9511
2025-08-30 06:58:57 - pico-train - INFO - ├── Learning Rate: 2.78e-05
2025-08-30 06:58:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:59:10 - pico-train - INFO - Step 50700 -- 🔄 Training Metrics
2025-08-30 06:59:10 - pico-train - INFO - ├── Loss: 5.9762
2025-08-30 06:59:10 - pico-train - INFO - ├── Learning Rate: 2.78e-05
2025-08-30 06:59:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:59:22 - pico-train - INFO - Step 50725 -- 🔄 Training Metrics
2025-08-30 06:59:22 - pico-train - INFO - ├── Loss: 5.8962
2025-08-30 06:59:22 - pico-train - INFO - ├── Learning Rate: 2.78e-05
2025-08-30 06:59:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:59:35 - pico-train - INFO - Step 50750 -- 🔄 Training Metrics
2025-08-30 06:59:35 - pico-train - INFO - ├── Loss: 5.9610
2025-08-30 06:59:35 - pico-train - INFO - ├── Learning Rate: 2.78e-05
2025-08-30 06:59:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 06:59:47 - pico-train - INFO - Step 50775 -- 🔄 Training Metrics
2025-08-30 06:59:47 - pico-train - INFO - ├── Loss: 5.9507
2025-08-30 06:59:47 - pico-train - INFO - ├── Learning Rate: 2.77e-05
2025-08-30 06:59:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:00:00 - pico-train - INFO - Step 50800 -- 🔄 Training Metrics
2025-08-30 07:00:00 - pico-train - INFO - ├── Loss: 6.0094
2025-08-30 07:00:00 - pico-train - INFO - ├── Learning Rate: 2.77e-05
2025-08-30 07:00:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:00:13 - pico-train - INFO - Step 50825 -- 🔄 Training Metrics
2025-08-30 07:00:13 - pico-train - INFO - ├── Loss: 5.9076
2025-08-30 07:00:13 - pico-train - INFO - ├── Learning Rate: 2.77e-05
2025-08-30 07:00:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:00:25 - pico-train - INFO - Step 50850 -- 🔄 Training Metrics
2025-08-30 07:00:25 - pico-train - INFO - ├── Loss: 5.9857
2025-08-30 07:00:25 - pico-train - INFO - ├── Learning Rate: 2.77e-05
2025-08-30 07:00:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:00:38 - pico-train - INFO - Step 50875 -- 🔄 Training Metrics
2025-08-30 07:00:38 - pico-train - INFO - ├── Loss: 6.0201
2025-08-30 07:00:38 - pico-train - INFO - ├── Learning Rate: 2.77e-05
2025-08-30 07:00:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:00:51 - pico-train - INFO - Step 50900 -- 🔄 Training Metrics
2025-08-30 07:00:51 - pico-train - INFO - ├── Loss: 6.0121
2025-08-30 07:00:51 - pico-train - INFO - ├── Learning Rate: 2.76e-05
2025-08-30 07:00:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:01:03 - pico-train - INFO - Step 50925 -- 🔄 Training Metrics
2025-08-30 07:01:03 - pico-train - INFO - ├── Loss: 5.9658
2025-08-30 07:01:03 - pico-train - INFO - ├── Learning Rate: 2.76e-05
2025-08-30 07:01:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:01:16 - pico-train - INFO - Step 50950 -- 🔄 Training Metrics
2025-08-30 07:01:16 - pico-train - INFO - ├── Loss: 5.9981
2025-08-30 07:01:16 - pico-train - INFO - ├── Learning Rate: 2.76e-05
2025-08-30 07:01:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:01:28 - pico-train - INFO - Step 50975 -- 🔄 Training Metrics
2025-08-30 07:01:28 - pico-train - INFO - ├── Loss: 5.9961
2025-08-30 07:01:28 - pico-train - INFO - ├── Learning Rate: 2.76e-05
2025-08-30 07:01:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:01:41 - pico-train - INFO - Step 51000 -- 💾 Saving Checkpoint
2025-08-30 07:03:38 - pico-train - INFO - Step 51000 -- 📊 Evaluation Results
2025-08-30 07:03:38 - pico-train - INFO - └── paloma: 2.410761895962811e+29
2025-08-30 07:03:42 - pico-train - INFO - Step 51000 -- 🔄 Training Metrics
2025-08-30 07:03:42 - pico-train - INFO - ├── Loss: 5.9076
2025-08-30 07:03:42 - pico-train - INFO - ├── Learning Rate: 2.76e-05
2025-08-30 07:03:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:03:42 - pico-train - INFO - Step 51000 -- 📈 Saving Learning Dynamics
2025-08-30 07:03:57 - pico-train - INFO - Step 51025 -- 🔄 Training Metrics
2025-08-30 07:03:57 - pico-train - INFO - ├── Loss: 5.9965
2025-08-30 07:03:57 - pico-train - INFO - ├── Learning Rate: 2.75e-05
2025-08-30 07:03:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:04:09 - pico-train - INFO - Step 51050 -- 🔄 Training Metrics
2025-08-30 07:04:09 - pico-train - INFO - ├── Loss: 5.9633
2025-08-30 07:04:09 - pico-train - INFO - ├── Learning Rate: 2.75e-05
2025-08-30 07:04:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:04:22 - pico-train - INFO - Step 51075 -- 🔄 Training Metrics
2025-08-30 07:04:22 - pico-train - INFO - ├── Loss: 5.9476
2025-08-30 07:04:22 - pico-train - INFO - ├── Learning Rate: 2.75e-05
2025-08-30 07:04:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:04:34 - pico-train - INFO - Step 51100 -- 🔄 Training Metrics
2025-08-30 07:04:34 - pico-train - INFO - ├── Loss: 5.9797
2025-08-30 07:04:34 - pico-train - INFO - ├── Learning Rate: 2.75e-05
2025-08-30 07:04:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:04:47 - pico-train - INFO - Step 51125 -- 🔄 Training Metrics
2025-08-30 07:04:47 - pico-train - INFO - ├── Loss: 5.9138
2025-08-30 07:04:47 - pico-train - INFO - ├── Learning Rate: 2.75e-05
2025-08-30 07:04:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:04:59 - pico-train - INFO - Step 51150 -- 🔄 Training Metrics
2025-08-30 07:04:59 - pico-train - INFO - ├── Loss: 5.9946
2025-08-30 07:04:59 - pico-train - INFO - ├── Learning Rate: 2.74e-05
2025-08-30 07:04:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:05:12 - pico-train - INFO - Step 51175 -- 🔄 Training Metrics
2025-08-30 07:05:12 - pico-train - INFO - ├── Loss: 5.9050
2025-08-30 07:05:12 - pico-train - INFO - ├── Learning Rate: 2.74e-05
2025-08-30 07:05:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:05:25 - pico-train - INFO - Step 51200 -- 🔄 Training Metrics
2025-08-30 07:05:25 - pico-train - INFO - ├── Loss: 5.9431
2025-08-30 07:05:25 - pico-train - INFO - ├── Learning Rate: 2.74e-05
2025-08-30 07:05:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:05:37 - pico-train - INFO - Step 51225 -- 🔄 Training Metrics
2025-08-30 07:05:37 - pico-train - INFO - ├── Loss: 5.9906
2025-08-30 07:05:37 - pico-train - INFO - ├── Learning Rate: 2.74e-05
2025-08-30 07:05:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:05:50 - pico-train - INFO - Step 51250 -- 🔄 Training Metrics
2025-08-30 07:05:50 - pico-train - INFO - ├── Loss: 5.9408
2025-08-30 07:05:50 - pico-train - INFO - ├── Learning Rate: 2.73e-05
2025-08-30 07:05:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:06:03 - pico-train - INFO - Step 51275 -- 🔄 Training Metrics
2025-08-30 07:06:03 - pico-train - INFO - ├── Loss: 6.0058
2025-08-30 07:06:03 - pico-train - INFO - ├── Learning Rate: 2.73e-05
2025-08-30 07:06:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:06:15 - pico-train - INFO - Step 51300 -- 🔄 Training Metrics
2025-08-30 07:06:15 - pico-train - INFO - ├── Loss: 5.9526
2025-08-30 07:06:15 - pico-train - INFO - ├── Learning Rate: 2.73e-05
2025-08-30 07:06:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:06:28 - pico-train - INFO - Step 51325 -- 🔄 Training Metrics
2025-08-30 07:06:28 - pico-train - INFO - ├── Loss: 5.9452
2025-08-30 07:06:28 - pico-train - INFO - ├── Learning Rate: 2.73e-05
2025-08-30 07:06:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:06:40 - pico-train - INFO - Step 51350 -- 🔄 Training Metrics
2025-08-30 07:06:40 - pico-train - INFO - ├── Loss: 6.0049
2025-08-30 07:06:40 - pico-train - INFO - ├── Learning Rate: 2.73e-05
2025-08-30 07:06:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:06:53 - pico-train - INFO - Step 51375 -- 🔄 Training Metrics
2025-08-30 07:06:53 - pico-train - INFO - ├── Loss: 5.9591
2025-08-30 07:06:53 - pico-train - INFO - ├── Learning Rate: 2.72e-05
2025-08-30 07:06:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:07:06 - pico-train - INFO - Step 51400 -- 🔄 Training Metrics
2025-08-30 07:07:06 - pico-train - INFO - ├── Loss: 5.9947
2025-08-30 07:07:06 - pico-train - INFO - ├── Learning Rate: 2.72e-05
2025-08-30 07:07:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:07:18 - pico-train - INFO - Step 51425 -- 🔄 Training Metrics
2025-08-30 07:07:18 - pico-train - INFO - ├── Loss: 5.9487
2025-08-30 07:07:18 - pico-train - INFO - ├── Learning Rate: 2.72e-05
2025-08-30 07:07:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:07:31 - pico-train - INFO - Step 51450 -- 🔄 Training Metrics
2025-08-30 07:07:31 - pico-train - INFO - ├── Loss: 5.9444
2025-08-30 07:07:31 - pico-train - INFO - ├── Learning Rate: 2.72e-05
2025-08-30 07:07:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:07:44 - pico-train - INFO - Step 51475 -- 🔄 Training Metrics
2025-08-30 07:07:44 - pico-train - INFO - ├── Loss: 5.9784
2025-08-30 07:07:44 - pico-train - INFO - ├── Learning Rate: 2.72e-05
2025-08-30 07:07:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:07:56 - pico-train - INFO - Step 51500 -- 💾 Saving Checkpoint
2025-08-30 07:09:51 - pico-train - INFO - Step 51500 -- 📊 Evaluation Results
2025-08-30 07:09:51 - pico-train - INFO - └── paloma: 3.779255466147184e+29
2025-08-30 07:09:53 - pico-train - INFO - Step 51500 -- 🔄 Training Metrics
2025-08-30 07:09:53 - pico-train - INFO - ├── Loss: 5.9132
2025-08-30 07:09:53 - pico-train - INFO - ├── Learning Rate: 2.71e-05
2025-08-30 07:09:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:09:53 - pico-train - INFO - Step 51500 -- 📈 Saving Learning Dynamics
2025-08-30 07:10:09 - pico-train - INFO - Step 51525 -- 🔄 Training Metrics
2025-08-30 07:10:09 - pico-train - INFO - ├── Loss: 5.9536
2025-08-30 07:10:09 - pico-train - INFO - ├── Learning Rate: 2.71e-05
2025-08-30 07:10:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:10:21 - pico-train - INFO - Step 51550 -- 🔄 Training Metrics
2025-08-30 07:10:21 - pico-train - INFO - ├── Loss: 5.9491
2025-08-30 07:10:21 - pico-train - INFO - ├── Learning Rate: 2.71e-05
2025-08-30 07:10:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:10:34 - pico-train - INFO - Step 51575 -- 🔄 Training Metrics
2025-08-30 07:10:34 - pico-train - INFO - ├── Loss: 5.9868
2025-08-30 07:10:34 - pico-train - INFO - ├── Learning Rate: 2.71e-05
2025-08-30 07:10:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:10:47 - pico-train - INFO - Step 51600 -- 🔄 Training Metrics
2025-08-30 07:10:47 - pico-train - INFO - ├── Loss: 5.9299
2025-08-30 07:10:47 - pico-train - INFO - ├── Learning Rate: 2.70e-05
2025-08-30 07:10:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:10:59 - pico-train - INFO - Step 51625 -- 🔄 Training Metrics
2025-08-30 07:10:59 - pico-train - INFO - ├── Loss: 5.9520
2025-08-30 07:10:59 - pico-train - INFO - ├── Learning Rate: 2.70e-05
2025-08-30 07:10:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:11:12 - pico-train - INFO - Step 51650 -- 🔄 Training Metrics
2025-08-30 07:11:12 - pico-train - INFO - ├── Loss: 5.8812
2025-08-30 07:11:12 - pico-train - INFO - ├── Learning Rate: 2.70e-05
2025-08-30 07:11:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:11:24 - pico-train - INFO - Step 51675 -- 🔄 Training Metrics
2025-08-30 07:11:24 - pico-train - INFO - ├── Loss: 5.9874
2025-08-30 07:11:24 - pico-train - INFO - ├── Learning Rate: 2.70e-05
2025-08-30 07:11:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:11:37 - pico-train - INFO - Step 51700 -- 🔄 Training Metrics
2025-08-30 07:11:37 - pico-train - INFO - ├── Loss: 5.8259
2025-08-30 07:11:37 - pico-train - INFO - ├── Learning Rate: 2.70e-05
2025-08-30 07:11:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:11:50 - pico-train - INFO - Step 51725 -- 🔄 Training Metrics
2025-08-30 07:11:50 - pico-train - INFO - ├── Loss: 5.8867
2025-08-30 07:11:50 - pico-train - INFO - ├── Learning Rate: 2.69e-05
2025-08-30 07:11:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:12:03 - pico-train - INFO - Step 51750 -- 🔄 Training Metrics
2025-08-30 07:12:03 - pico-train - INFO - ├── Loss: 5.9863
2025-08-30 07:12:03 - pico-train - INFO - ├── Learning Rate: 2.69e-05
2025-08-30 07:12:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:12:16 - pico-train - INFO - Step 51775 -- 🔄 Training Metrics
2025-08-30 07:12:16 - pico-train - INFO - ├── Loss: 6.0154
2025-08-30 07:12:16 - pico-train - INFO - ├── Learning Rate: 2.69e-05
2025-08-30 07:12:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:12:28 - pico-train - INFO - Step 51800 -- 🔄 Training Metrics
2025-08-30 07:12:28 - pico-train - INFO - ├── Loss: 5.9222
2025-08-30 07:12:28 - pico-train - INFO - ├── Learning Rate: 2.69e-05
2025-08-30 07:12:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:12:41 - pico-train - INFO - Step 51825 -- 🔄 Training Metrics
2025-08-30 07:12:41 - pico-train - INFO - ├── Loss: 5.9468
2025-08-30 07:12:41 - pico-train - INFO - ├── Learning Rate: 2.69e-05
2025-08-30 07:12:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:12:53 - pico-train - INFO - Step 51850 -- 🔄 Training Metrics
2025-08-30 07:12:53 - pico-train - INFO - ├── Loss: 5.9967
2025-08-30 07:12:53 - pico-train - INFO - ├── Learning Rate: 2.68e-05
2025-08-30 07:12:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:13:06 - pico-train - INFO - Step 51875 -- 🔄 Training Metrics
2025-08-30 07:13:06 - pico-train - INFO - ├── Loss: 5.9565
2025-08-30 07:13:06 - pico-train - INFO - ├── Learning Rate: 2.68e-05
2025-08-30 07:13:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:13:19 - pico-train - INFO - Step 51900 -- 🔄 Training Metrics
2025-08-30 07:13:19 - pico-train - INFO - ├── Loss: 5.9186
2025-08-30 07:13:19 - pico-train - INFO - ├── Learning Rate: 2.68e-05
2025-08-30 07:13:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:13:31 - pico-train - INFO - Step 51925 -- 🔄 Training Metrics
2025-08-30 07:13:31 - pico-train - INFO - ├── Loss: 5.7959
2025-08-30 07:13:31 - pico-train - INFO - ├── Learning Rate: 2.68e-05
2025-08-30 07:13:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:13:44 - pico-train - INFO - Step 51950 -- 🔄 Training Metrics
2025-08-30 07:13:44 - pico-train - INFO - ├── Loss: 5.9955
2025-08-30 07:13:44 - pico-train - INFO - ├── Learning Rate: 2.67e-05
2025-08-30 07:13:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:13:56 - pico-train - INFO - Step 51975 -- 🔄 Training Metrics
2025-08-30 07:13:56 - pico-train - INFO - ├── Loss: 5.9673
2025-08-30 07:13:56 - pico-train - INFO - ├── Learning Rate: 2.67e-05
2025-08-30 07:13:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:14:09 - pico-train - INFO - Step 52000 -- 💾 Saving Checkpoint
2025-08-30 07:16:04 - pico-train - INFO - Step 52000 -- 📊 Evaluation Results
2025-08-30 07:16:04 - pico-train - INFO - └── paloma: 4.093974189838008e+29
2025-08-30 07:16:07 - pico-train - INFO - Step 52000 -- 🔄 Training Metrics
2025-08-30 07:16:07 - pico-train - INFO - ├── Loss: 6.0273
2025-08-30 07:16:07 - pico-train - INFO - ├── Learning Rate: 2.67e-05
2025-08-30 07:16:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:16:07 - pico-train - INFO - Step 52000 -- 📈 Saving Learning Dynamics
2025-08-30 07:16:23 - pico-train - INFO - Step 52025 -- 🔄 Training Metrics
2025-08-30 07:16:23 - pico-train - INFO - ├── Loss: 5.8760
2025-08-30 07:16:23 - pico-train - INFO - ├── Learning Rate: 2.67e-05
2025-08-30 07:16:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:16:36 - pico-train - INFO - Step 52050 -- 🔄 Training Metrics
2025-08-30 07:16:36 - pico-train - INFO - ├── Loss: 5.9087
2025-08-30 07:16:36 - pico-train - INFO - ├── Learning Rate: 2.67e-05
2025-08-30 07:16:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:16:48 - pico-train - INFO - Step 52075 -- 🔄 Training Metrics
2025-08-30 07:16:48 - pico-train - INFO - ├── Loss: 5.8656
2025-08-30 07:16:48 - pico-train - INFO - ├── Learning Rate: 2.66e-05
2025-08-30 07:16:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:17:01 - pico-train - INFO - Step 52100 -- 🔄 Training Metrics
2025-08-30 07:17:01 - pico-train - INFO - ├── Loss: 5.9358
2025-08-30 07:17:01 - pico-train - INFO - ├── Learning Rate: 2.66e-05
2025-08-30 07:17:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:17:13 - pico-train - INFO - Step 52125 -- 🔄 Training Metrics
2025-08-30 07:17:13 - pico-train - INFO - ├── Loss: 6.0056
2025-08-30 07:17:13 - pico-train - INFO - ├── Learning Rate: 2.66e-05
2025-08-30 07:17:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:17:26 - pico-train - INFO - Step 52150 -- 🔄 Training Metrics
2025-08-30 07:17:26 - pico-train - INFO - ├── Loss: 5.9770
2025-08-30 07:17:26 - pico-train - INFO - ├── Learning Rate: 2.66e-05
2025-08-30 07:17:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:17:38 - pico-train - INFO - Step 52175 -- 🔄 Training Metrics
2025-08-30 07:17:38 - pico-train - INFO - ├── Loss: 5.9145
2025-08-30 07:17:38 - pico-train - INFO - ├── Learning Rate: 2.66e-05
2025-08-30 07:17:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:17:51 - pico-train - INFO - Step 52200 -- 🔄 Training Metrics
2025-08-30 07:17:51 - pico-train - INFO - ├── Loss: 5.9592
2025-08-30 07:17:51 - pico-train - INFO - ├── Learning Rate: 2.65e-05
2025-08-30 07:17:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:18:04 - pico-train - INFO - Step 52225 -- 🔄 Training Metrics
2025-08-30 07:18:04 - pico-train - INFO - ├── Loss: 5.9323
2025-08-30 07:18:04 - pico-train - INFO - ├── Learning Rate: 2.65e-05
2025-08-30 07:18:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:18:17 - pico-train - INFO - Step 52250 -- 🔄 Training Metrics
2025-08-30 07:18:17 - pico-train - INFO - ├── Loss: 5.9309
2025-08-30 07:18:17 - pico-train - INFO - ├── Learning Rate: 2.65e-05
2025-08-30 07:18:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:18:29 - pico-train - INFO - Step 52275 -- 🔄 Training Metrics
2025-08-30 07:18:29 - pico-train - INFO - ├── Loss: 6.0290
2025-08-30 07:18:29 - pico-train - INFO - ├── Learning Rate: 2.65e-05
2025-08-30 07:18:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:18:42 - pico-train - INFO - Step 52300 -- 🔄 Training Metrics
2025-08-30 07:18:42 - pico-train - INFO - ├── Loss: 6.0121
2025-08-30 07:18:42 - pico-train - INFO - ├── Learning Rate: 2.65e-05
2025-08-30 07:18:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:18:55 - pico-train - INFO - Step 52325 -- 🔄 Training Metrics
2025-08-30 07:18:55 - pico-train - INFO - ├── Loss: 5.8936
2025-08-30 07:18:55 - pico-train - INFO - ├── Learning Rate: 2.64e-05
2025-08-30 07:18:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:19:07 - pico-train - INFO - Step 52350 -- 🔄 Training Metrics
2025-08-30 07:19:07 - pico-train - INFO - ├── Loss: 5.9461
2025-08-30 07:19:07 - pico-train - INFO - ├── Learning Rate: 2.64e-05
2025-08-30 07:19:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:19:20 - pico-train - INFO - Step 52375 -- 🔄 Training Metrics
2025-08-30 07:19:20 - pico-train - INFO - ├── Loss: 6.0288
2025-08-30 07:19:20 - pico-train - INFO - ├── Learning Rate: 2.64e-05
2025-08-30 07:19:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:19:32 - pico-train - INFO - Step 52400 -- 🔄 Training Metrics
2025-08-30 07:19:32 - pico-train - INFO - ├── Loss: 5.9728
2025-08-30 07:19:32 - pico-train - INFO - ├── Learning Rate: 2.64e-05
2025-08-30 07:19:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:19:45 - pico-train - INFO - Step 52425 -- 🔄 Training Metrics
2025-08-30 07:19:45 - pico-train - INFO - ├── Loss: 5.8944
2025-08-30 07:19:45 - pico-train - INFO - ├── Learning Rate: 2.63e-05
2025-08-30 07:19:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:19:58 - pico-train - INFO - Step 52450 -- 🔄 Training Metrics
2025-08-30 07:19:58 - pico-train - INFO - ├── Loss: 6.0148
2025-08-30 07:19:58 - pico-train - INFO - ├── Learning Rate: 2.63e-05
2025-08-30 07:19:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:20:10 - pico-train - INFO - Step 52475 -- 🔄 Training Metrics
2025-08-30 07:20:10 - pico-train - INFO - ├── Loss: 5.9232
2025-08-30 07:20:10 - pico-train - INFO - ├── Learning Rate: 2.63e-05
2025-08-30 07:20:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:20:22 - pico-train - INFO - Step 52500 -- ๐Ÿ’พ Saving Checkpoint
2025-08-30 07:22:31 - pico-train - INFO - Step 52500 -- ๐Ÿ“Š Evaluation Results
2025-08-30 07:22:31 - pico-train - INFO - โ””โ”€โ”€ paloma: 4.3829741549987764e+29
2025-08-30 07:22:34 - pico-train - INFO - Step 52500 -- ๐Ÿ”„ Training Metrics
2025-08-30 07:22:34 - pico-train - INFO - โ”œโ”€โ”€ Loss: 6.0213
2025-08-30 07:22:34 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 2.63e-05
2025-08-30 07:22:34 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 07:22:34 - pico-train - INFO - Step 52500 -- ๐Ÿ“ˆ Saving Learning Dynamics
2025-08-30 07:22:49 - pico-train - INFO - Step 52525 -- 🔄 Training Metrics
2025-08-30 07:22:49 - pico-train - INFO - ├── Loss: 5.9703
2025-08-30 07:22:49 - pico-train - INFO - ├── Learning Rate: 2.63e-05
2025-08-30 07:22:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:23:01 - pico-train - INFO - Step 52550 -- 🔄 Training Metrics
2025-08-30 07:23:01 - pico-train - INFO - ├── Loss: 5.9471
2025-08-30 07:23:01 - pico-train - INFO - ├── Learning Rate: 2.62e-05
2025-08-30 07:23:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:23:14 - pico-train - INFO - Step 52575 -- 🔄 Training Metrics
2025-08-30 07:23:14 - pico-train - INFO - ├── Loss: 5.9502
2025-08-30 07:23:14 - pico-train - INFO - ├── Learning Rate: 2.62e-05
2025-08-30 07:23:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:23:26 - pico-train - INFO - Step 52600 -- 🔄 Training Metrics
2025-08-30 07:23:26 - pico-train - INFO - ├── Loss: 5.9032
2025-08-30 07:23:26 - pico-train - INFO - ├── Learning Rate: 2.62e-05
2025-08-30 07:23:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:23:39 - pico-train - INFO - Step 52625 -- 🔄 Training Metrics
2025-08-30 07:23:39 - pico-train - INFO - ├── Loss: 5.8965
2025-08-30 07:23:39 - pico-train - INFO - ├── Learning Rate: 2.62e-05
2025-08-30 07:23:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:23:51 - pico-train - INFO - Step 52650 -- 🔄 Training Metrics
2025-08-30 07:23:51 - pico-train - INFO - ├── Loss: 5.9213
2025-08-30 07:23:51 - pico-train - INFO - ├── Learning Rate: 2.62e-05
2025-08-30 07:23:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:24:04 - pico-train - INFO - Step 52675 -- 🔄 Training Metrics
2025-08-30 07:24:04 - pico-train - INFO - ├── Loss: 6.0115
2025-08-30 07:24:04 - pico-train - INFO - ├── Learning Rate: 2.61e-05
2025-08-30 07:24:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:24:17 - pico-train - INFO - Step 52700 -- 🔄 Training Metrics
2025-08-30 07:24:17 - pico-train - INFO - ├── Loss: 5.9340
2025-08-30 07:24:17 - pico-train - INFO - ├── Learning Rate: 2.61e-05
2025-08-30 07:24:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:24:29 - pico-train - INFO - Step 52725 -- 🔄 Training Metrics
2025-08-30 07:24:29 - pico-train - INFO - ├── Loss: 5.8320
2025-08-30 07:24:29 - pico-train - INFO - ├── Learning Rate: 2.61e-05
2025-08-30 07:24:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:24:42 - pico-train - INFO - Step 52750 -- 🔄 Training Metrics
2025-08-30 07:24:42 - pico-train - INFO - ├── Loss: 5.9125
2025-08-30 07:24:42 - pico-train - INFO - ├── Learning Rate: 2.61e-05
2025-08-30 07:24:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:24:55 - pico-train - INFO - Step 52775 -- 🔄 Training Metrics
2025-08-30 07:24:55 - pico-train - INFO - ├── Loss: 5.8468
2025-08-30 07:24:55 - pico-train - INFO - ├── Learning Rate: 2.60e-05
2025-08-30 07:24:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:25:07 - pico-train - INFO - Step 52800 -- 🔄 Training Metrics
2025-08-30 07:25:07 - pico-train - INFO - ├── Loss: 5.9822
2025-08-30 07:25:07 - pico-train - INFO - ├── Learning Rate: 2.60e-05
2025-08-30 07:25:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:25:20 - pico-train - INFO - Step 52825 -- 🔄 Training Metrics
2025-08-30 07:25:20 - pico-train - INFO - ├── Loss: 6.0151
2025-08-30 07:25:20 - pico-train - INFO - ├── Learning Rate: 2.60e-05
2025-08-30 07:25:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:25:33 - pico-train - INFO - Step 52850 -- 🔄 Training Metrics
2025-08-30 07:25:33 - pico-train - INFO - ├── Loss: 5.9894
2025-08-30 07:25:33 - pico-train - INFO - ├── Learning Rate: 2.60e-05
2025-08-30 07:25:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:25:45 - pico-train - INFO - Step 52875 -- 🔄 Training Metrics
2025-08-30 07:25:45 - pico-train - INFO - ├── Loss: 5.8899
2025-08-30 07:25:45 - pico-train - INFO - ├── Learning Rate: 2.60e-05
2025-08-30 07:25:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:25:58 - pico-train - INFO - Step 52900 -- 🔄 Training Metrics
2025-08-30 07:25:58 - pico-train - INFO - ├── Loss: 5.8752
2025-08-30 07:25:58 - pico-train - INFO - ├── Learning Rate: 2.59e-05
2025-08-30 07:25:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:26:11 - pico-train - INFO - Step 52925 -- 🔄 Training Metrics
2025-08-30 07:26:11 - pico-train - INFO - ├── Loss: 5.9977
2025-08-30 07:26:11 - pico-train - INFO - ├── Learning Rate: 2.59e-05
2025-08-30 07:26:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:26:23 - pico-train - INFO - Step 52950 -- 🔄 Training Metrics
2025-08-30 07:26:23 - pico-train - INFO - ├── Loss: 5.9522
2025-08-30 07:26:23 - pico-train - INFO - ├── Learning Rate: 2.59e-05
2025-08-30 07:26:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:26:36 - pico-train - INFO - Step 52975 -- 🔄 Training Metrics
2025-08-30 07:26:36 - pico-train - INFO - ├── Loss: 5.9471
2025-08-30 07:26:36 - pico-train - INFO - ├── Learning Rate: 2.59e-05
2025-08-30 07:26:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:26:48 - pico-train - INFO - Step 53000 -- 💾 Saving Checkpoint
2025-08-30 07:28:43 - pico-train - INFO - Step 53000 -- 📊 Evaluation Results
2025-08-30 07:28:43 - pico-train - INFO - └── paloma: 6.0678060947205406e+29
2025-08-30 07:28:45 - pico-train - INFO - Step 53000 -- 🔄 Training Metrics
2025-08-30 07:28:45 - pico-train - INFO - ├── Loss: 5.9668
2025-08-30 07:28:45 - pico-train - INFO - ├── Learning Rate: 2.59e-05
2025-08-30 07:28:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:28:45 - pico-train - INFO - Step 53000 -- 📈 Saving Learning Dynamics
2025-08-30 07:29:00 - pico-train - INFO - Step 53025 -- 🔄 Training Metrics
2025-08-30 07:29:00 - pico-train - INFO - ├── Loss: 5.9943
2025-08-30 07:29:00 - pico-train - INFO - ├── Learning Rate: 2.58e-05
2025-08-30 07:29:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:29:13 - pico-train - INFO - Step 53050 -- 🔄 Training Metrics
2025-08-30 07:29:13 - pico-train - INFO - ├── Loss: 5.9316
2025-08-30 07:29:13 - pico-train - INFO - ├── Learning Rate: 2.58e-05
2025-08-30 07:29:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:29:25 - pico-train - INFO - Step 53075 -- 🔄 Training Metrics
2025-08-30 07:29:25 - pico-train - INFO - ├── Loss: 5.9268
2025-08-30 07:29:25 - pico-train - INFO - ├── Learning Rate: 2.58e-05
2025-08-30 07:29:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:29:38 - pico-train - INFO - Step 53100 -- 🔄 Training Metrics
2025-08-30 07:29:38 - pico-train - INFO - ├── Loss: 5.9084
2025-08-30 07:29:38 - pico-train - INFO - ├── Learning Rate: 2.58e-05
2025-08-30 07:29:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:29:51 - pico-train - INFO - Step 53125 -- 🔄 Training Metrics
2025-08-30 07:29:51 - pico-train - INFO - ├── Loss: 6.0159
2025-08-30 07:29:51 - pico-train - INFO - ├── Learning Rate: 2.57e-05
2025-08-30 07:29:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:30:03 - pico-train - INFO - Step 53150 -- 🔄 Training Metrics
2025-08-30 07:30:03 - pico-train - INFO - ├── Loss: 5.8973
2025-08-30 07:30:03 - pico-train - INFO - ├── Learning Rate: 2.57e-05
2025-08-30 07:30:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:30:16 - pico-train - INFO - Step 53175 -- 🔄 Training Metrics
2025-08-30 07:30:16 - pico-train - INFO - ├── Loss: 5.9660
2025-08-30 07:30:16 - pico-train - INFO - ├── Learning Rate: 2.57e-05
2025-08-30 07:30:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:30:29 - pico-train - INFO - Step 53200 -- 🔄 Training Metrics
2025-08-30 07:30:29 - pico-train - INFO - ├── Loss: 5.9716
2025-08-30 07:30:29 - pico-train - INFO - ├── Learning Rate: 2.57e-05
2025-08-30 07:30:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:30:41 - pico-train - INFO - Step 53225 -- 🔄 Training Metrics
2025-08-30 07:30:41 - pico-train - INFO - ├── Loss: 5.8883
2025-08-30 07:30:41 - pico-train - INFO - ├── Learning Rate: 2.57e-05
2025-08-30 07:30:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:30:54 - pico-train - INFO - Step 53250 -- 🔄 Training Metrics
2025-08-30 07:30:54 - pico-train - INFO - ├── Loss: 5.9727
2025-08-30 07:30:54 - pico-train - INFO - ├── Learning Rate: 2.56e-05
2025-08-30 07:30:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:31:07 - pico-train - INFO - Step 53275 -- 🔄 Training Metrics
2025-08-30 07:31:07 - pico-train - INFO - ├── Loss: 5.8948
2025-08-30 07:31:07 - pico-train - INFO - ├── Learning Rate: 2.56e-05
2025-08-30 07:31:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:31:19 - pico-train - INFO - Step 53300 -- 🔄 Training Metrics
2025-08-30 07:31:19 - pico-train - INFO - ├── Loss: 5.8979
2025-08-30 07:31:19 - pico-train - INFO - ├── Learning Rate: 2.56e-05
2025-08-30 07:31:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:31:32 - pico-train - INFO - Step 53325 -- 🔄 Training Metrics
2025-08-30 07:31:32 - pico-train - INFO - ├── Loss: 5.9572
2025-08-30 07:31:32 - pico-train - INFO - ├── Learning Rate: 2.56e-05
2025-08-30 07:31:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:31:44 - pico-train - INFO - Step 53350 -- 🔄 Training Metrics
2025-08-30 07:31:44 - pico-train - INFO - ├── Loss: 5.8599
2025-08-30 07:31:44 - pico-train - INFO - ├── Learning Rate: 2.56e-05
2025-08-30 07:31:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:31:57 - pico-train - INFO - Step 53375 -- 🔄 Training Metrics
2025-08-30 07:31:57 - pico-train - INFO - ├── Loss: 5.8751
2025-08-30 07:31:57 - pico-train - INFO - ├── Learning Rate: 2.55e-05
2025-08-30 07:31:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:32:09 - pico-train - INFO - Step 53400 -- 🔄 Training Metrics
2025-08-30 07:32:09 - pico-train - INFO - ├── Loss: 5.9950
2025-08-30 07:32:09 - pico-train - INFO - ├── Learning Rate: 2.55e-05
2025-08-30 07:32:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:32:22 - pico-train - INFO - Step 53425 -- 🔄 Training Metrics
2025-08-30 07:32:22 - pico-train - INFO - ├── Loss: 5.9827
2025-08-30 07:32:22 - pico-train - INFO - ├── Learning Rate: 2.55e-05
2025-08-30 07:32:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:32:34 - pico-train - INFO - Step 53450 -- 🔄 Training Metrics
2025-08-30 07:32:34 - pico-train - INFO - ├── Loss: 5.8589
2025-08-30 07:32:34 - pico-train - INFO - ├── Learning Rate: 2.55e-05
2025-08-30 07:32:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:32:47 - pico-train - INFO - Step 53475 -- 🔄 Training Metrics
2025-08-30 07:32:47 - pico-train - INFO - ├── Loss: 5.9415
2025-08-30 07:32:47 - pico-train - INFO - ├── Learning Rate: 2.54e-05
2025-08-30 07:32:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:32:59 - pico-train - INFO - Step 53500 -- 💾 Saving Checkpoint
2025-08-30 07:34:53 - pico-train - INFO - Step 53500 -- 📊 Evaluation Results
2025-08-30 07:34:53 - pico-train - INFO - └── paloma: 5.560195307890516e+29
2025-08-30 07:34:55 - pico-train - INFO - Step 53500 -- 🔄 Training Metrics
2025-08-30 07:34:55 - pico-train - INFO - ├── Loss: 5.8976
2025-08-30 07:34:55 - pico-train - INFO - ├── Learning Rate: 2.54e-05
2025-08-30 07:34:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:34:55 - pico-train - INFO - Step 53500 -- 📈 Saving Learning Dynamics
2025-08-30 07:35:10 - pico-train - INFO - Step 53525 -- 🔄 Training Metrics
2025-08-30 07:35:10 - pico-train - INFO - ├── Loss: 5.9070
2025-08-30 07:35:10 - pico-train - INFO - ├── Learning Rate: 2.54e-05
2025-08-30 07:35:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:35:22 - pico-train - INFO - Step 53550 -- 🔄 Training Metrics
2025-08-30 07:35:22 - pico-train - INFO - ├── Loss: 5.8362
2025-08-30 07:35:22 - pico-train - INFO - ├── Learning Rate: 2.54e-05
2025-08-30 07:35:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:35:35 - pico-train - INFO - Step 53575 -- 🔄 Training Metrics
2025-08-30 07:35:35 - pico-train - INFO - ├── Loss: 5.8874
2025-08-30 07:35:35 - pico-train - INFO - ├── Learning Rate: 2.54e-05
2025-08-30 07:35:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:35:48 - pico-train - INFO - Step 53600 -- 🔄 Training Metrics
2025-08-30 07:35:48 - pico-train - INFO - ├── Loss: 5.8866
2025-08-30 07:35:48 - pico-train - INFO - ├── Learning Rate: 2.53e-05
2025-08-30 07:35:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:36:00 - pico-train - INFO - Step 53625 -- 🔄 Training Metrics
2025-08-30 07:36:00 - pico-train - INFO - ├── Loss: 5.8824
2025-08-30 07:36:00 - pico-train - INFO - ├── Learning Rate: 2.53e-05
2025-08-30 07:36:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:36:13 - pico-train - INFO - Step 53650 -- 🔄 Training Metrics
2025-08-30 07:36:13 - pico-train - INFO - ├── Loss: 5.7949
2025-08-30 07:36:13 - pico-train - INFO - ├── Learning Rate: 2.53e-05
2025-08-30 07:36:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:36:25 - pico-train - INFO - Step 53675 -- 🔄 Training Metrics
2025-08-30 07:36:25 - pico-train - INFO - ├── Loss: 5.9849
2025-08-30 07:36:25 - pico-train - INFO - ├── Learning Rate: 2.53e-05
2025-08-30 07:36:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:36:38 - pico-train - INFO - Step 53700 -- 🔄 Training Metrics
2025-08-30 07:36:38 - pico-train - INFO - ├── Loss: 5.9197
2025-08-30 07:36:38 - pico-train - INFO - ├── Learning Rate: 2.53e-05
2025-08-30 07:36:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:36:50 - pico-train - INFO - Step 53725 -- 🔄 Training Metrics
2025-08-30 07:36:50 - pico-train - INFO - ├── Loss: 5.9326
2025-08-30 07:36:50 - pico-train - INFO - ├── Learning Rate: 2.52e-05
2025-08-30 07:36:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:37:03 - pico-train - INFO - Step 53750 -- 🔄 Training Metrics
2025-08-30 07:37:03 - pico-train - INFO - ├── Loss: 5.8980
2025-08-30 07:37:03 - pico-train - INFO - ├── Learning Rate: 2.52e-05
2025-08-30 07:37:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:37:16 - pico-train - INFO - Step 53775 -- 🔄 Training Metrics
2025-08-30 07:37:16 - pico-train - INFO - ├── Loss: 5.8599
2025-08-30 07:37:16 - pico-train - INFO - ├── Learning Rate: 2.52e-05
2025-08-30 07:37:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:37:28 - pico-train - INFO - Step 53800 -- 🔄 Training Metrics
2025-08-30 07:37:28 - pico-train - INFO - ├── Loss: 5.8844
2025-08-30 07:37:28 - pico-train - INFO - ├── Learning Rate: 2.52e-05
2025-08-30 07:37:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:37:41 - pico-train - INFO - Step 53825 -- 🔄 Training Metrics
2025-08-30 07:37:41 - pico-train - INFO - ├── Loss: 5.9178
2025-08-30 07:37:41 - pico-train - INFO - ├── Learning Rate: 2.51e-05
2025-08-30 07:37:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:37:54 - pico-train - INFO - Step 53850 -- 🔄 Training Metrics
2025-08-30 07:37:54 - pico-train - INFO - ├── Loss: 5.9118
2025-08-30 07:37:54 - pico-train - INFO - ├── Learning Rate: 2.51e-05
2025-08-30 07:37:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:38:06 - pico-train - INFO - Step 53875 -- 🔄 Training Metrics
2025-08-30 07:38:06 - pico-train - INFO - ├── Loss: 5.9270
2025-08-30 07:38:06 - pico-train - INFO - ├── Learning Rate: 2.51e-05
2025-08-30 07:38:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:38:19 - pico-train - INFO - Step 53900 -- 🔄 Training Metrics
2025-08-30 07:38:19 - pico-train - INFO - ├── Loss: 5.8265
2025-08-30 07:38:19 - pico-train - INFO - ├── Learning Rate: 2.51e-05
2025-08-30 07:38:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:38:31 - pico-train - INFO - Step 53925 -- 🔄 Training Metrics
2025-08-30 07:38:31 - pico-train - INFO - ├── Loss: 5.9337
2025-08-30 07:38:31 - pico-train - INFO - ├── Learning Rate: 2.51e-05
2025-08-30 07:38:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:38:44 - pico-train - INFO - Step 53950 -- 🔄 Training Metrics
2025-08-30 07:38:44 - pico-train - INFO - ├── Loss: 5.8446
2025-08-30 07:38:44 - pico-train - INFO - ├── Learning Rate: 2.50e-05
2025-08-30 07:38:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:38:57 - pico-train - INFO - Step 53975 -- 🔄 Training Metrics
2025-08-30 07:38:57 - pico-train - INFO - ├── Loss: 5.8990
2025-08-30 07:38:57 - pico-train - INFO - ├── Learning Rate: 2.50e-05
2025-08-30 07:38:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:39:09 - pico-train - INFO - Step 54000 -- 💾 Saving Checkpoint
2025-08-30 07:41:23 - pico-train - INFO - Step 54000 -- 📊 Evaluation Results
2025-08-30 07:41:23 - pico-train - INFO - └── paloma: 7.742991230238928e+29
2025-08-30 07:41:25 - pico-train - INFO - Step 54000 -- 🔄 Training Metrics
2025-08-30 07:41:25 - pico-train - INFO - ├── Loss: 5.7472
2025-08-30 07:41:25 - pico-train - INFO - ├── Learning Rate: 2.50e-05
2025-08-30 07:41:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:41:25 - pico-train - INFO - Step 54000 -- 📈 Saving Learning Dynamics
2025-08-30 07:41:40 - pico-train - INFO - Step 54025 -- 🔄 Training Metrics
2025-08-30 07:41:40 - pico-train - INFO - ├── Loss: 5.9241
2025-08-30 07:41:40 - pico-train - INFO - ├── Learning Rate: 2.50e-05
2025-08-30 07:41:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:41:53 - pico-train - INFO - Step 54050 -- 🔄 Training Metrics
2025-08-30 07:41:53 - pico-train - INFO - ├── Loss: 5.9308
2025-08-30 07:41:53 - pico-train - INFO - ├── Learning Rate: 2.50e-05
2025-08-30 07:41:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:42:05 - pico-train - INFO - Step 54075 -- 🔄 Training Metrics
2025-08-30 07:42:05 - pico-train - INFO - ├── Loss: 5.9971
2025-08-30 07:42:05 - pico-train - INFO - ├── Learning Rate: 2.49e-05
2025-08-30 07:42:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:42:18 - pico-train - INFO - Step 54100 -- 🔄 Training Metrics
2025-08-30 07:42:18 - pico-train - INFO - ├── Loss: 5.9050
2025-08-30 07:42:18 - pico-train - INFO - ├── Learning Rate: 2.49e-05
2025-08-30 07:42:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:42:31 - pico-train - INFO - Step 54125 -- 🔄 Training Metrics
2025-08-30 07:42:31 - pico-train - INFO - ├── Loss: 6.0286
2025-08-30 07:42:31 - pico-train - INFO - ├── Learning Rate: 2.49e-05
2025-08-30 07:42:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:42:43 - pico-train - INFO - Step 54150 -- 🔄 Training Metrics
2025-08-30 07:42:43 - pico-train - INFO - ├── Loss: 5.9402
2025-08-30 07:42:43 - pico-train - INFO - ├── Learning Rate: 2.49e-05
2025-08-30 07:42:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:42:56 - pico-train - INFO - Step 54175 -- 🔄 Training Metrics
2025-08-30 07:42:56 - pico-train - INFO - ├── Loss: 5.8842
2025-08-30 07:42:56 - pico-train - INFO - ├── Learning Rate: 2.49e-05
2025-08-30 07:42:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:43:08 - pico-train - INFO - Step 54200 -- 🔄 Training Metrics
2025-08-30 07:43:08 - pico-train - INFO - ├── Loss: 5.9398
2025-08-30 07:43:08 - pico-train - INFO - ├── Learning Rate: 2.48e-05
2025-08-30 07:43:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:43:21 - pico-train - INFO - Step 54225 -- 🔄 Training Metrics
2025-08-30 07:43:21 - pico-train - INFO - ├── Loss: 5.8637
2025-08-30 07:43:21 - pico-train - INFO - ├── Learning Rate: 2.48e-05
2025-08-30 07:43:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:43:34 - pico-train - INFO - Step 54250 -- 🔄 Training Metrics
2025-08-30 07:43:34 - pico-train - INFO - ├── Loss: 5.8928
2025-08-30 07:43:34 - pico-train - INFO - ├── Learning Rate: 2.48e-05
2025-08-30 07:43:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:43:47 - pico-train - INFO - Step 54275 -- 🔄 Training Metrics
2025-08-30 07:43:47 - pico-train - INFO - ├── Loss: 5.8749
2025-08-30 07:43:47 - pico-train - INFO - ├── Learning Rate: 2.48e-05
2025-08-30 07:43:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:43:59 - pico-train - INFO - Step 54300 -- 🔄 Training Metrics
2025-08-30 07:43:59 - pico-train - INFO - ├── Loss: 5.8800
2025-08-30 07:43:59 - pico-train - INFO - ├── Learning Rate: 2.47e-05
2025-08-30 07:43:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:44:12 - pico-train - INFO - Step 54325 -- 🔄 Training Metrics
2025-08-30 07:44:12 - pico-train - INFO - ├── Loss: 5.9748
2025-08-30 07:44:12 - pico-train - INFO - ├── Learning Rate: 2.47e-05
2025-08-30 07:44:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:44:25 - pico-train - INFO - Step 54350 -- 🔄 Training Metrics
2025-08-30 07:44:25 - pico-train - INFO - ├── Loss: 5.8758
2025-08-30 07:44:25 - pico-train - INFO - ├── Learning Rate: 2.47e-05
2025-08-30 07:44:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:44:37 - pico-train - INFO - Step 54375 -- 🔄 Training Metrics
2025-08-30 07:44:37 - pico-train - INFO - ├── Loss: 6.0149
2025-08-30 07:44:37 - pico-train - INFO - ├── Learning Rate: 2.47e-05
2025-08-30 07:44:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:44:50 - pico-train - INFO - Step 54400 -- 🔄 Training Metrics
2025-08-30 07:44:50 - pico-train - INFO - ├── Loss: 5.9165
2025-08-30 07:44:50 - pico-train - INFO - ├── Learning Rate: 2.47e-05
2025-08-30 07:44:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:45:02 - pico-train - INFO - Step 54425 -- 🔄 Training Metrics
2025-08-30 07:45:02 - pico-train - INFO - ├── Loss: 5.8508
2025-08-30 07:45:02 - pico-train - INFO - ├── Learning Rate: 2.46e-05
2025-08-30 07:45:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:45:15 - pico-train - INFO - Step 54450 -- 🔄 Training Metrics
2025-08-30 07:45:15 - pico-train - INFO - ├── Loss: 5.9284
2025-08-30 07:45:15 - pico-train - INFO - ├── Learning Rate: 2.46e-05
2025-08-30 07:45:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:45:27 - pico-train - INFO - Step 54475 -- 🔄 Training Metrics
2025-08-30 07:45:27 - pico-train - INFO - ├── Loss: 5.9071
2025-08-30 07:45:27 - pico-train - INFO - ├── Learning Rate: 2.46e-05
2025-08-30 07:45:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:45:39 - pico-train - INFO - Step 54500 -- 💾 Saving Checkpoint
2025-08-30 07:47:45 - pico-train - INFO - Step 54500 -- 📊 Evaluation Results
2025-08-30 07:47:45 - pico-train - INFO - └── paloma: 9.839335327293338e+29
2025-08-30 07:47:47 - pico-train - INFO - Step 54500 -- 🔄 Training Metrics
2025-08-30 07:47:47 - pico-train - INFO - ├── Loss: 5.8753
2025-08-30 07:47:47 - pico-train - INFO - ├── Learning Rate: 2.46e-05
2025-08-30 07:47:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:47:47 - pico-train - INFO - Step 54500 -- 📈 Saving Learning Dynamics
2025-08-30 07:48:03 - pico-train - INFO - Step 54525 -- 🔄 Training Metrics
2025-08-30 07:48:03 - pico-train - INFO - ├── Loss: 5.9132
2025-08-30 07:48:03 - pico-train - INFO - ├── Learning Rate: 2.46e-05
2025-08-30 07:48:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:48:16 - pico-train - INFO - Step 54550 -- 🔄 Training Metrics
2025-08-30 07:48:16 - pico-train - INFO - ├── Loss: 5.9826
2025-08-30 07:48:16 - pico-train - INFO - ├── Learning Rate: 2.45e-05
2025-08-30 07:48:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:48:28 - pico-train - INFO - Step 54575 -- 🔄 Training Metrics
2025-08-30 07:48:28 - pico-train - INFO - ├── Loss: 5.8963
2025-08-30 07:48:28 - pico-train - INFO - ├── Learning Rate: 2.45e-05
2025-08-30 07:48:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:48:41 - pico-train - INFO - Step 54600 -- 🔄 Training Metrics
2025-08-30 07:48:41 - pico-train - INFO - ├── Loss: 5.9433
2025-08-30 07:48:41 - pico-train - INFO - ├── Learning Rate: 2.45e-05
2025-08-30 07:48:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:48:53 - pico-train - INFO - Step 54625 -- 🔄 Training Metrics
2025-08-30 07:48:53 - pico-train - INFO - ├── Loss: 5.9281
2025-08-30 07:48:53 - pico-train - INFO - ├── Learning Rate: 2.45e-05
2025-08-30 07:48:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:49:06 - pico-train - INFO - Step 54650 -- 🔄 Training Metrics
2025-08-30 07:49:06 - pico-train - INFO - ├── Loss: 5.8462
2025-08-30 07:49:06 - pico-train - INFO - ├── Learning Rate: 2.44e-05
2025-08-30 07:49:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:49:19 - pico-train - INFO - Step 54675 -- 🔄 Training Metrics
2025-08-30 07:49:19 - pico-train - INFO - ├── Loss: 5.9508
2025-08-30 07:49:19 - pico-train - INFO - ├── Learning Rate: 2.44e-05
2025-08-30 07:49:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:49:31 - pico-train - INFO - Step 54700 -- 🔄 Training Metrics
2025-08-30 07:49:31 - pico-train - INFO - ├── Loss: 5.8880
2025-08-30 07:49:31 - pico-train - INFO - ├── Learning Rate: 2.44e-05
2025-08-30 07:49:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:49:44 - pico-train - INFO - Step 54725 -- 🔄 Training Metrics
2025-08-30 07:49:44 - pico-train - INFO - ├── Loss: 5.8829
2025-08-30 07:49:44 - pico-train - INFO - ├── Learning Rate: 2.44e-05
2025-08-30 07:49:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:49:57 - pico-train - INFO - Step 54750 -- 🔄 Training Metrics
2025-08-30 07:49:57 - pico-train - INFO - ├── Loss: 5.9466
2025-08-30 07:49:57 - pico-train - INFO - ├── Learning Rate: 2.44e-05
2025-08-30 07:49:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:50:10 - pico-train - INFO - Step 54775 -- 🔄 Training Metrics
2025-08-30 07:50:10 - pico-train - INFO - ├── Loss: 5.9607
2025-08-30 07:50:10 - pico-train - INFO - ├── Learning Rate: 2.43e-05
2025-08-30 07:50:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:50:22 - pico-train - INFO - Step 54800 -- 🔄 Training Metrics
2025-08-30 07:50:22 - pico-train - INFO - ├── Loss: 5.9967
2025-08-30 07:50:22 - pico-train - INFO - ├── Learning Rate: 2.43e-05
2025-08-30 07:50:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:50:35 - pico-train - INFO - Step 54825 -- 🔄 Training Metrics
2025-08-30 07:50:35 - pico-train - INFO - ├── Loss: 5.8599
2025-08-30 07:50:35 - pico-train - INFO - ├── Learning Rate: 2.43e-05
2025-08-30 07:50:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:50:48 - pico-train - INFO - Step 54850 -- 🔄 Training Metrics
2025-08-30 07:50:48 - pico-train - INFO - ├── Loss: 5.9756
2025-08-30 07:50:48 - pico-train - INFO - ├── Learning Rate: 2.43e-05
2025-08-30 07:50:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:51:00 - pico-train - INFO - Step 54875 -- 🔄 Training Metrics
2025-08-30 07:51:00 - pico-train - INFO - ├── Loss: 5.8856
2025-08-30 07:51:00 - pico-train - INFO - ├── Learning Rate: 2.43e-05
2025-08-30 07:51:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:51:13 - pico-train - INFO - Step 54900 -- 🔄 Training Metrics
2025-08-30 07:51:13 - pico-train - INFO - ├── Loss: 5.9306
2025-08-30 07:51:13 - pico-train - INFO - ├── Learning Rate: 2.42e-05
2025-08-30 07:51:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:51:26 - pico-train - INFO - Step 54925 -- 🔄 Training Metrics
2025-08-30 07:51:26 - pico-train - INFO - ├── Loss: 6.0266
2025-08-30 07:51:26 - pico-train - INFO - ├── Learning Rate: 2.42e-05
2025-08-30 07:51:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:51:38 - pico-train - INFO - Step 54950 -- 🔄 Training Metrics
2025-08-30 07:51:38 - pico-train - INFO - ├── Loss: 5.9054
2025-08-30 07:51:38 - pico-train - INFO - ├── Learning Rate: 2.42e-05
2025-08-30 07:51:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:51:51 - pico-train - INFO - Step 54975 -- 🔄 Training Metrics
2025-08-30 07:51:51 - pico-train - INFO - ├── Loss: 5.8885
2025-08-30 07:51:51 - pico-train - INFO - ├── Learning Rate: 2.42e-05
2025-08-30 07:51:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:52:03 - pico-train - INFO - Step 55000 -- 💾 Saving Checkpoint
2025-08-30 07:54:06 - pico-train - INFO - Step 55000 -- 📊 Evaluation Results
2025-08-30 07:54:06 - pico-train - INFO - └── paloma: 1.0447307155558866e+30
2025-08-30 07:54:07 - pico-train - INFO - Step 55000 -- 🔄 Training Metrics
2025-08-30 07:54:07 - pico-train - INFO - ├── Loss: 6.0147
2025-08-30 07:54:07 - pico-train - INFO - ├── Learning Rate: 2.41e-05
2025-08-30 07:54:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:54:07 - pico-train - INFO - Step 55000 -- 📈 Saving Learning Dynamics
2025-08-30 07:54:22 - pico-train - INFO - Step 55025 -- 🔄 Training Metrics
2025-08-30 07:54:22 - pico-train - INFO - ├── Loss: 5.9628
2025-08-30 07:54:22 - pico-train - INFO - ├── Learning Rate: 2.41e-05
2025-08-30 07:54:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:54:34 - pico-train - INFO - Step 55050 -- 🔄 Training Metrics
2025-08-30 07:54:34 - pico-train - INFO - ├── Loss: 5.9456
2025-08-30 07:54:34 - pico-train - INFO - ├── Learning Rate: 2.41e-05
2025-08-30 07:54:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:54:47 - pico-train - INFO - Step 55075 -- 🔄 Training Metrics
2025-08-30 07:54:47 - pico-train - INFO - ├── Loss: 5.9061
2025-08-30 07:54:47 - pico-train - INFO - ├── Learning Rate: 2.41e-05
2025-08-30 07:54:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:55:00 - pico-train - INFO - Step 55100 -- 🔄 Training Metrics
2025-08-30 07:55:00 - pico-train - INFO - ├── Loss: 5.9604
2025-08-30 07:55:00 - pico-train - INFO - ├── Learning Rate: 2.41e-05
2025-08-30 07:55:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:55:12 - pico-train - INFO - Step 55125 -- 🔄 Training Metrics
2025-08-30 07:55:12 - pico-train - INFO - ├── Loss: 5.8649
2025-08-30 07:55:12 - pico-train - INFO - ├── Learning Rate: 2.40e-05
2025-08-30 07:55:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:55:25 - pico-train - INFO - Step 55150 -- 🔄 Training Metrics
2025-08-30 07:55:25 - pico-train - INFO - ├── Loss: 5.8123
2025-08-30 07:55:25 - pico-train - INFO - ├── Learning Rate: 2.40e-05
2025-08-30 07:55:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:55:37 - pico-train - INFO - Step 55175 -- 🔄 Training Metrics
2025-08-30 07:55:37 - pico-train - INFO - ├── Loss: 5.9016
2025-08-30 07:55:37 - pico-train - INFO - ├── Learning Rate: 2.40e-05
2025-08-30 07:55:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:55:50 - pico-train - INFO - Step 55200 -- 🔄 Training Metrics
2025-08-30 07:55:50 - pico-train - INFO - ├── Loss: 5.9233
2025-08-30 07:55:50 - pico-train - INFO - ├── Learning Rate: 2.40e-05
2025-08-30 07:55:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:56:03 - pico-train - INFO - Step 55225 -- 🔄 Training Metrics
2025-08-30 07:56:03 - pico-train - INFO - ├── Loss: 5.8768
2025-08-30 07:56:03 - pico-train - INFO - ├── Learning Rate: 2.40e-05
2025-08-30 07:56:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:56:15 - pico-train - INFO - Step 55250 -- 🔄 Training Metrics
2025-08-30 07:56:15 - pico-train - INFO - ├── Loss: 5.9265
2025-08-30 07:56:15 - pico-train - INFO - ├── Learning Rate: 2.39e-05
2025-08-30 07:56:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:56:28 - pico-train - INFO - Step 55275 -- 🔄 Training Metrics
2025-08-30 07:56:28 - pico-train - INFO - ├── Loss: 5.8929
2025-08-30 07:56:28 - pico-train - INFO - ├── Learning Rate: 2.39e-05
2025-08-30 07:56:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:56:41 - pico-train - INFO - Step 55300 -- 🔄 Training Metrics
2025-08-30 07:56:41 - pico-train - INFO - ├── Loss: 5.8863
2025-08-30 07:56:41 - pico-train - INFO - ├── Learning Rate: 2.39e-05
2025-08-30 07:56:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:56:53 - pico-train - INFO - Step 55325 -- 🔄 Training Metrics
2025-08-30 07:56:53 - pico-train - INFO - ├── Loss: 5.8588
2025-08-30 07:56:53 - pico-train - INFO - ├── Learning Rate: 2.39e-05
2025-08-30 07:56:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:57:06 - pico-train - INFO - Step 55350 -- 🔄 Training Metrics
2025-08-30 07:57:06 - pico-train - INFO - ├── Loss: 5.8740
2025-08-30 07:57:06 - pico-train - INFO - ├── Learning Rate: 2.38e-05
2025-08-30 07:57:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:57:19 - pico-train - INFO - Step 55375 -- 🔄 Training Metrics
2025-08-30 07:57:19 - pico-train - INFO - ├── Loss: 5.9163
2025-08-30 07:57:19 - pico-train - INFO - ├── Learning Rate: 2.38e-05
2025-08-30 07:57:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:57:32 - pico-train - INFO - Step 55400 -- 🔄 Training Metrics
2025-08-30 07:57:32 - pico-train - INFO - ├── Loss: 5.8563
2025-08-30 07:57:32 - pico-train - INFO - ├── Learning Rate: 2.38e-05
2025-08-30 07:57:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:57:44 - pico-train - INFO - Step 55425 -- 🔄 Training Metrics
2025-08-30 07:57:44 - pico-train - INFO - ├── Loss: 5.8964
2025-08-30 07:57:44 - pico-train - INFO - ├── Learning Rate: 2.38e-05
2025-08-30 07:57:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:57:57 - pico-train - INFO - Step 55450 -- 🔄 Training Metrics
2025-08-30 07:57:57 - pico-train - INFO - ├── Loss: 5.9860
2025-08-30 07:57:57 - pico-train - INFO - ├── Learning Rate: 2.38e-05
2025-08-30 07:57:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:58:09 - pico-train - INFO - Step 55475 -- 🔄 Training Metrics
2025-08-30 07:58:09 - pico-train - INFO - ├── Loss: 5.9124
2025-08-30 07:58:09 - pico-train - INFO - ├── Learning Rate: 2.37e-05
2025-08-30 07:58:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 07:58:22 - pico-train - INFO - Step 55500 -- 💾 Saving Checkpoint
2025-08-30 08:00:23 - pico-train - INFO - Step 55500 -- 📊 Evaluation Results
2025-08-30 08:00:23 - pico-train - INFO - └── paloma: 1.3871906758809066e+30
2025-08-30 08:00:33 - pico-train - INFO - Step 55500 -- 🔄 Training Metrics
2025-08-30 08:00:33 - pico-train - INFO - ├── Loss: 5.9344
2025-08-30 08:00:33 - pico-train - INFO - ├── Learning Rate: 2.37e-05
2025-08-30 08:00:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:00:33 - pico-train - INFO - Step 55500 -- 📈 Saving Learning Dynamics
2025-08-30 08:00:48 - pico-train - INFO - Step 55525 -- 🔄 Training Metrics
2025-08-30 08:00:48 - pico-train - INFO - ├── Loss: 5.8185
2025-08-30 08:00:48 - pico-train - INFO - ├── Learning Rate: 2.37e-05
2025-08-30 08:00:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:01:00 - pico-train - INFO - Step 55550 -- 🔄 Training Metrics
2025-08-30 08:01:00 - pico-train - INFO - ├── Loss: 5.8907
2025-08-30 08:01:00 - pico-train - INFO - ├── Learning Rate: 2.37e-05
2025-08-30 08:01:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:01:13 - pico-train - INFO - Step 55575 -- 🔄 Training Metrics
2025-08-30 08:01:13 - pico-train - INFO - ├── Loss: 5.8578
2025-08-30 08:01:13 - pico-train - INFO - ├── Learning Rate: 2.37e-05
2025-08-30 08:01:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:01:25 - pico-train - INFO - Step 55600 -- 🔄 Training Metrics
2025-08-30 08:01:25 - pico-train - INFO - ├── Loss: 5.8699
2025-08-30 08:01:25 - pico-train - INFO - ├── Learning Rate: 2.36e-05
2025-08-30 08:01:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:01:38 - pico-train - INFO - Step 55625 -- 🔄 Training Metrics
2025-08-30 08:01:38 - pico-train - INFO - ├── Loss: 5.9257
2025-08-30 08:01:38 - pico-train - INFO - ├── Learning Rate: 2.36e-05
2025-08-30 08:01:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:01:50 - pico-train - INFO - Step 55650 -- 🔄 Training Metrics
2025-08-30 08:01:50 - pico-train - INFO - ├── Loss: 5.9346
2025-08-30 08:01:50 - pico-train - INFO - ├── Learning Rate: 2.36e-05
2025-08-30 08:01:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:02:03 - pico-train - INFO - Step 55675 -- 🔄 Training Metrics
2025-08-30 08:02:03 - pico-train - INFO - ├── Loss: 5.9879
2025-08-30 08:02:03 - pico-train - INFO - ├── Learning Rate: 2.36e-05
2025-08-30 08:02:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:02:15 - pico-train - INFO - Step 55700 -- 🔄 Training Metrics
2025-08-30 08:02:15 - pico-train - INFO - ├── Loss: 5.9003
2025-08-30 08:02:15 - pico-train - INFO - ├── Learning Rate: 2.35e-05
2025-08-30 08:02:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:02:28 - pico-train - INFO - Step 55725 -- 🔄 Training Metrics
2025-08-30 08:02:28 - pico-train - INFO - ├── Loss: 5.9490
2025-08-30 08:02:28 - pico-train - INFO - ├── Learning Rate: 2.35e-05
2025-08-30 08:02:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:02:40 - pico-train - INFO - Step 55750 -- 🔄 Training Metrics
2025-08-30 08:02:40 - pico-train - INFO - ├── Loss: 5.8409
2025-08-30 08:02:40 - pico-train - INFO - ├── Learning Rate: 2.35e-05
2025-08-30 08:02:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:02:53 - pico-train - INFO - Step 55775 -- 🔄 Training Metrics
2025-08-30 08:02:53 - pico-train - INFO - ├── Loss: 5.9248
2025-08-30 08:02:53 - pico-train - INFO - ├── Learning Rate: 2.35e-05
2025-08-30 08:02:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:03:06 - pico-train - INFO - Step 55800 -- 🔄 Training Metrics
2025-08-30 08:03:06 - pico-train - INFO - ├── Loss: 5.8427
2025-08-30 08:03:06 - pico-train - INFO - ├── Learning Rate: 2.35e-05
2025-08-30 08:03:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:03:18 - pico-train - INFO - Step 55825 -- 🔄 Training Metrics
2025-08-30 08:03:18 - pico-train - INFO - ├── Loss: 5.9812
2025-08-30 08:03:18 - pico-train - INFO - ├── Learning Rate: 2.34e-05
2025-08-30 08:03:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:03:31 - pico-train - INFO - Step 55850 -- 🔄 Training Metrics
2025-08-30 08:03:31 - pico-train - INFO - ├── Loss: 5.8846
2025-08-30 08:03:31 - pico-train - INFO - ├── Learning Rate: 2.34e-05
2025-08-30 08:03:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:03:44 - pico-train - INFO - Step 55875 -- 🔄 Training Metrics
2025-08-30 08:03:44 - pico-train - INFO - ├── Loss: 5.8634
2025-08-30 08:03:44 - pico-train - INFO - ├── Learning Rate: 2.34e-05
2025-08-30 08:03:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:03:56 - pico-train - INFO - Step 55900 -- 🔄 Training Metrics
2025-08-30 08:03:56 - pico-train - INFO - ├── Loss: 5.8900
2025-08-30 08:03:56 - pico-train - INFO - ├── Learning Rate: 2.34e-05
2025-08-30 08:03:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:04:09 - pico-train - INFO - Step 55925 -- 🔄 Training Metrics
2025-08-30 08:04:09 - pico-train - INFO - ├── Loss: 5.8378
2025-08-30 08:04:09 - pico-train - INFO - ├── Learning Rate: 2.34e-05
2025-08-30 08:04:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:04:21 - pico-train - INFO - Step 55950 -- 🔄 Training Metrics
2025-08-30 08:04:21 - pico-train - INFO - ├── Loss: 5.8298
2025-08-30 08:04:21 - pico-train - INFO - ├── Learning Rate: 2.33e-05
2025-08-30 08:04:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:04:34 - pico-train - INFO - Step 55975 -- 🔄 Training Metrics
2025-08-30 08:04:34 - pico-train - INFO - ├── Loss: 6.0091
2025-08-30 08:04:34 - pico-train - INFO - ├── Learning Rate: 2.33e-05
2025-08-30 08:04:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:04:46 - pico-train - INFO - Step 56000 -- 💾 Saving Checkpoint
2025-08-30 08:06:44 - pico-train - INFO - Step 56000 -- 📊 Evaluation Results
2025-08-30 08:06:44 - pico-train - INFO - └── paloma: 1.5920277240703402e+30
2025-08-30 08:06:45 - pico-train - INFO - Step 56000 -- 🔄 Training Metrics
2025-08-30 08:06:45 - pico-train - INFO - ├── Loss: 5.9868
2025-08-30 08:06:45 - pico-train - INFO - ├── Learning Rate: 2.33e-05
2025-08-30 08:06:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:06:45 - pico-train - INFO - Step 56000 -- 📈 Saving Learning Dynamics
2025-08-30 08:07:00 - pico-train - INFO - Step 56025 -- 🔄 Training Metrics
2025-08-30 08:07:00 - pico-train - INFO - ├── Loss: 5.9389
2025-08-30 08:07:00 - pico-train - INFO - ├── Learning Rate: 2.33e-05
2025-08-30 08:07:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:07:13 - pico-train - INFO - Step 56050 -- 🔄 Training Metrics
2025-08-30 08:07:13 - pico-train - INFO - ├── Loss: 5.8835
2025-08-30 08:07:13 - pico-train - INFO - ├── Learning Rate: 2.33e-05
2025-08-30 08:07:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:07:25 - pico-train - INFO - Step 56075 -- 🔄 Training Metrics
2025-08-30 08:07:25 - pico-train - INFO - ├── Loss: 5.8286
2025-08-30 08:07:25 - pico-train - INFO - ├── Learning Rate: 2.32e-05
2025-08-30 08:07:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:07:38 - pico-train - INFO - Step 56100 -- 🔄 Training Metrics
2025-08-30 08:07:38 - pico-train - INFO - ├── Loss: 5.8313
2025-08-30 08:07:38 - pico-train - INFO - ├── Learning Rate: 2.32e-05
2025-08-30 08:07:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:07:51 - pico-train - INFO - Step 56125 -- 🔄 Training Metrics
2025-08-30 08:07:51 - pico-train - INFO - ├── Loss: 5.8921
2025-08-30 08:07:51 - pico-train - INFO - ├── Learning Rate: 2.32e-05
2025-08-30 08:07:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:08:03 - pico-train - INFO - Step 56150 -- 🔄 Training Metrics
2025-08-30 08:08:03 - pico-train - INFO - ├── Loss: 5.8274
2025-08-30 08:08:03 - pico-train - INFO - ├── Learning Rate: 2.32e-05
2025-08-30 08:08:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:08:16 - pico-train - INFO - Step 56175 -- 🔄 Training Metrics
2025-08-30 08:08:16 - pico-train - INFO - ├── Loss: 5.9244
2025-08-30 08:08:16 - pico-train - INFO - ├── Learning Rate: 2.31e-05
2025-08-30 08:08:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:08:29 - pico-train - INFO - Step 56200 -- 🔄 Training Metrics
2025-08-30 08:08:29 - pico-train - INFO - ├── Loss: 6.0019
2025-08-30 08:08:29 - pico-train - INFO - ├── Learning Rate: 2.31e-05
2025-08-30 08:08:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:08:41 - pico-train - INFO - Step 56225 -- 🔄 Training Metrics
2025-08-30 08:08:41 - pico-train - INFO - ├── Loss: 5.8920
2025-08-30 08:08:41 - pico-train - INFO - ├── Learning Rate: 2.31e-05
2025-08-30 08:08:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:08:54 - pico-train - INFO - Step 56250 -- 🔄 Training Metrics
2025-08-30 08:08:54 - pico-train - INFO - ├── Loss: 5.8811
2025-08-30 08:08:54 - pico-train - INFO - ├── Learning Rate: 2.31e-05
2025-08-30 08:08:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:09:07 - pico-train - INFO - Step 56275 -- 🔄 Training Metrics
2025-08-30 08:09:07 - pico-train - INFO - ├── Loss: 5.9166
2025-08-30 08:09:07 - pico-train - INFO - ├── Learning Rate: 2.31e-05
2025-08-30 08:09:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:09:19 - pico-train - INFO - Step 56300 -- 🔄 Training Metrics
2025-08-30 08:09:19 - pico-train - INFO - ├── Loss: 5.8974
2025-08-30 08:09:19 - pico-train - INFO - ├── Learning Rate: 2.30e-05
2025-08-30 08:09:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:09:32 - pico-train - INFO - Step 56325 -- 🔄 Training Metrics
2025-08-30 08:09:32 - pico-train - INFO - ├── Loss: 5.8989
2025-08-30 08:09:32 - pico-train - INFO - ├── Learning Rate: 2.30e-05
2025-08-30 08:09:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:09:45 - pico-train - INFO - Step 56350 -- 🔄 Training Metrics
2025-08-30 08:09:45 - pico-train - INFO - ├── Loss: 5.8976
2025-08-30 08:09:45 - pico-train - INFO - ├── Learning Rate: 2.30e-05
2025-08-30 08:09:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:09:57 - pico-train - INFO - Step 56375 -- 🔄 Training Metrics
2025-08-30 08:09:57 - pico-train - INFO - ├── Loss: 5.9189
2025-08-30 08:09:57 - pico-train - INFO - ├── Learning Rate: 2.30e-05
2025-08-30 08:09:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:10:10 - pico-train - INFO - Step 56400 -- 🔄 Training Metrics
2025-08-30 08:10:10 - pico-train - INFO - ├── Loss: 5.8489
2025-08-30 08:10:10 - pico-train - INFO - ├── Learning Rate: 2.30e-05
2025-08-30 08:10:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:10:23 - pico-train - INFO - Step 56425 -- 🔄 Training Metrics
2025-08-30 08:10:23 - pico-train - INFO - ├── Loss: 5.9099
2025-08-30 08:10:23 - pico-train - INFO - ├── Learning Rate: 2.29e-05
2025-08-30 08:10:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:10:35 - pico-train - INFO - Step 56450 -- 🔄 Training Metrics
2025-08-30 08:10:35 - pico-train - INFO - ├── Loss: 5.8612
2025-08-30 08:10:35 - pico-train - INFO - ├── Learning Rate: 2.29e-05
2025-08-30 08:10:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:10:48 - pico-train - INFO - Step 56475 -- 🔄 Training Metrics
2025-08-30 08:10:48 - pico-train - INFO - ├── Loss: 5.8795
2025-08-30 08:10:48 - pico-train - INFO - ├── Learning Rate: 2.29e-05
2025-08-30 08:10:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:11:00 - pico-train - INFO - Step 56500 -- 💾 Saving Checkpoint
2025-08-30 08:12:53 - pico-train - INFO - Step 56500 -- 📊 Evaluation Results
2025-08-30 08:12:53 - pico-train - INFO - └── paloma: 1.7892090663438402e+30
2025-08-30 08:12:55 - pico-train - INFO - Step 56500 -- 🔄 Training Metrics
2025-08-30 08:12:55 - pico-train - INFO - ├── Loss: 5.8945
2025-08-30 08:12:55 - pico-train - INFO - ├── Learning Rate: 2.29e-05
2025-08-30 08:12:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:12:55 - pico-train - INFO - Step 56500 -- 📈 Saving Learning Dynamics
2025-08-30 08:13:10 - pico-train - INFO - Step 56525 -- 🔄 Training Metrics
2025-08-30 08:13:10 - pico-train - INFO - ├── Loss: 5.8448
2025-08-30 08:13:10 - pico-train - INFO - ├── Learning Rate: 2.28e-05
2025-08-30 08:13:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:13:23 - pico-train - INFO - Step 56550 -- 🔄 Training Metrics
2025-08-30 08:13:23 - pico-train - INFO - ├── Loss: 5.8696
2025-08-30 08:13:23 - pico-train - INFO - ├── Learning Rate: 2.28e-05
2025-08-30 08:13:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:13:35 - pico-train - INFO - Step 56575 -- 🔄 Training Metrics
2025-08-30 08:13:35 - pico-train - INFO - ├── Loss: 5.8567
2025-08-30 08:13:35 - pico-train - INFO - ├── Learning Rate: 2.28e-05
2025-08-30 08:13:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:13:48 - pico-train - INFO - Step 56600 -- 🔄 Training Metrics
2025-08-30 08:13:48 - pico-train - INFO - ├── Loss: 5.8884
2025-08-30 08:13:48 - pico-train - INFO - ├── Learning Rate: 2.28e-05
2025-08-30 08:13:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:14:01 - pico-train - INFO - Step 56625 -- 🔄 Training Metrics
2025-08-30 08:14:01 - pico-train - INFO - ├── Loss: 5.9933
2025-08-30 08:14:01 - pico-train - INFO - ├── Learning Rate: 2.28e-05
2025-08-30 08:14:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:14:13 - pico-train - INFO - Step 56650 -- 🔄 Training Metrics
2025-08-30 08:14:13 - pico-train - INFO - ├── Loss: 5.8716
2025-08-30 08:14:13 - pico-train - INFO - ├── Learning Rate: 2.27e-05
2025-08-30 08:14:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:14:26 - pico-train - INFO - Step 56675 -- 🔄 Training Metrics
2025-08-30 08:14:26 - pico-train - INFO - ├── Loss: 5.9089
2025-08-30 08:14:26 - pico-train - INFO - ├── Learning Rate: 2.27e-05
2025-08-30 08:14:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:14:39 - pico-train - INFO - Step 56700 -- 🔄 Training Metrics
2025-08-30 08:14:39 - pico-train - INFO - ├── Loss: 5.8769
2025-08-30 08:14:39 - pico-train - INFO - ├── Learning Rate: 2.27e-05
2025-08-30 08:14:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:14:51 - pico-train - INFO - Step 56725 -- 🔄 Training Metrics
2025-08-30 08:14:51 - pico-train - INFO - ├── Loss: 5.8934
2025-08-30 08:14:51 - pico-train - INFO - ├── Learning Rate: 2.27e-05
2025-08-30 08:14:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:15:04 - pico-train - INFO - Step 56750 -- 🔄 Training Metrics
2025-08-30 08:15:04 - pico-train - INFO - ├── Loss: 5.8711
2025-08-30 08:15:04 - pico-train - INFO - ├── Learning Rate: 2.27e-05
2025-08-30 08:15:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:15:16 - pico-train - INFO - Step 56775 -- 🔄 Training Metrics
2025-08-30 08:15:16 - pico-train - INFO - ├── Loss: 5.8866
2025-08-30 08:15:16 - pico-train - INFO - ├── Learning Rate: 2.26e-05
2025-08-30 08:15:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:15:29 - pico-train - INFO - Step 56800 -- 🔄 Training Metrics
2025-08-30 08:15:29 - pico-train - INFO - ├── Loss: 5.9154
2025-08-30 08:15:29 - pico-train - INFO - ├── Learning Rate: 2.26e-05
2025-08-30 08:15:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:15:42 - pico-train - INFO - Step 56825 -- 🔄 Training Metrics
2025-08-30 08:15:42 - pico-train - INFO - ├── Loss: 5.8844
2025-08-30 08:15:42 - pico-train - INFO - ├── Learning Rate: 2.26e-05
2025-08-30 08:15:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:15:54 - pico-train - INFO - Step 56850 -- 🔄 Training Metrics
2025-08-30 08:15:54 - pico-train - INFO - ├── Loss: 5.9142
2025-08-30 08:15:54 - pico-train - INFO - ├── Learning Rate: 2.26e-05
2025-08-30 08:15:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:16:07 - pico-train - INFO - Step 56875 -- 🔄 Training Metrics
2025-08-30 08:16:07 - pico-train - INFO - ├── Loss: 5.8741
2025-08-30 08:16:07 - pico-train - INFO - ├── Learning Rate: 2.25e-05
2025-08-30 08:16:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:16:20 - pico-train - INFO - Step 56900 -- 🔄 Training Metrics
2025-08-30 08:16:20 - pico-train - INFO - ├── Loss: 5.9399
2025-08-30 08:16:20 - pico-train - INFO - ├── Learning Rate: 2.25e-05
2025-08-30 08:16:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:16:32 - pico-train - INFO - Step 56925 -- 🔄 Training Metrics
2025-08-30 08:16:32 - pico-train - INFO - ├── Loss: 5.8245
2025-08-30 08:16:32 - pico-train - INFO - ├── Learning Rate: 2.25e-05
2025-08-30 08:16:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:16:45 - pico-train - INFO - Step 56950 -- 🔄 Training Metrics
2025-08-30 08:16:45 - pico-train - INFO - ├── Loss: 5.9157
2025-08-30 08:16:45 - pico-train - INFO - ├── Learning Rate: 2.25e-05
2025-08-30 08:16:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:16:58 - pico-train - INFO - Step 56975 -- 🔄 Training Metrics
2025-08-30 08:16:58 - pico-train - INFO - ├── Loss: 5.8869
2025-08-30 08:16:58 - pico-train - INFO - ├── Learning Rate: 2.25e-05
2025-08-30 08:16:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:17:10 - pico-train - INFO - Step 57000 -- 💾 Saving Checkpoint
2025-08-30 08:19:09 - pico-train - INFO - Step 57000 -- 📊 Evaluation Results
2025-08-30 08:19:09 - pico-train - INFO - └── paloma: 2.2911292914273982e+30
2025-08-30 08:19:12 - pico-train - INFO - Step 57000 -- 🔄 Training Metrics
2025-08-30 08:19:12 - pico-train - INFO - ├── Loss: 5.8465
2025-08-30 08:19:12 - pico-train - INFO - ├── Learning Rate: 2.24e-05
2025-08-30 08:19:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:19:12 - pico-train - INFO - Step 57000 -- 📈 Saving Learning Dynamics
2025-08-30 08:19:27 - pico-train - INFO - Step 57025 -- 🔄 Training Metrics
2025-08-30 08:19:27 - pico-train - INFO - ├── Loss: 5.8565
2025-08-30 08:19:27 - pico-train - INFO - ├── Learning Rate: 2.24e-05
2025-08-30 08:19:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:19:39 - pico-train - INFO - Step 57050 -- 🔄 Training Metrics
2025-08-30 08:19:39 - pico-train - INFO - ├── Loss: 5.8628
2025-08-30 08:19:39 - pico-train - INFO - ├── Learning Rate: 2.24e-05
2025-08-30 08:19:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:19:52 - pico-train - INFO - Step 57075 -- 🔄 Training Metrics
2025-08-30 08:19:52 - pico-train - INFO - ├── Loss: 5.9016
2025-08-30 08:19:52 - pico-train - INFO - ├── Learning Rate: 2.24e-05
2025-08-30 08:19:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:20:04 - pico-train - INFO - Step 57100 -- 🔄 Training Metrics
2025-08-30 08:20:04 - pico-train - INFO - ├── Loss: 5.9662
2025-08-30 08:20:04 - pico-train - INFO - ├── Learning Rate: 2.24e-05
2025-08-30 08:20:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:20:17 - pico-train - INFO - Step 57125 -- 🔄 Training Metrics
2025-08-30 08:20:17 - pico-train - INFO - ├── Loss: 5.8192
2025-08-30 08:20:17 - pico-train - INFO - ├── Learning Rate: 2.23e-05
2025-08-30 08:20:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:20:30 - pico-train - INFO - Step 57150 -- 🔄 Training Metrics
2025-08-30 08:20:30 - pico-train - INFO - ├── Loss: 5.9000
2025-08-30 08:20:30 - pico-train - INFO - ├── Learning Rate: 2.23e-05
2025-08-30 08:20:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:20:42 - pico-train - INFO - Step 57175 -- 🔄 Training Metrics
2025-08-30 08:20:42 - pico-train - INFO - ├── Loss: 5.7458
2025-08-30 08:20:42 - pico-train - INFO - ├── Learning Rate: 2.23e-05
2025-08-30 08:20:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:20:55 - pico-train - INFO - Step 57200 -- 🔄 Training Metrics
2025-08-30 08:20:55 - pico-train - INFO - ├── Loss: 5.8635
2025-08-30 08:20:55 - pico-train - INFO - ├── Learning Rate: 2.23e-05
2025-08-30 08:20:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:21:08 - pico-train - INFO - Step 57225 -- 🔄 Training Metrics
2025-08-30 08:21:08 - pico-train - INFO - ├── Loss: 5.9097
2025-08-30 08:21:08 - pico-train - INFO - ├── Learning Rate: 2.23e-05
2025-08-30 08:21:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:21:20 - pico-train - INFO - Step 57250 -- 🔄 Training Metrics
2025-08-30 08:21:20 - pico-train - INFO - ├── Loss: 5.9121
2025-08-30 08:21:20 - pico-train - INFO - ├── Learning Rate: 2.22e-05
2025-08-30 08:21:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:21:33 - pico-train - INFO - Step 57275 -- 🔄 Training Metrics
2025-08-30 08:21:33 - pico-train - INFO - ├── Loss: 5.8948
2025-08-30 08:21:33 - pico-train - INFO - ├── Learning Rate: 2.22e-05
2025-08-30 08:21:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:21:46 - pico-train - INFO - Step 57300 -- 🔄 Training Metrics
2025-08-30 08:21:46 - pico-train - INFO - ├── Loss: 5.8280
2025-08-30 08:21:46 - pico-train - INFO - ├── Learning Rate: 2.22e-05
2025-08-30 08:21:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:21:58 - pico-train - INFO - Step 57325 -- 🔄 Training Metrics
2025-08-30 08:21:58 - pico-train - INFO - ├── Loss: 5.8445
2025-08-30 08:21:58 - pico-train - INFO - ├── Learning Rate: 2.22e-05
2025-08-30 08:21:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:22:11 - pico-train - INFO - Step 57350 -- 🔄 Training Metrics
2025-08-30 08:22:11 - pico-train - INFO - ├── Loss: 5.9213
2025-08-30 08:22:11 - pico-train - INFO - ├── Learning Rate: 2.21e-05
2025-08-30 08:22:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:22:24 - pico-train - INFO - Step 57375 -- 🔄 Training Metrics
2025-08-30 08:22:24 - pico-train - INFO - ├── Loss: 5.9795
2025-08-30 08:22:24 - pico-train - INFO - ├── Learning Rate: 2.21e-05
2025-08-30 08:22:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:22:36 - pico-train - INFO - Step 57400 -- 🔄 Training Metrics
2025-08-30 08:22:36 - pico-train - INFO - ├── Loss: 5.9827
2025-08-30 08:22:36 - pico-train - INFO - ├── Learning Rate: 2.21e-05
2025-08-30 08:22:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:22:49 - pico-train - INFO - Step 57425 -- 🔄 Training Metrics
2025-08-30 08:22:49 - pico-train - INFO - ├── Loss: 5.9802
2025-08-30 08:22:49 - pico-train - INFO - ├── Learning Rate: 2.21e-05
2025-08-30 08:22:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:23:01 - pico-train - INFO - Step 57450 -- 🔄 Training Metrics
2025-08-30 08:23:01 - pico-train - INFO - ├── Loss: 5.8669
2025-08-30 08:23:01 - pico-train - INFO - ├── Learning Rate: 2.21e-05
2025-08-30 08:23:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:23:14 - pico-train - INFO - Step 57475 -- 🔄 Training Metrics
2025-08-30 08:23:14 - pico-train - INFO - ├── Loss: 5.8762
2025-08-30 08:23:14 - pico-train - INFO - ├── Learning Rate: 2.20e-05
2025-08-30 08:23:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:23:26 - pico-train - INFO - Step 57500 -- 💾 Saving Checkpoint
2025-08-30 08:25:33 - pico-train - INFO - Step 57500 -- 📊 Evaluation Results
2025-08-30 08:25:33 - pico-train - INFO - └── paloma: 2.2146898668006388e+30
2025-08-30 08:25:35 - pico-train - INFO - Step 57500 -- 🔄 Training Metrics
2025-08-30 08:25:35 - pico-train - INFO - ├── Loss: 5.8685
2025-08-30 08:25:35 - pico-train - INFO - ├── Learning Rate: 2.20e-05
2025-08-30 08:25:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:25:35 - pico-train - INFO - Step 57500 -- 📈 Saving Learning Dynamics
2025-08-30 08:25:49 - pico-train - INFO - Step 57525 -- 🔄 Training Metrics
2025-08-30 08:25:49 - pico-train - INFO - ├── Loss: 5.8952
2025-08-30 08:25:49 - pico-train - INFO - ├── Learning Rate: 2.20e-05
2025-08-30 08:25:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:26:02 - pico-train - INFO - Step 57550 -- 🔄 Training Metrics
2025-08-30 08:26:02 - pico-train - INFO - ├── Loss: 5.8838
2025-08-30 08:26:02 - pico-train - INFO - ├── Learning Rate: 2.20e-05
2025-08-30 08:26:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:26:15 - pico-train - INFO - Step 57575 -- 🔄 Training Metrics
2025-08-30 08:26:15 - pico-train - INFO - ├── Loss: 5.8700
2025-08-30 08:26:15 - pico-train - INFO - ├── Learning Rate: 2.20e-05
2025-08-30 08:26:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:26:28 - pico-train - INFO - Step 57600 -- 🔄 Training Metrics
2025-08-30 08:26:28 - pico-train - INFO - ├── Loss: 5.8803
2025-08-30 08:26:28 - pico-train - INFO - ├── Learning Rate: 2.19e-05
2025-08-30 08:26:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:26:40 - pico-train - INFO - Step 57625 -- 🔄 Training Metrics
2025-08-30 08:26:40 - pico-train - INFO - ├── Loss: 5.9336
2025-08-30 08:26:40 - pico-train - INFO - ├── Learning Rate: 2.19e-05
2025-08-30 08:26:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:26:53 - pico-train - INFO - Step 57650 -- 🔄 Training Metrics
2025-08-30 08:26:53 - pico-train - INFO - ├── Loss: 5.8840
2025-08-30 08:26:53 - pico-train - INFO - ├── Learning Rate: 2.19e-05
2025-08-30 08:26:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:27:06 - pico-train - INFO - Step 57675 -- 🔄 Training Metrics
2025-08-30 08:27:06 - pico-train - INFO - ├── Loss: 5.9388
2025-08-30 08:27:06 - pico-train - INFO - ├── Learning Rate: 2.19e-05
2025-08-30 08:27:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:27:18 - pico-train - INFO - Step 57700 -- 🔄 Training Metrics
2025-08-30 08:27:18 - pico-train - INFO - ├── Loss: 5.9069
2025-08-30 08:27:18 - pico-train - INFO - ├── Learning Rate: 2.18e-05
2025-08-30 08:27:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:27:31 - pico-train - INFO - Step 57725 -- 🔄 Training Metrics
2025-08-30 08:27:31 - pico-train - INFO - ├── Loss: 5.9429
2025-08-30 08:27:31 - pico-train - INFO - ├── Learning Rate: 2.18e-05
2025-08-30 08:27:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:27:44 - pico-train - INFO - Step 57750 -- 🔄 Training Metrics
2025-08-30 08:27:44 - pico-train - INFO - ├── Loss: 5.8362
2025-08-30 08:27:44 - pico-train - INFO - ├── Learning Rate: 2.18e-05
2025-08-30 08:27:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:27:56 - pico-train - INFO - Step 57775 -- 🔄 Training Metrics
2025-08-30 08:27:56 - pico-train - INFO - ├── Loss: 5.8943
2025-08-30 08:27:56 - pico-train - INFO - ├── Learning Rate: 2.18e-05
2025-08-30 08:27:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:28:09 - pico-train - INFO - Step 57800 -- 🔄 Training Metrics
2025-08-30 08:28:09 - pico-train - INFO - ├── Loss: 5.8114
2025-08-30 08:28:09 - pico-train - INFO - ├── Learning Rate: 2.18e-05
2025-08-30 08:28:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:28:22 - pico-train - INFO - Step 57825 -- 🔄 Training Metrics
2025-08-30 08:28:22 - pico-train - INFO - ├── Loss: 5.9848
2025-08-30 08:28:22 - pico-train - INFO - ├── Learning Rate: 2.17e-05
2025-08-30 08:28:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:28:34 - pico-train - INFO - Step 57850 -- 🔄 Training Metrics
2025-08-30 08:28:34 - pico-train - INFO - ├── Loss: 5.8611
2025-08-30 08:28:34 - pico-train - INFO - ├── Learning Rate: 2.17e-05
2025-08-30 08:28:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:28:47 - pico-train - INFO - Step 57875 -- 🔄 Training Metrics
2025-08-30 08:28:47 - pico-train - INFO - ├── Loss: 5.9010
2025-08-30 08:28:47 - pico-train - INFO - ├── Learning Rate: 2.17e-05
2025-08-30 08:28:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:29:00 - pico-train - INFO - Step 57900 -- 🔄 Training Metrics
2025-08-30 08:29:00 - pico-train - INFO - ├── Loss: 5.8876
2025-08-30 08:29:00 - pico-train - INFO - ├── Learning Rate: 2.17e-05
2025-08-30 08:29:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:29:12 - pico-train - INFO - Step 57925 -- 🔄 Training Metrics
2025-08-30 08:29:12 - pico-train - INFO - ├── Loss: 5.9053
2025-08-30 08:29:12 - pico-train - INFO - ├── Learning Rate: 2.17e-05
2025-08-30 08:29:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:29:25 - pico-train - INFO - Step 57950 -- 🔄 Training Metrics
2025-08-30 08:29:25 - pico-train - INFO - ├── Loss: 5.9021
2025-08-30 08:29:25 - pico-train - INFO - ├── Learning Rate: 2.16e-05
2025-08-30 08:29:25 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 08:29:38 - pico-train - INFO - Step 57975 -- ๐Ÿ”„ Training Metrics
2025-08-30 08:29:38 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.8546
2025-08-30 08:29:38 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 2.16e-05
2025-08-30 08:29:38 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 08:29:50 - pico-train - INFO - Step 58000 -- 💾 Saving Checkpoint
2025-08-30 08:31:44 - pico-train - INFO - Step 58000 -- 📊 Evaluation Results
2025-08-30 08:31:44 - pico-train - INFO - └── paloma: 2.9327628683408786e+30
2025-08-30 08:31:47 - pico-train - INFO - Step 58000 -- 🔄 Training Metrics
2025-08-30 08:31:47 - pico-train - INFO - ├── Loss: 5.8753
2025-08-30 08:31:47 - pico-train - INFO - ├── Learning Rate: 2.16e-05
2025-08-30 08:31:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:31:47 - pico-train - INFO - Step 58000 -- 📈 Saving Learning Dynamics
2025-08-30 08:32:02 - pico-train - INFO - Step 58025 -- 🔄 Training Metrics
2025-08-30 08:32:02 - pico-train - INFO - ├── Loss: 5.8882
2025-08-30 08:32:02 - pico-train - INFO - ├── Learning Rate: 2.16e-05
2025-08-30 08:32:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:32:15 - pico-train - INFO - Step 58050 -- 🔄 Training Metrics
2025-08-30 08:32:15 - pico-train - INFO - ├── Loss: 5.8783
2025-08-30 08:32:15 - pico-train - INFO - ├── Learning Rate: 2.16e-05
2025-08-30 08:32:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:32:27 - pico-train - INFO - Step 58075 -- 🔄 Training Metrics
2025-08-30 08:32:27 - pico-train - INFO - ├── Loss: 5.8479
2025-08-30 08:32:27 - pico-train - INFO - ├── Learning Rate: 2.15e-05
2025-08-30 08:32:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:32:40 - pico-train - INFO - Step 58100 -- 🔄 Training Metrics
2025-08-30 08:32:40 - pico-train - INFO - ├── Loss: 5.8465
2025-08-30 08:32:40 - pico-train - INFO - ├── Learning Rate: 2.15e-05
2025-08-30 08:32:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:32:53 - pico-train - INFO - Step 58125 -- 🔄 Training Metrics
2025-08-30 08:32:53 - pico-train - INFO - ├── Loss: 5.8889
2025-08-30 08:32:53 - pico-train - INFO - ├── Learning Rate: 2.15e-05
2025-08-30 08:32:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:33:05 - pico-train - INFO - Step 58150 -- 🔄 Training Metrics
2025-08-30 08:33:05 - pico-train - INFO - ├── Loss: 5.8143
2025-08-30 08:33:05 - pico-train - INFO - ├── Learning Rate: 2.15e-05
2025-08-30 08:33:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:33:18 - pico-train - INFO - Step 58175 -- 🔄 Training Metrics
2025-08-30 08:33:18 - pico-train - INFO - ├── Loss: 5.9133
2025-08-30 08:33:18 - pico-train - INFO - ├── Learning Rate: 2.14e-05
2025-08-30 08:33:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:33:31 - pico-train - INFO - Step 58200 -- 🔄 Training Metrics
2025-08-30 08:33:31 - pico-train - INFO - ├── Loss: 5.8496
2025-08-30 08:33:31 - pico-train - INFO - ├── Learning Rate: 2.14e-05
2025-08-30 08:33:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:33:43 - pico-train - INFO - Step 58225 -- 🔄 Training Metrics
2025-08-30 08:33:43 - pico-train - INFO - ├── Loss: 5.9211
2025-08-30 08:33:43 - pico-train - INFO - ├── Learning Rate: 2.14e-05
2025-08-30 08:33:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:33:56 - pico-train - INFO - Step 58250 -- 🔄 Training Metrics
2025-08-30 08:33:56 - pico-train - INFO - ├── Loss: 5.8764
2025-08-30 08:33:56 - pico-train - INFO - ├── Learning Rate: 2.14e-05
2025-08-30 08:33:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:34:09 - pico-train - INFO - Step 58275 -- 🔄 Training Metrics
2025-08-30 08:34:09 - pico-train - INFO - ├── Loss: 5.9342
2025-08-30 08:34:09 - pico-train - INFO - ├── Learning Rate: 2.14e-05
2025-08-30 08:34:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:34:21 - pico-train - INFO - Step 58300 -- 🔄 Training Metrics
2025-08-30 08:34:21 - pico-train - INFO - ├── Loss: 5.8601
2025-08-30 08:34:21 - pico-train - INFO - ├── Learning Rate: 2.13e-05
2025-08-30 08:34:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:34:34 - pico-train - INFO - Step 58325 -- 🔄 Training Metrics
2025-08-30 08:34:34 - pico-train - INFO - ├── Loss: 5.8394
2025-08-30 08:34:34 - pico-train - INFO - ├── Learning Rate: 2.13e-05
2025-08-30 08:34:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:34:46 - pico-train - INFO - Step 58350 -- 🔄 Training Metrics
2025-08-30 08:34:46 - pico-train - INFO - ├── Loss: 5.9285
2025-08-30 08:34:46 - pico-train - INFO - ├── Learning Rate: 2.13e-05
2025-08-30 08:34:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:34:59 - pico-train - INFO - Step 58375 -- 🔄 Training Metrics
2025-08-30 08:34:59 - pico-train - INFO - ├── Loss: 5.8421
2025-08-30 08:34:59 - pico-train - INFO - ├── Learning Rate: 2.13e-05
2025-08-30 08:34:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:35:12 - pico-train - INFO - Step 58400 -- 🔄 Training Metrics
2025-08-30 08:35:12 - pico-train - INFO - ├── Loss: 5.7891
2025-08-30 08:35:12 - pico-train - INFO - ├── Learning Rate: 2.13e-05
2025-08-30 08:35:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:35:25 - pico-train - INFO - Step 58425 -- 🔄 Training Metrics
2025-08-30 08:35:25 - pico-train - INFO - ├── Loss: 5.8921
2025-08-30 08:35:25 - pico-train - INFO - ├── Learning Rate: 2.12e-05
2025-08-30 08:35:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:35:37 - pico-train - INFO - Step 58450 -- 🔄 Training Metrics
2025-08-30 08:35:37 - pico-train - INFO - ├── Loss: 5.8410
2025-08-30 08:35:37 - pico-train - INFO - ├── Learning Rate: 2.12e-05
2025-08-30 08:35:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:35:50 - pico-train - INFO - Step 58475 -- 🔄 Training Metrics
2025-08-30 08:35:50 - pico-train - INFO - ├── Loss: 5.8166
2025-08-30 08:35:50 - pico-train - INFO - ├── Learning Rate: 2.12e-05
2025-08-30 08:35:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:36:02 - pico-train - INFO - Step 58500 -- 💾 Saving Checkpoint
2025-08-30 08:37:56 - pico-train - INFO - Step 58500 -- 📊 Evaluation Results
2025-08-30 08:37:56 - pico-train - INFO - └── paloma: 2.9542125550009274e+30
2025-08-30 08:38:01 - pico-train - INFO - Step 58500 -- 🔄 Training Metrics
2025-08-30 08:38:01 - pico-train - INFO - ├── Loss: 5.8586
2025-08-30 08:38:01 - pico-train - INFO - ├── Learning Rate: 2.12e-05
2025-08-30 08:38:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:38:01 - pico-train - INFO - Step 58500 -- 📈 Saving Learning Dynamics
2025-08-30 08:38:16 - pico-train - INFO - Step 58525 -- 🔄 Training Metrics
2025-08-30 08:38:16 - pico-train - INFO - ├── Loss: 5.8248
2025-08-30 08:38:16 - pico-train - INFO - ├── Learning Rate: 2.12e-05
2025-08-30 08:38:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:38:29 - pico-train - INFO - Step 58550 -- 🔄 Training Metrics
2025-08-30 08:38:29 - pico-train - INFO - ├── Loss: 5.8162
2025-08-30 08:38:29 - pico-train - INFO - ├── Learning Rate: 2.11e-05
2025-08-30 08:38:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:38:41 - pico-train - INFO - Step 58575 -- 🔄 Training Metrics
2025-08-30 08:38:41 - pico-train - INFO - ├── Loss: 5.9361
2025-08-30 08:38:41 - pico-train - INFO - ├── Learning Rate: 2.11e-05
2025-08-30 08:38:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:38:54 - pico-train - INFO - Step 58600 -- 🔄 Training Metrics
2025-08-30 08:38:54 - pico-train - INFO - ├── Loss: 5.8945
2025-08-30 08:38:54 - pico-train - INFO - ├── Learning Rate: 2.11e-05
2025-08-30 08:38:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:39:06 - pico-train - INFO - Step 58625 -- 🔄 Training Metrics
2025-08-30 08:39:06 - pico-train - INFO - ├── Loss: 5.7984
2025-08-30 08:39:06 - pico-train - INFO - ├── Learning Rate: 2.11e-05
2025-08-30 08:39:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:39:19 - pico-train - INFO - Step 58650 -- 🔄 Training Metrics
2025-08-30 08:39:19 - pico-train - INFO - ├── Loss: 5.8764
2025-08-30 08:39:19 - pico-train - INFO - ├── Learning Rate: 2.10e-05
2025-08-30 08:39:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:39:32 - pico-train - INFO - Step 58675 -- 🔄 Training Metrics
2025-08-30 08:39:32 - pico-train - INFO - ├── Loss: 5.9141
2025-08-30 08:39:32 - pico-train - INFO - ├── Learning Rate: 2.10e-05
2025-08-30 08:39:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:39:44 - pico-train - INFO - Step 58700 -- 🔄 Training Metrics
2025-08-30 08:39:44 - pico-train - INFO - ├── Loss: 5.9118
2025-08-30 08:39:44 - pico-train - INFO - ├── Learning Rate: 2.10e-05
2025-08-30 08:39:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:39:57 - pico-train - INFO - Step 58725 -- 🔄 Training Metrics
2025-08-30 08:39:57 - pico-train - INFO - ├── Loss: 5.8585
2025-08-30 08:39:57 - pico-train - INFO - ├── Learning Rate: 2.10e-05
2025-08-30 08:39:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:40:10 - pico-train - INFO - Step 58750 -- 🔄 Training Metrics
2025-08-30 08:40:10 - pico-train - INFO - ├── Loss: 5.8661
2025-08-30 08:40:10 - pico-train - INFO - ├── Learning Rate: 2.10e-05
2025-08-30 08:40:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:40:22 - pico-train - INFO - Step 58775 -- 🔄 Training Metrics
2025-08-30 08:40:22 - pico-train - INFO - ├── Loss: 5.8330
2025-08-30 08:40:22 - pico-train - INFO - ├── Learning Rate: 2.09e-05
2025-08-30 08:40:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:40:35 - pico-train - INFO - Step 58800 -- 🔄 Training Metrics
2025-08-30 08:40:35 - pico-train - INFO - ├── Loss: 5.8415
2025-08-30 08:40:35 - pico-train - INFO - ├── Learning Rate: 2.09e-05
2025-08-30 08:40:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:40:47 - pico-train - INFO - Step 58825 -- 🔄 Training Metrics
2025-08-30 08:40:47 - pico-train - INFO - ├── Loss: 5.9273
2025-08-30 08:40:47 - pico-train - INFO - ├── Learning Rate: 2.09e-05
2025-08-30 08:40:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:41:00 - pico-train - INFO - Step 58850 -- 🔄 Training Metrics
2025-08-30 08:41:00 - pico-train - INFO - ├── Loss: 5.8663
2025-08-30 08:41:00 - pico-train - INFO - ├── Learning Rate: 2.09e-05
2025-08-30 08:41:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:41:14 - pico-train - INFO - Step 58875 -- 🔄 Training Metrics
2025-08-30 08:41:14 - pico-train - INFO - ├── Loss: 5.8209
2025-08-30 08:41:14 - pico-train - INFO - ├── Learning Rate: 2.09e-05
2025-08-30 08:41:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:41:26 - pico-train - INFO - Step 58900 -- 🔄 Training Metrics
2025-08-30 08:41:26 - pico-train - INFO - ├── Loss: 5.9101
2025-08-30 08:41:26 - pico-train - INFO - ├── Learning Rate: 2.08e-05
2025-08-30 08:41:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:41:39 - pico-train - INFO - Step 58925 -- 🔄 Training Metrics
2025-08-30 08:41:39 - pico-train - INFO - ├── Loss: 5.9064
2025-08-30 08:41:39 - pico-train - INFO - ├── Learning Rate: 2.08e-05
2025-08-30 08:41:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:41:51 - pico-train - INFO - Step 58950 -- 🔄 Training Metrics
2025-08-30 08:41:51 - pico-train - INFO - ├── Loss: 5.8527
2025-08-30 08:41:51 - pico-train - INFO - ├── Learning Rate: 2.08e-05
2025-08-30 08:41:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:42:04 - pico-train - INFO - Step 58975 -- 🔄 Training Metrics
2025-08-30 08:42:04 - pico-train - INFO - ├── Loss: 5.8115
2025-08-30 08:42:04 - pico-train - INFO - ├── Learning Rate: 2.08e-05
2025-08-30 08:42:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:42:16 - pico-train - INFO - Step 59000 -- 💾 Saving Checkpoint
2025-08-30 08:44:14 - pico-train - INFO - Step 59000 -- 📊 Evaluation Results
2025-08-30 08:44:14 - pico-train - INFO - └── paloma: 3.916054030122377e+30
2025-08-30 08:44:17 - pico-train - INFO - Step 59000 -- 🔄 Training Metrics
2025-08-30 08:44:17 - pico-train - INFO - ├── Loss: 5.8043
2025-08-30 08:44:17 - pico-train - INFO - ├── Learning Rate: 2.08e-05
2025-08-30 08:44:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:44:17 - pico-train - INFO - Step 59000 -- 📈 Saving Learning Dynamics
2025-08-30 08:44:33 - pico-train - INFO - Step 59025 -- 🔄 Training Metrics
2025-08-30 08:44:33 - pico-train - INFO - ├── Loss: 5.7710
2025-08-30 08:44:33 - pico-train - INFO - ├── Learning Rate: 2.07e-05
2025-08-30 08:44:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:44:45 - pico-train - INFO - Step 59050 -- 🔄 Training Metrics
2025-08-30 08:44:45 - pico-train - INFO - ├── Loss: 5.8913
2025-08-30 08:44:45 - pico-train - INFO - ├── Learning Rate: 2.07e-05
2025-08-30 08:44:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:44:59 - pico-train - INFO - Step 59075 -- 🔄 Training Metrics
2025-08-30 08:44:59 - pico-train - INFO - ├── Loss: 5.8823
2025-08-30 08:44:59 - pico-train - INFO - ├── Learning Rate: 2.07e-05
2025-08-30 08:44:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:45:11 - pico-train - INFO - Step 59100 -- 🔄 Training Metrics
2025-08-30 08:45:11 - pico-train - INFO - ├── Loss: 5.8189
2025-08-30 08:45:11 - pico-train - INFO - ├── Learning Rate: 2.07e-05
2025-08-30 08:45:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:45:24 - pico-train - INFO - Step 59125 -- 🔄 Training Metrics
2025-08-30 08:45:24 - pico-train - INFO - ├── Loss: 5.7997
2025-08-30 08:45:24 - pico-train - INFO - ├── Learning Rate: 2.06e-05
2025-08-30 08:45:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:45:37 - pico-train - INFO - Step 59150 -- 🔄 Training Metrics
2025-08-30 08:45:37 - pico-train - INFO - ├── Loss: 5.8950
2025-08-30 08:45:37 - pico-train - INFO - ├── Learning Rate: 2.06e-05
2025-08-30 08:45:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:45:50 - pico-train - INFO - Step 59175 -- 🔄 Training Metrics
2025-08-30 08:45:50 - pico-train - INFO - ├── Loss: 5.9084
2025-08-30 08:45:50 - pico-train - INFO - ├── Learning Rate: 2.06e-05
2025-08-30 08:45:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:46:03 - pico-train - INFO - Step 59200 -- 🔄 Training Metrics
2025-08-30 08:46:03 - pico-train - INFO - ├── Loss: 5.8141
2025-08-30 08:46:03 - pico-train - INFO - ├── Learning Rate: 2.06e-05
2025-08-30 08:46:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:46:16 - pico-train - INFO - Step 59225 -- 🔄 Training Metrics
2025-08-30 08:46:16 - pico-train - INFO - ├── Loss: 5.8814
2025-08-30 08:46:16 - pico-train - INFO - ├── Learning Rate: 2.06e-05
2025-08-30 08:46:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:46:28 - pico-train - INFO - Step 59250 -- 🔄 Training Metrics
2025-08-30 08:46:28 - pico-train - INFO - ├── Loss: 5.8316
2025-08-30 08:46:28 - pico-train - INFO - ├── Learning Rate: 2.05e-05
2025-08-30 08:46:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:46:41 - pico-train - INFO - Step 59275 -- 🔄 Training Metrics
2025-08-30 08:46:41 - pico-train - INFO - ├── Loss: 5.8489
2025-08-30 08:46:41 - pico-train - INFO - ├── Learning Rate: 2.05e-05
2025-08-30 08:46:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:46:54 - pico-train - INFO - Step 59300 -- 🔄 Training Metrics
2025-08-30 08:46:54 - pico-train - INFO - ├── Loss: 5.7998
2025-08-30 08:46:54 - pico-train - INFO - ├── Learning Rate: 2.05e-05
2025-08-30 08:46:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:47:06 - pico-train - INFO - Step 59325 -- 🔄 Training Metrics
2025-08-30 08:47:06 - pico-train - INFO - ├── Loss: 5.8848
2025-08-30 08:47:06 - pico-train - INFO - ├── Learning Rate: 2.05e-05
2025-08-30 08:47:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:47:19 - pico-train - INFO - Step 59350 -- 🔄 Training Metrics
2025-08-30 08:47:19 - pico-train - INFO - ├── Loss: 5.8543
2025-08-30 08:47:19 - pico-train - INFO - ├── Learning Rate: 2.05e-05
2025-08-30 08:47:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:47:32 - pico-train - INFO - Step 59375 -- 🔄 Training Metrics
2025-08-30 08:47:32 - pico-train - INFO - ├── Loss: 5.8655
2025-08-30 08:47:32 - pico-train - INFO - ├── Learning Rate: 2.04e-05
2025-08-30 08:47:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:47:44 - pico-train - INFO - Step 59400 -- 🔄 Training Metrics
2025-08-30 08:47:44 - pico-train - INFO - ├── Loss: 5.8870
2025-08-30 08:47:44 - pico-train - INFO - ├── Learning Rate: 2.04e-05
2025-08-30 08:47:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:47:57 - pico-train - INFO - Step 59425 -- 🔄 Training Metrics
2025-08-30 08:47:57 - pico-train - INFO - ├── Loss: 5.8000
2025-08-30 08:47:57 - pico-train - INFO - ├── Learning Rate: 2.04e-05
2025-08-30 08:47:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:48:10 - pico-train - INFO - Step 59450 -- 🔄 Training Metrics
2025-08-30 08:48:10 - pico-train - INFO - ├── Loss: 5.8162
2025-08-30 08:48:10 - pico-train - INFO - ├── Learning Rate: 2.04e-05
2025-08-30 08:48:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:48:22 - pico-train - INFO - Step 59475 -- 🔄 Training Metrics
2025-08-30 08:48:22 - pico-train - INFO - ├── Loss: 5.8936
2025-08-30 08:48:22 - pico-train - INFO - ├── Learning Rate: 2.04e-05
2025-08-30 08:48:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:48:35 - pico-train - INFO - Step 59500 -- 💾 Saving Checkpoint
2025-08-30 08:50:28 - pico-train - INFO - Step 59500 -- 📊 Evaluation Results
2025-08-30 08:50:28 - pico-train - INFO - └── paloma: 4.0666865028851395e+30
2025-08-30 08:50:32 - pico-train - INFO - Step 59500 -- 🔄 Training Metrics
2025-08-30 08:50:32 - pico-train - INFO - ├── Loss: 5.8731
2025-08-30 08:50:32 - pico-train - INFO - ├── Learning Rate: 2.03e-05
2025-08-30 08:50:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:50:32 - pico-train - INFO - Step 59500 -- 📈 Saving Learning Dynamics
2025-08-30 08:50:47 - pico-train - INFO - Step 59525 -- 🔄 Training Metrics
2025-08-30 08:50:47 - pico-train - INFO - ├── Loss: 5.9058
2025-08-30 08:50:47 - pico-train - INFO - ├── Learning Rate: 2.03e-05
2025-08-30 08:50:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:51:00 - pico-train - INFO - Step 59550 -- 🔄 Training Metrics
2025-08-30 08:51:00 - pico-train - INFO - ├── Loss: 5.8037
2025-08-30 08:51:00 - pico-train - INFO - ├── Learning Rate: 2.03e-05
2025-08-30 08:51:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:51:12 - pico-train - INFO - Step 59575 -- 🔄 Training Metrics
2025-08-30 08:51:12 - pico-train - INFO - ├── Loss: 5.8553
2025-08-30 08:51:12 - pico-train - INFO - ├── Learning Rate: 2.03e-05
2025-08-30 08:51:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:51:25 - pico-train - INFO - Step 59600 -- 🔄 Training Metrics
2025-08-30 08:51:25 - pico-train - INFO - ├── Loss: 5.8022
2025-08-30 08:51:25 - pico-train - INFO - ├── Learning Rate: 2.02e-05
2025-08-30 08:51:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:51:38 - pico-train - INFO - Step 59625 -- 🔄 Training Metrics
2025-08-30 08:51:38 - pico-train - INFO - ├── Loss: 5.8279
2025-08-30 08:51:38 - pico-train - INFO - ├── Learning Rate: 2.02e-05
2025-08-30 08:51:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:51:51 - pico-train - INFO - Step 59650 -- 🔄 Training Metrics
2025-08-30 08:51:51 - pico-train - INFO - ├── Loss: 5.7732
2025-08-30 08:51:51 - pico-train - INFO - ├── Learning Rate: 2.02e-05
2025-08-30 08:51:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:52:03 - pico-train - INFO - Step 59675 -- 🔄 Training Metrics
2025-08-30 08:52:03 - pico-train - INFO - ├── Loss: 5.8738
2025-08-30 08:52:03 - pico-train - INFO - ├── Learning Rate: 2.02e-05
2025-08-30 08:52:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:52:16 - pico-train - INFO - Step 59700 -- 🔄 Training Metrics
2025-08-30 08:52:16 - pico-train - INFO - ├── Loss: 5.8618
2025-08-30 08:52:16 - pico-train - INFO - ├── Learning Rate: 2.02e-05
2025-08-30 08:52:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:52:29 - pico-train - INFO - Step 59725 -- 🔄 Training Metrics
2025-08-30 08:52:29 - pico-train - INFO - ├── Loss: 5.8423
2025-08-30 08:52:29 - pico-train - INFO - ├── Learning Rate: 2.01e-05
2025-08-30 08:52:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:52:41 - pico-train - INFO - Step 59750 -- 🔄 Training Metrics
2025-08-30 08:52:41 - pico-train - INFO - ├── Loss: 5.9335
2025-08-30 08:52:41 - pico-train - INFO - ├── Learning Rate: 2.01e-05
2025-08-30 08:52:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:52:54 - pico-train - INFO - Step 59775 -- 🔄 Training Metrics
2025-08-30 08:52:54 - pico-train - INFO - ├── Loss: 5.7709
2025-08-30 08:52:54 - pico-train - INFO - ├── Learning Rate: 2.01e-05
2025-08-30 08:52:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:53:07 - pico-train - INFO - Step 59800 -- 🔄 Training Metrics
2025-08-30 08:53:07 - pico-train - INFO - ├── Loss: 5.9237
2025-08-30 08:53:07 - pico-train - INFO - ├── Learning Rate: 2.01e-05
2025-08-30 08:53:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:53:19 - pico-train - INFO - Step 59825 -- 🔄 Training Metrics
2025-08-30 08:53:19 - pico-train - INFO - ├── Loss: 5.9029
2025-08-30 08:53:19 - pico-train - INFO - ├── Learning Rate: 2.01e-05
2025-08-30 08:53:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:53:32 - pico-train - INFO - Step 59850 -- 🔄 Training Metrics
2025-08-30 08:53:32 - pico-train - INFO - ├── Loss: 5.9280
2025-08-30 08:53:32 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-30 08:53:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:53:45 - pico-train - INFO - Step 59875 -- 🔄 Training Metrics
2025-08-30 08:53:45 - pico-train - INFO - ├── Loss: 5.8758
2025-08-30 08:53:45 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-30 08:53:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:53:58 - pico-train - INFO - Step 59900 -- 🔄 Training Metrics
2025-08-30 08:53:58 - pico-train - INFO - ├── Loss: 5.8195
2025-08-30 08:53:58 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-30 08:53:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:54:11 - pico-train - INFO - Step 59925 -- 🔄 Training Metrics
2025-08-30 08:54:11 - pico-train - INFO - ├── Loss: 5.9247
2025-08-30 08:54:11 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-30 08:54:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:54:23 - pico-train - INFO - Step 59950 -- 🔄 Training Metrics
2025-08-30 08:54:23 - pico-train - INFO - ├── Loss: 5.8941
2025-08-30 08:54:23 - pico-train - INFO - ├── Learning Rate: 2.00e-05
2025-08-30 08:54:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:54:36 - pico-train - INFO - Step 59975 -- 🔄 Training Metrics
2025-08-30 08:54:36 - pico-train - INFO - ├── Loss: 5.9192
2025-08-30 08:54:36 - pico-train - INFO - ├── Learning Rate: 1.99e-05
2025-08-30 08:54:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:54:48 - pico-train - INFO - Step 60000 -- 💾 Saving Checkpoint
2025-08-30 08:56:42 - pico-train - INFO - Step 60000 -- 📊 Evaluation Results
2025-08-30 08:56:42 - pico-train - INFO - └── paloma: 5.67735563606023e+30
2025-08-30 08:56:44 - pico-train - INFO - Step 60000 -- 🔄 Training Metrics
2025-08-30 08:56:44 - pico-train - INFO - ├── Loss: 5.9175
2025-08-30 08:56:44 - pico-train - INFO - ├── Learning Rate: 1.99e-05
2025-08-30 08:56:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:56:44 - pico-train - INFO - Step 60000 -- 📈 Saving Learning Dynamics
2025-08-30 08:56:59 - pico-train - INFO - Step 60025 -- 🔄 Training Metrics
2025-08-30 08:56:59 - pico-train - INFO - ├── Loss: 5.8005
2025-08-30 08:56:59 - pico-train - INFO - ├── Learning Rate: 1.99e-05
2025-08-30 08:56:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:57:12 - pico-train - INFO - Step 60050 -- 🔄 Training Metrics
2025-08-30 08:57:12 - pico-train - INFO - ├── Loss: 5.8668
2025-08-30 08:57:12 - pico-train - INFO - ├── Learning Rate: 1.99e-05
2025-08-30 08:57:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:57:24 - pico-train - INFO - Step 60075 -- 🔄 Training Metrics
2025-08-30 08:57:24 - pico-train - INFO - ├── Loss: 5.9150
2025-08-30 08:57:24 - pico-train - INFO - ├── Learning Rate: 1.99e-05
2025-08-30 08:57:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:57:37 - pico-train - INFO - Step 60100 -- 🔄 Training Metrics
2025-08-30 08:57:37 - pico-train - INFO - ├── Loss: 5.8577
2025-08-30 08:57:37 - pico-train - INFO - ├── Learning Rate: 1.98e-05
2025-08-30 08:57:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:57:50 - pico-train - INFO - Step 60125 -- 🔄 Training Metrics
2025-08-30 08:57:50 - pico-train - INFO - ├── Loss: 5.9463
2025-08-30 08:57:50 - pico-train - INFO - ├── Learning Rate: 1.98e-05
2025-08-30 08:57:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:58:03 - pico-train - INFO - Step 60150 -- 🔄 Training Metrics
2025-08-30 08:58:03 - pico-train - INFO - ├── Loss: 5.9613
2025-08-30 08:58:03 - pico-train - INFO - ├── Learning Rate: 1.98e-05
2025-08-30 08:58:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:58:15 - pico-train - INFO - Step 60175 -- 🔄 Training Metrics
2025-08-30 08:58:15 - pico-train - INFO - ├── Loss: 5.7742
2025-08-30 08:58:15 - pico-train - INFO - ├── Learning Rate: 1.98e-05
2025-08-30 08:58:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:58:28 - pico-train - INFO - Step 60200 -- 🔄 Training Metrics
2025-08-30 08:58:28 - pico-train - INFO - ├── Loss: 5.9330
2025-08-30 08:58:28 - pico-train - INFO - ├── Learning Rate: 1.97e-05
2025-08-30 08:58:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:58:40 - pico-train - INFO - Step 60225 -- 🔄 Training Metrics
2025-08-30 08:58:40 - pico-train - INFO - ├── Loss: 5.9165
2025-08-30 08:58:40 - pico-train - INFO - ├── Learning Rate: 1.97e-05
2025-08-30 08:58:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:58:53 - pico-train - INFO - Step 60250 -- 🔄 Training Metrics
2025-08-30 08:58:53 - pico-train - INFO - ├── Loss: 5.8891
2025-08-30 08:58:53 - pico-train - INFO - ├── Learning Rate: 1.97e-05
2025-08-30 08:58:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:59:06 - pico-train - INFO - Step 60275 -- 🔄 Training Metrics
2025-08-30 08:59:06 - pico-train - INFO - ├── Loss: 5.8293
2025-08-30 08:59:06 - pico-train - INFO - ├── Learning Rate: 1.97e-05
2025-08-30 08:59:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:59:18 - pico-train - INFO - Step 60300 -- 🔄 Training Metrics
2025-08-30 08:59:18 - pico-train - INFO - ├── Loss: 5.7729
2025-08-30 08:59:18 - pico-train - INFO - ├── Learning Rate: 1.97e-05
2025-08-30 08:59:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:59:31 - pico-train - INFO - Step 60325 -- 🔄 Training Metrics
2025-08-30 08:59:31 - pico-train - INFO - ├── Loss: 5.8043
2025-08-30 08:59:31 - pico-train - INFO - ├── Learning Rate: 1.96e-05
2025-08-30 08:59:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:59:43 - pico-train - INFO - Step 60350 -- 🔄 Training Metrics
2025-08-30 08:59:43 - pico-train - INFO - ├── Loss: 5.8123
2025-08-30 08:59:43 - pico-train - INFO - ├── Learning Rate: 1.96e-05
2025-08-30 08:59:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 08:59:57 - pico-train - INFO - Step 60375 -- 🔄 Training Metrics
2025-08-30 08:59:57 - pico-train - INFO - ├── Loss: 5.9085
2025-08-30 08:59:57 - pico-train - INFO - ├── Learning Rate: 1.96e-05
2025-08-30 08:59:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:00:09 - pico-train - INFO - Step 60400 -- 🔄 Training Metrics
2025-08-30 09:00:09 - pico-train - INFO - ├── Loss: 5.8004
2025-08-30 09:00:09 - pico-train - INFO - ├── Learning Rate: 1.96e-05
2025-08-30 09:00:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:00:22 - pico-train - INFO - Step 60425 -- 🔄 Training Metrics
2025-08-30 09:00:22 - pico-train - INFO - ├── Loss: 5.8664
2025-08-30 09:00:22 - pico-train - INFO - ├── Learning Rate: 1.96e-05
2025-08-30 09:00:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:00:35 - pico-train - INFO - Step 60450 -- 🔄 Training Metrics
2025-08-30 09:00:35 - pico-train - INFO - ├── Loss: 5.8370
2025-08-30 09:00:35 - pico-train - INFO - ├── Learning Rate: 1.95e-05
2025-08-30 09:00:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:00:48 - pico-train - INFO - Step 60475 -- 🔄 Training Metrics
2025-08-30 09:00:48 - pico-train - INFO - ├── Loss: 5.8813
2025-08-30 09:00:48 - pico-train - INFO - ├── Learning Rate: 1.95e-05
2025-08-30 09:00:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:01:01 - pico-train - INFO - Step 60500 -- 💾 Saving Checkpoint
2025-08-30 09:03:00 - pico-train - INFO - Step 60500 -- 📊 Evaluation Results
2025-08-30 09:03:00 - pico-train - INFO - └── paloma: 6.577053610858546e+30
2025-08-30 09:03:04 - pico-train - INFO - Step 60500 -- 🔄 Training Metrics
2025-08-30 09:03:04 - pico-train - INFO - ├── Loss: 5.8644
2025-08-30 09:03:04 - pico-train - INFO - ├── Learning Rate: 1.95e-05
2025-08-30 09:03:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:03:04 - pico-train - INFO - Step 60500 -- 📈 Saving Learning Dynamics
2025-08-30 09:03:20 - pico-train - INFO - Step 60525 -- 🔄 Training Metrics
2025-08-30 09:03:20 - pico-train - INFO - ├── Loss: 5.9048
2025-08-30 09:03:20 - pico-train - INFO - ├── Learning Rate: 1.95e-05
2025-08-30 09:03:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:03:32 - pico-train - INFO - Step 60550 -- 🔄 Training Metrics
2025-08-30 09:03:32 - pico-train - INFO - ├── Loss: 5.8286
2025-08-30 09:03:32 - pico-train - INFO - ├── Learning Rate: 1.95e-05
2025-08-30 09:03:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:03:45 - pico-train - INFO - Step 60575 -- 🔄 Training Metrics
2025-08-30 09:03:45 - pico-train - INFO - ├── Loss: 5.9112
2025-08-30 09:03:45 - pico-train - INFO - ├── Learning Rate: 1.94e-05
2025-08-30 09:03:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:03:58 - pico-train - INFO - Step 60600 -- 🔄 Training Metrics
2025-08-30 09:03:58 - pico-train - INFO - ├── Loss: 5.8445
2025-08-30 09:03:58 - pico-train - INFO - ├── Learning Rate: 1.94e-05
2025-08-30 09:03:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:04:10 - pico-train - INFO - Step 60625 -- 🔄 Training Metrics
2025-08-30 09:04:10 - pico-train - INFO - ├── Loss: 5.8444
2025-08-30 09:04:10 - pico-train - INFO - ├── Learning Rate: 1.94e-05
2025-08-30 09:04:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:04:23 - pico-train - INFO - Step 60650 -- 🔄 Training Metrics
2025-08-30 09:04:23 - pico-train - INFO - ├── Loss: 5.7993
2025-08-30 09:04:23 - pico-train - INFO - ├── Learning Rate: 1.94e-05
2025-08-30 09:04:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:04:36 - pico-train - INFO - Step 60675 -- 🔄 Training Metrics
2025-08-30 09:04:36 - pico-train - INFO - ├── Loss: 5.8188
2025-08-30 09:04:36 - pico-train - INFO - ├── Learning Rate: 1.94e-05
2025-08-30 09:04:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:04:48 - pico-train - INFO - Step 60700 -- 🔄 Training Metrics
2025-08-30 09:04:48 - pico-train - INFO - ├── Loss: 5.8257
2025-08-30 09:04:48 - pico-train - INFO - ├── Learning Rate: 1.93e-05
2025-08-30 09:04:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:05:01 - pico-train - INFO - Step 60725 -- 🔄 Training Metrics
2025-08-30 09:05:01 - pico-train - INFO - ├── Loss: 5.9364
2025-08-30 09:05:01 - pico-train - INFO - ├── Learning Rate: 1.93e-05
2025-08-30 09:05:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:05:14 - pico-train - INFO - Step 60750 -- 🔄 Training Metrics
2025-08-30 09:05:14 - pico-train - INFO - ├── Loss: 5.8968
2025-08-30 09:05:14 - pico-train - INFO - ├── Learning Rate: 1.93e-05
2025-08-30 09:05:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:05:26 - pico-train - INFO - Step 60775 -- 🔄 Training Metrics
2025-08-30 09:05:26 - pico-train - INFO - ├── Loss: 5.7561
2025-08-30 09:05:26 - pico-train - INFO - ├── Learning Rate: 1.93e-05
2025-08-30 09:05:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:05:39 - pico-train - INFO - Step 60800 -- 🔄 Training Metrics
2025-08-30 09:05:39 - pico-train - INFO - ├── Loss: 5.8257
2025-08-30 09:05:39 - pico-train - INFO - ├── Learning Rate: 1.92e-05
2025-08-30 09:05:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:05:52 - pico-train - INFO - Step 60825 -- 🔄 Training Metrics
2025-08-30 09:05:52 - pico-train - INFO - ├── Loss: 5.8018
2025-08-30 09:05:52 - pico-train - INFO - ├── Learning Rate: 1.92e-05
2025-08-30 09:05:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:06:04 - pico-train - INFO - Step 60850 -- 🔄 Training Metrics
2025-08-30 09:06:04 - pico-train - INFO - ├── Loss: 5.8325
2025-08-30 09:06:04 - pico-train - INFO - ├── Learning Rate: 1.92e-05
2025-08-30 09:06:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:06:17 - pico-train - INFO - Step 60875 -- 🔄 Training Metrics
2025-08-30 09:06:17 - pico-train - INFO - ├── Loss: 5.9502
2025-08-30 09:06:17 - pico-train - INFO - ├── Learning Rate: 1.92e-05
2025-08-30 09:06:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:06:30 - pico-train - INFO - Step 60900 -- 🔄 Training Metrics
2025-08-30 09:06:30 - pico-train - INFO - ├── Loss: 5.8632
2025-08-30 09:06:30 - pico-train - INFO - ├── Learning Rate: 1.92e-05
2025-08-30 09:06:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:06:42 - pico-train - INFO - Step 60925 -- 🔄 Training Metrics
2025-08-30 09:06:42 - pico-train - INFO - ├── Loss: 5.7790
2025-08-30 09:06:42 - pico-train - INFO - ├── Learning Rate: 1.91e-05
2025-08-30 09:06:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:06:55 - pico-train - INFO - Step 60950 -- 🔄 Training Metrics
2025-08-30 09:06:55 - pico-train - INFO - ├── Loss: 5.8264
2025-08-30 09:06:55 - pico-train - INFO - ├── Learning Rate: 1.91e-05
2025-08-30 09:06:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:07:08 - pico-train - INFO - Step 60975 -- 🔄 Training Metrics
2025-08-30 09:07:08 - pico-train - INFO - ├── Loss: 5.8425
2025-08-30 09:07:08 - pico-train - INFO - ├── Learning Rate: 1.91e-05
2025-08-30 09:07:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:07:20 - pico-train - INFO - Step 61000 -- 💾 Saving Checkpoint
2025-08-30 09:09:19 - pico-train - INFO - Step 61000 -- 📊 Evaluation Results
2025-08-30 09:09:19 - pico-train - INFO - └── paloma: 7.381800813081388e+30
2025-08-30 09:09:23 - pico-train - INFO - Step 61000 -- 🔄 Training Metrics
2025-08-30 09:09:23 - pico-train - INFO - ├── Loss: 5.8442
2025-08-30 09:09:23 - pico-train - INFO - ├── Learning Rate: 1.91e-05
2025-08-30 09:09:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:09:23 - pico-train - INFO - Step 61000 -- 📈 Saving Learning Dynamics
2025-08-30 09:09:38 - pico-train - INFO - Step 61025 -- 🔄 Training Metrics
2025-08-30 09:09:38 - pico-train - INFO - ├── Loss: 5.9313
2025-08-30 09:09:38 - pico-train - INFO - ├── Learning Rate: 1.91e-05
2025-08-30 09:09:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:09:51 - pico-train - INFO - Step 61050 -- 🔄 Training Metrics
2025-08-30 09:09:51 - pico-train - INFO - ├── Loss: 5.8519
2025-08-30 09:09:51 - pico-train - INFO - ├── Learning Rate: 1.90e-05
2025-08-30 09:09:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:10:03 - pico-train - INFO - Step 61075 -- 🔄 Training Metrics
2025-08-30 09:10:03 - pico-train - INFO - ├── Loss: 5.8725
2025-08-30 09:10:03 - pico-train - INFO - ├── Learning Rate: 1.90e-05
2025-08-30 09:10:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:10:16 - pico-train - INFO - Step 61100 -- 🔄 Training Metrics
2025-08-30 09:10:16 - pico-train - INFO - ├── Loss: 5.8322
2025-08-30 09:10:16 - pico-train - INFO - ├── Learning Rate: 1.90e-05
2025-08-30 09:10:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:10:29 - pico-train - INFO - Step 61125 -- 🔄 Training Metrics
2025-08-30 09:10:29 - pico-train - INFO - ├── Loss: 5.8354
2025-08-30 09:10:29 - pico-train - INFO - ├── Learning Rate: 1.90e-05
2025-08-30 09:10:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:10:41 - pico-train - INFO - Step 61150 -- 🔄 Training Metrics
2025-08-30 09:10:41 - pico-train - INFO - ├── Loss: 5.8735
2025-08-30 09:10:41 - pico-train - INFO - ├── Learning Rate: 1.90e-05
2025-08-30 09:10:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:10:54 - pico-train - INFO - Step 61175 -- 🔄 Training Metrics
2025-08-30 09:10:54 - pico-train - INFO - ├── Loss: 5.9433
2025-08-30 09:10:54 - pico-train - INFO - ├── Learning Rate: 1.89e-05
2025-08-30 09:10:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:11:06 - pico-train - INFO - Step 61200 -- 🔄 Training Metrics
2025-08-30 09:11:06 - pico-train - INFO - ├── Loss: 5.8394
2025-08-30 09:11:06 - pico-train - INFO - ├── Learning Rate: 1.89e-05
2025-08-30 09:11:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:11:19 - pico-train - INFO - Step 61225 -- 🔄 Training Metrics
2025-08-30 09:11:19 - pico-train - INFO - ├── Loss: 5.9396
2025-08-30 09:11:19 - pico-train - INFO - ├── Learning Rate: 1.89e-05
2025-08-30 09:11:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:11:32 - pico-train - INFO - Step 61250 -- 🔄 Training Metrics
2025-08-30 09:11:32 - pico-train - INFO - ├── Loss: 5.8461
2025-08-30 09:11:32 - pico-train - INFO - ├── Learning Rate: 1.89e-05
2025-08-30 09:11:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:11:44 - pico-train - INFO - Step 61275 -- 🔄 Training Metrics
2025-08-30 09:11:44 - pico-train - INFO - ├── Loss: 5.9137
2025-08-30 09:11:44 - pico-train - INFO - ├── Learning Rate: 1.89e-05
2025-08-30 09:11:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:11:57 - pico-train - INFO - Step 61300 -- 🔄 Training Metrics
2025-08-30 09:11:57 - pico-train - INFO - ├── Loss: 5.8249
2025-08-30 09:11:57 - pico-train - INFO - ├── Learning Rate: 1.88e-05
2025-08-30 09:11:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:12:10 - pico-train - INFO - Step 61325 -- 🔄 Training Metrics
2025-08-30 09:12:10 - pico-train - INFO - ├── Loss: 5.8248
2025-08-30 09:12:10 - pico-train - INFO - ├── Learning Rate: 1.88e-05
2025-08-30 09:12:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:12:22 - pico-train - INFO - Step 61350 -- 🔄 Training Metrics
2025-08-30 09:12:22 - pico-train - INFO - ├── Loss: 5.8349
2025-08-30 09:12:22 - pico-train - INFO - ├── Learning Rate: 1.88e-05
2025-08-30 09:12:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:12:35 - pico-train - INFO - Step 61375 -- 🔄 Training Metrics
2025-08-30 09:12:35 - pico-train - INFO - ├── Loss: 5.8265
2025-08-30 09:12:35 - pico-train - INFO - ├── Learning Rate: 1.88e-05
2025-08-30 09:12:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:12:47 - pico-train - INFO - Step 61400 -- 🔄 Training Metrics
2025-08-30 09:12:47 - pico-train - INFO - ├── Loss: 5.8919
2025-08-30 09:12:47 - pico-train - INFO - ├── Learning Rate: 1.87e-05
2025-08-30 09:12:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:13:00 - pico-train - INFO - Step 61425 -- 🔄 Training Metrics
2025-08-30 09:13:00 - pico-train - INFO - ├── Loss: 5.8929
2025-08-30 09:13:00 - pico-train - INFO - ├── Learning Rate: 1.87e-05
2025-08-30 09:13:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:13:13 - pico-train - INFO - Step 61450 -- 🔄 Training Metrics
2025-08-30 09:13:13 - pico-train - INFO - ├── Loss: 5.8063
2025-08-30 09:13:13 - pico-train - INFO - ├── Learning Rate: 1.87e-05
2025-08-30 09:13:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:13:25 - pico-train - INFO - Step 61475 -- 🔄 Training Metrics
2025-08-30 09:13:25 - pico-train - INFO - ├── Loss: 5.8834
2025-08-30 09:13:25 - pico-train - INFO - ├── Learning Rate: 1.87e-05
2025-08-30 09:13:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:13:38 - pico-train - INFO - Step 61500 -- 💾 Saving Checkpoint
2025-08-30 09:15:52 - pico-train - INFO - Step 61500 -- 📊 Evaluation Results
2025-08-30 09:15:52 - pico-train - INFO - └── paloma: 7.5580512131553e+30
2025-08-30 09:15:55 - pico-train - INFO - Step 61500 -- 🔄 Training Metrics
2025-08-30 09:15:55 - pico-train - INFO - ├── Loss: 5.8274
2025-08-30 09:15:55 - pico-train - INFO - ├── Learning Rate: 1.87e-05
2025-08-30 09:15:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:15:55 - pico-train - INFO - Step 61500 -- 📈 Saving Learning Dynamics
2025-08-30 09:16:20 - pico-train - INFO - Step 61525 -- 🔄 Training Metrics
2025-08-30 09:16:20 - pico-train - INFO - ├── Loss: 5.8780
2025-08-30 09:16:20 - pico-train - INFO - ├── Learning Rate: 1.86e-05
2025-08-30 09:16:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:16:33 - pico-train - INFO - Step 61550 -- 🔄 Training Metrics
2025-08-30 09:16:33 - pico-train - INFO - ├── Loss: 5.8784
2025-08-30 09:16:33 - pico-train - INFO - ├── Learning Rate: 1.86e-05
2025-08-30 09:16:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:16:45 - pico-train - INFO - Step 61575 -- 🔄 Training Metrics
2025-08-30 09:16:45 - pico-train - INFO - ├── Loss: 5.8547
2025-08-30 09:16:45 - pico-train - INFO - ├── Learning Rate: 1.86e-05
2025-08-30 09:16:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:16:58 - pico-train - INFO - Step 61600 -- 🔄 Training Metrics
2025-08-30 09:16:58 - pico-train - INFO - ├── Loss: 5.8624
2025-08-30 09:16:58 - pico-train - INFO - ├── Learning Rate: 1.86e-05
2025-08-30 09:16:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:17:11 - pico-train - INFO - Step 61625 -- 🔄 Training Metrics
2025-08-30 09:17:11 - pico-train - INFO - ├── Loss: 5.9047
2025-08-30 09:17:11 - pico-train - INFO - ├── Learning Rate: 1.86e-05
2025-08-30 09:17:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:17:24 - pico-train - INFO - Step 61650 -- 🔄 Training Metrics
2025-08-30 09:17:24 - pico-train - INFO - ├── Loss: 5.8888
2025-08-30 09:17:24 - pico-train - INFO - ├── Learning Rate: 1.85e-05
2025-08-30 09:17:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:17:36 - pico-train - INFO - Step 61675 -- 🔄 Training Metrics
2025-08-30 09:17:36 - pico-train - INFO - ├── Loss: 5.8195
2025-08-30 09:17:36 - pico-train - INFO - ├── Learning Rate: 1.85e-05
2025-08-30 09:17:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:17:49 - pico-train - INFO - Step 61700 -- 🔄 Training Metrics
2025-08-30 09:17:49 - pico-train - INFO - ├── Loss: 5.8452
2025-08-30 09:17:49 - pico-train - INFO - ├── Learning Rate: 1.85e-05
2025-08-30 09:17:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:18:02 - pico-train - INFO - Step 61725 -- 🔄 Training Metrics
2025-08-30 09:18:02 - pico-train - INFO - ├── Loss: 5.9150
2025-08-30 09:18:02 - pico-train - INFO - ├── Learning Rate: 1.85e-05
2025-08-30 09:18:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:18:14 - pico-train - INFO - Step 61750 -- 🔄 Training Metrics
2025-08-30 09:18:14 - pico-train - INFO - ├── Loss: 5.7953
2025-08-30 09:18:14 - pico-train - INFO - ├── Learning Rate: 1.85e-05
2025-08-30 09:18:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:18:27 - pico-train - INFO - Step 61775 -- 🔄 Training Metrics
2025-08-30 09:18:27 - pico-train - INFO - ├── Loss: 5.8075
2025-08-30 09:18:27 - pico-train - INFO - ├── Learning Rate: 1.84e-05
2025-08-30 09:18:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:18:40 - pico-train - INFO - Step 61800 -- 🔄 Training Metrics
2025-08-30 09:18:40 - pico-train - INFO - ├── Loss: 5.8305
2025-08-30 09:18:40 - pico-train - INFO - ├── Learning Rate: 1.84e-05
2025-08-30 09:18:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:18:53 - pico-train - INFO - Step 61825 -- 🔄 Training Metrics
2025-08-30 09:18:53 - pico-train - INFO - ├── Loss: 5.8460
2025-08-30 09:18:53 - pico-train - INFO - ├── Learning Rate: 1.84e-05
2025-08-30 09:18:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:19:05 - pico-train - INFO - Step 61850 -- 🔄 Training Metrics
2025-08-30 09:19:05 - pico-train - INFO - ├── Loss: 5.9274
2025-08-30 09:19:05 - pico-train - INFO - ├── Learning Rate: 1.84e-05
2025-08-30 09:19:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:19:18 - pico-train - INFO - Step 61875 -- 🔄 Training Metrics
2025-08-30 09:19:18 - pico-train - INFO - ├── Loss: 5.8535
2025-08-30 09:19:18 - pico-train - INFO - ├── Learning Rate: 1.84e-05
2025-08-30 09:19:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:19:31 - pico-train - INFO - Step 61900 -- 🔄 Training Metrics
2025-08-30 09:19:31 - pico-train - INFO - ├── Loss: 5.8254
2025-08-30 09:19:31 - pico-train - INFO - ├── Learning Rate: 1.83e-05
2025-08-30 09:19:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:19:44 - pico-train - INFO - Step 61925 -- 🔄 Training Metrics
2025-08-30 09:19:44 - pico-train - INFO - ├── Loss: 5.6957
2025-08-30 09:19:44 - pico-train - INFO - ├── Learning Rate: 1.83e-05
2025-08-30 09:19:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:19:56 - pico-train - INFO - Step 61950 -- 🔄 Training Metrics
2025-08-30 09:19:56 - pico-train - INFO - ├── Loss: 5.8474
2025-08-30 09:19:56 - pico-train - INFO - ├── Learning Rate: 1.83e-05
2025-08-30 09:19:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:20:09 - pico-train - INFO - Step 61975 -- 🔄 Training Metrics
2025-08-30 09:20:09 - pico-train - INFO - ├── Loss: 5.8588
2025-08-30 09:20:09 - pico-train - INFO - ├── Learning Rate: 1.83e-05
2025-08-30 09:20:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:20:26 - pico-train - INFO - Step 62000 -- 💾 Saving Checkpoint
2025-08-30 09:22:33 - pico-train - INFO - Step 62000 -- 📊 Evaluation Results
2025-08-30 09:22:33 - pico-train - INFO - └── paloma: 1.0115134118607476e+31
2025-08-30 09:22:37 - pico-train - INFO - Step 62000 -- 🔄 Training Metrics
2025-08-30 09:22:37 - pico-train - INFO - ├── Loss: 5.8579
2025-08-30 09:22:37 - pico-train - INFO - ├── Learning Rate: 1.83e-05
2025-08-30 09:22:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:22:37 - pico-train - INFO - Step 62000 -- 📈 Saving Learning Dynamics
2025-08-30 09:22:52 - pico-train - INFO - Step 62025 -- 🔄 Training Metrics
2025-08-30 09:22:52 - pico-train - INFO - ├── Loss: 5.8263
2025-08-30 09:22:52 - pico-train - INFO - ├── Learning Rate: 1.82e-05
2025-08-30 09:22:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:23:05 - pico-train - INFO - Step 62050 -- 🔄 Training Metrics
2025-08-30 09:23:05 - pico-train - INFO - ├── Loss: 5.8617
2025-08-30 09:23:05 - pico-train - INFO - ├── Learning Rate: 1.82e-05
2025-08-30 09:23:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:23:18 - pico-train - INFO - Step 62075 -- 🔄 Training Metrics
2025-08-30 09:23:18 - pico-train - INFO - ├── Loss: 5.8762
2025-08-30 09:23:18 - pico-train - INFO - ├── Learning Rate: 1.82e-05
2025-08-30 09:23:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:23:30 - pico-train - INFO - Step 62100 -- 🔄 Training Metrics
2025-08-30 09:23:30 - pico-train - INFO - ├── Loss: 5.8857
2025-08-30 09:23:30 - pico-train - INFO - ├── Learning Rate: 1.82e-05
2025-08-30 09:23:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:23:43 - pico-train - INFO - Step 62125 -- 🔄 Training Metrics
2025-08-30 09:23:43 - pico-train - INFO - ├── Loss: 5.7406
2025-08-30 09:23:43 - pico-train - INFO - ├── Learning Rate: 1.82e-05
2025-08-30 09:23:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:23:55 - pico-train - INFO - Step 62150 -- 🔄 Training Metrics
2025-08-30 09:23:55 - pico-train - INFO - ├── Loss: 5.8648
2025-08-30 09:23:55 - pico-train - INFO - ├── Learning Rate: 1.81e-05
2025-08-30 09:23:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:24:08 - pico-train - INFO - Step 62175 -- 🔄 Training Metrics
2025-08-30 09:24:08 - pico-train - INFO - ├── Loss: 5.8611
2025-08-30 09:24:08 - pico-train - INFO - ├── Learning Rate: 1.81e-05
2025-08-30 09:24:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:24:21 - pico-train - INFO - Step 62200 -- 🔄 Training Metrics
2025-08-30 09:24:21 - pico-train - INFO - ├── Loss: 5.8327
2025-08-30 09:24:21 - pico-train - INFO - ├── Learning Rate: 1.81e-05
2025-08-30 09:24:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:24:33 - pico-train - INFO - Step 62225 -- 🔄 Training Metrics
2025-08-30 09:24:33 - pico-train - INFO - ├── Loss: 5.8680
2025-08-30 09:24:33 - pico-train - INFO - ├── Learning Rate: 1.81e-05
2025-08-30 09:24:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:24:46 - pico-train - INFO - Step 62250 -- 🔄 Training Metrics
2025-08-30 09:24:46 - pico-train - INFO - ├── Loss: 5.8013
2025-08-30 09:24:46 - pico-train - INFO - ├── Learning Rate: 1.80e-05
2025-08-30 09:24:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:24:58 - pico-train - INFO - Step 62275 -- 🔄 Training Metrics
2025-08-30 09:24:58 - pico-train - INFO - ├── Loss: 5.7716
2025-08-30 09:24:58 - pico-train - INFO - ├── Learning Rate: 1.80e-05
2025-08-30 09:24:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:25:11 - pico-train - INFO - Step 62300 -- 🔄 Training Metrics
2025-08-30 09:25:11 - pico-train - INFO - ├── Loss: 5.8227
2025-08-30 09:25:11 - pico-train - INFO - ├── Learning Rate: 1.80e-05
2025-08-30 09:25:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:25:24 - pico-train - INFO - Step 62325 -- 🔄 Training Metrics
2025-08-30 09:25:24 - pico-train - INFO - ├── Loss: 5.8460
2025-08-30 09:25:24 - pico-train - INFO - ├── Learning Rate: 1.80e-05
2025-08-30 09:25:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:25:37 - pico-train - INFO - Step 62350 -- 🔄 Training Metrics
2025-08-30 09:25:37 - pico-train - INFO - ├── Loss: 5.8503
2025-08-30 09:25:37 - pico-train - INFO - ├── Learning Rate: 1.80e-05
2025-08-30 09:25:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:25:49 - pico-train - INFO - Step 62375 -- 🔄 Training Metrics
2025-08-30 09:25:49 - pico-train - INFO - ├── Loss: 5.7188
2025-08-30 09:25:49 - pico-train - INFO - ├── Learning Rate: 1.79e-05
2025-08-30 09:25:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:26:02 - pico-train - INFO - Step 62400 -- 🔄 Training Metrics
2025-08-30 09:26:02 - pico-train - INFO - ├── Loss: 5.8399
2025-08-30 09:26:02 - pico-train - INFO - ├── Learning Rate: 1.79e-05
2025-08-30 09:26:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:26:15 - pico-train - INFO - Step 62425 -- 🔄 Training Metrics
2025-08-30 09:26:15 - pico-train - INFO - ├── Loss: 5.8522
2025-08-30 09:26:15 - pico-train - INFO - ├── Learning Rate: 1.79e-05
2025-08-30 09:26:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:26:27 - pico-train - INFO - Step 62450 -- 🔄 Training Metrics
2025-08-30 09:26:27 - pico-train - INFO - ├── Loss: 5.8175
2025-08-30 09:26:27 - pico-train - INFO - ├── Learning Rate: 1.79e-05
2025-08-30 09:26:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:26:40 - pico-train - INFO - Step 62475 -- 🔄 Training Metrics
2025-08-30 09:26:40 - pico-train - INFO - ├── Loss: 5.9304
2025-08-30 09:26:40 - pico-train - INFO - ├── Learning Rate: 1.79e-05
2025-08-30 09:26:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:26:53 - pico-train - INFO - Step 62500 -- 💾 Saving Checkpoint
2025-08-30 09:28:59 - pico-train - INFO - Step 62500 -- 📊 Evaluation Results
2025-08-30 09:28:59 - pico-train - INFO - └── paloma: 1.026584430453375e+31
2025-08-30 09:29:02 - pico-train - INFO - Step 62500 -- 🔄 Training Metrics
2025-08-30 09:29:02 - pico-train - INFO - ├── Loss: 5.9047
2025-08-30 09:29:02 - pico-train - INFO - ├── Learning Rate: 1.78e-05
2025-08-30 09:29:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:29:02 - pico-train - INFO - Step 62500 -- 📈 Saving Learning Dynamics
2025-08-30 09:29:17 - pico-train - INFO - Step 62525 -- 🔄 Training Metrics
2025-08-30 09:29:17 - pico-train - INFO - ├── Loss: 5.8436
2025-08-30 09:29:17 - pico-train - INFO - ├── Learning Rate: 1.78e-05
2025-08-30 09:29:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:29:29 - pico-train - INFO - Step 62550 -- 🔄 Training Metrics
2025-08-30 09:29:29 - pico-train - INFO - ├── Loss: 5.8456
2025-08-30 09:29:29 - pico-train - INFO - ├── Learning Rate: 1.78e-05
2025-08-30 09:29:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:29:42 - pico-train - INFO - Step 62575 -- 🔄 Training Metrics
2025-08-30 09:29:42 - pico-train - INFO - ├── Loss: 5.8538
2025-08-30 09:29:42 - pico-train - INFO - ├── Learning Rate: 1.78e-05
2025-08-30 09:29:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:29:55 - pico-train - INFO - Step 62600 -- 🔄 Training Metrics
2025-08-30 09:29:55 - pico-train - INFO - ├── Loss: 5.9303
2025-08-30 09:29:55 - pico-train - INFO - ├── Learning Rate: 1.78e-05
2025-08-30 09:29:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:30:08 - pico-train - INFO - Step 62625 -- 🔄 Training Metrics
2025-08-30 09:30:08 - pico-train - INFO - ├── Loss: 5.8303
2025-08-30 09:30:08 - pico-train - INFO - ├── Learning Rate: 1.77e-05
2025-08-30 09:30:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:30:21 - pico-train - INFO - Step 62650 -- 🔄 Training Metrics
2025-08-30 09:30:21 - pico-train - INFO - ├── Loss: 5.8259
2025-08-30 09:30:21 - pico-train - INFO - ├── Learning Rate: 1.77e-05
2025-08-30 09:30:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:30:33 - pico-train - INFO - Step 62675 -- 🔄 Training Metrics
2025-08-30 09:30:33 - pico-train - INFO - ├── Loss: 5.8603
2025-08-30 09:30:33 - pico-train - INFO - ├── Learning Rate: 1.77e-05
2025-08-30 09:30:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:30:46 - pico-train - INFO - Step 62700 -- 🔄 Training Metrics
2025-08-30 09:30:46 - pico-train - INFO - ├── Loss: 5.8287
2025-08-30 09:30:46 - pico-train - INFO - ├── Learning Rate: 1.77e-05
2025-08-30 09:30:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:30:59 - pico-train - INFO - Step 62725 -- 🔄 Training Metrics
2025-08-30 09:30:59 - pico-train - INFO - ├── Loss: 5.8268
2025-08-30 09:30:59 - pico-train - INFO - ├── Learning Rate: 1.77e-05
2025-08-30 09:30:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:31:12 - pico-train - INFO - Step 62750 -- 🔄 Training Metrics
2025-08-30 09:31:12 - pico-train - INFO - ├── Loss: 5.8671
2025-08-30 09:31:12 - pico-train - INFO - ├── Learning Rate: 1.76e-05
2025-08-30 09:31:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:31:25 - pico-train - INFO - Step 62775 -- 🔄 Training Metrics
2025-08-30 09:31:25 - pico-train - INFO - ├── Loss: 5.7714
2025-08-30 09:31:25 - pico-train - INFO - ├── Learning Rate: 1.76e-05
2025-08-30 09:31:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:31:37 - pico-train - INFO - Step 62800 -- 🔄 Training Metrics
2025-08-30 09:31:37 - pico-train - INFO - ├── Loss: 5.8034
2025-08-30 09:31:37 - pico-train - INFO - ├── Learning Rate: 1.76e-05
2025-08-30 09:31:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:31:50 - pico-train - INFO - Step 62825 -- 🔄 Training Metrics
2025-08-30 09:31:50 - pico-train - INFO - ├── Loss: 5.8833
2025-08-30 09:31:50 - pico-train - INFO - ├── Learning Rate: 1.76e-05
2025-08-30 09:31:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:32:03 - pico-train - INFO - Step 62850 -- 🔄 Training Metrics
2025-08-30 09:32:03 - pico-train - INFO - ├── Loss: 5.7885
2025-08-30 09:32:03 - pico-train - INFO - ├── Learning Rate: 1.76e-05
2025-08-30 09:32:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:32:15 - pico-train - INFO - Step 62875 -- 🔄 Training Metrics
2025-08-30 09:32:15 - pico-train - INFO - ├── Loss: 5.8884
2025-08-30 09:32:15 - pico-train - INFO - ├── Learning Rate: 1.75e-05
2025-08-30 09:32:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:32:28 - pico-train - INFO - Step 62900 -- 🔄 Training Metrics
2025-08-30 09:32:28 - pico-train - INFO - ├── Loss: 5.7919
2025-08-30 09:32:28 - pico-train - INFO - ├── Learning Rate: 1.75e-05
2025-08-30 09:32:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:32:41 - pico-train - INFO - Step 62925 -- 🔄 Training Metrics
2025-08-30 09:32:41 - pico-train - INFO - ├── Loss: 5.8612
2025-08-30 09:32:41 - pico-train - INFO - ├── Learning Rate: 1.75e-05
2025-08-30 09:32:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:32:54 - pico-train - INFO - Step 62950 -- 🔄 Training Metrics
2025-08-30 09:32:54 - pico-train - INFO - ├── Loss: 5.7049
2025-08-30 09:32:54 - pico-train - INFO - ├── Learning Rate: 1.75e-05
2025-08-30 09:32:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:33:06 - pico-train - INFO - Step 62975 -- 🔄 Training Metrics
2025-08-30 09:33:06 - pico-train - INFO - ├── Loss: 5.8447
2025-08-30 09:33:06 - pico-train - INFO - ├── Learning Rate: 1.75e-05
2025-08-30 09:33:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:33:19 - pico-train - INFO - Step 63000 -- 💾 Saving Checkpoint
2025-08-30 09:35:29 - pico-train - INFO - Step 63000 -- 📊 Evaluation Results
2025-08-30 09:35:29 - pico-train - INFO - └── paloma: 1.053901252110863e+31
2025-08-30 09:35:31 - pico-train - INFO - Step 63000 -- 🔄 Training Metrics
2025-08-30 09:35:31 - pico-train - INFO - ├── Loss: 5.8600
2025-08-30 09:35:31 - pico-train - INFO - ├── Learning Rate: 1.74e-05
2025-08-30 09:35:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:35:31 - pico-train - INFO - Step 63000 -- 📈 Saving Learning Dynamics
2025-08-30 09:35:46 - pico-train - INFO - Step 63025 -- 🔄 Training Metrics
2025-08-30 09:35:46 - pico-train - INFO - ├── Loss: 5.8323
2025-08-30 09:35:46 - pico-train - INFO - ├── Learning Rate: 1.74e-05
2025-08-30 09:35:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:35:59 - pico-train - INFO - Step 63050 -- 🔄 Training Metrics
2025-08-30 09:35:59 - pico-train - INFO - ├── Loss: 5.7825
2025-08-30 09:35:59 - pico-train - INFO - ├── Learning Rate: 1.74e-05
2025-08-30 09:35:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:36:12 - pico-train - INFO - Step 63075 -- 🔄 Training Metrics
2025-08-30 09:36:12 - pico-train - INFO - ├── Loss: 5.8469
2025-08-30 09:36:12 - pico-train - INFO - ├── Learning Rate: 1.74e-05
2025-08-30 09:36:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:36:24 - pico-train - INFO - Step 63100 -- 🔄 Training Metrics
2025-08-30 09:36:24 - pico-train - INFO - ├── Loss: 5.8636
2025-08-30 09:36:24 - pico-train - INFO - ├── Learning Rate: 1.74e-05
2025-08-30 09:36:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:36:37 - pico-train - INFO - Step 63125 -- 🔄 Training Metrics
2025-08-30 09:36:37 - pico-train - INFO - ├── Loss: 5.8131
2025-08-30 09:36:37 - pico-train - INFO - ├── Learning Rate: 1.73e-05
2025-08-30 09:36:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:36:50 - pico-train - INFO - Step 63150 -- 🔄 Training Metrics
2025-08-30 09:36:50 - pico-train - INFO - ├── Loss: 5.8570
2025-08-30 09:36:50 - pico-train - INFO - ├── Learning Rate: 1.73e-05
2025-08-30 09:36:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:37:02 - pico-train - INFO - Step 63175 -- 🔄 Training Metrics
2025-08-30 09:37:02 - pico-train - INFO - ├── Loss: 5.9120
2025-08-30 09:37:02 - pico-train - INFO - ├── Learning Rate: 1.73e-05
2025-08-30 09:37:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:37:15 - pico-train - INFO - Step 63200 -- 🔄 Training Metrics
2025-08-30 09:37:15 - pico-train - INFO - ├── Loss: 5.7894
2025-08-30 09:37:15 - pico-train - INFO - ├── Learning Rate: 1.73e-05
2025-08-30 09:37:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:37:28 - pico-train - INFO - Step 63225 -- 🔄 Training Metrics
2025-08-30 09:37:28 - pico-train - INFO - ├── Loss: 5.7796
2025-08-30 09:37:28 - pico-train - INFO - ├── Learning Rate: 1.73e-05
2025-08-30 09:37:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:37:41 - pico-train - INFO - Step 63250 -- 🔄 Training Metrics
2025-08-30 09:37:41 - pico-train - INFO - ├── Loss: 5.7788
2025-08-30 09:37:41 - pico-train - INFO - ├── Learning Rate: 1.72e-05
2025-08-30 09:37:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:37:53 - pico-train - INFO - Step 63275 -- 🔄 Training Metrics
2025-08-30 09:37:53 - pico-train - INFO - ├── Loss: 5.9341
2025-08-30 09:37:53 - pico-train - INFO - ├── Learning Rate: 1.72e-05
2025-08-30 09:37:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:38:06 - pico-train - INFO - Step 63300 -- 🔄 Training Metrics
2025-08-30 09:38:06 - pico-train - INFO - ├── Loss: 5.7428
2025-08-30 09:38:06 - pico-train - INFO - ├── Learning Rate: 1.72e-05
2025-08-30 09:38:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:38:19 - pico-train - INFO - Step 63325 -- 🔄 Training Metrics
2025-08-30 09:38:19 - pico-train - INFO - ├── Loss: 5.8475
2025-08-30 09:38:19 - pico-train - INFO - ├── Learning Rate: 1.72e-05
2025-08-30 09:38:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:38:31 - pico-train - INFO - Step 63350 -- 🔄 Training Metrics
2025-08-30 09:38:31 - pico-train - INFO - ├── Loss: 5.8675
2025-08-30 09:38:31 - pico-train - INFO - ├── Learning Rate: 1.72e-05
2025-08-30 09:38:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:38:44 - pico-train - INFO - Step 63375 -- 🔄 Training Metrics
2025-08-30 09:38:44 - pico-train - INFO - ├── Loss: 5.8387
2025-08-30 09:38:44 - pico-train - INFO - ├── Learning Rate: 1.71e-05
2025-08-30 09:38:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:38:57 - pico-train - INFO - Step 63400 -- 🔄 Training Metrics
2025-08-30 09:38:57 - pico-train - INFO - ├── Loss: 5.8082
2025-08-30 09:38:57 - pico-train - INFO - ├── Learning Rate: 1.71e-05
2025-08-30 09:38:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:39:09 - pico-train - INFO - Step 63425 -- 🔄 Training Metrics
2025-08-30 09:39:09 - pico-train - INFO - ├── Loss: 5.8823
2025-08-30 09:39:09 - pico-train - INFO - ├── Learning Rate: 1.71e-05
2025-08-30 09:39:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:39:22 - pico-train - INFO - Step 63450 -- 🔄 Training Metrics
2025-08-30 09:39:22 - pico-train - INFO - ├── Loss: 5.8131
2025-08-30 09:39:22 - pico-train - INFO - ├── Learning Rate: 1.71e-05
2025-08-30 09:39:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:39:35 - pico-train - INFO - Step 63475 -- 🔄 Training Metrics
2025-08-30 09:39:35 - pico-train - INFO - ├── Loss: 5.8368
2025-08-30 09:39:35 - pico-train - INFO - ├── Learning Rate: 1.71e-05
2025-08-30 09:39:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:39:48 - pico-train - INFO - Step 63500 -- 💾 Saving Checkpoint
2025-08-30 09:41:56 - pico-train - INFO - Step 63500 -- 📊 Evaluation Results
2025-08-30 09:41:56 - pico-train - INFO - └── paloma: 1.3798321560822609e+31
2025-08-30 09:41:59 - pico-train - INFO - Step 63500 -- 🔄 Training Metrics
2025-08-30 09:41:59 - pico-train - INFO - ├── Loss: 5.8774
2025-08-30 09:41:59 - pico-train - INFO - ├── Learning Rate: 1.70e-05
2025-08-30 09:41:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:41:59 - pico-train - INFO - Step 63500 -- 📈 Saving Learning Dynamics
2025-08-30 09:42:14 - pico-train - INFO - Step 63525 -- 🔄 Training Metrics
2025-08-30 09:42:14 - pico-train - INFO - ├── Loss: 5.8403
2025-08-30 09:42:14 - pico-train - INFO - ├── Learning Rate: 1.70e-05
2025-08-30 09:42:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:42:27 - pico-train - INFO - Step 63550 -- 🔄 Training Metrics
2025-08-30 09:42:27 - pico-train - INFO - ├── Loss: 5.8268
2025-08-30 09:42:27 - pico-train - INFO - ├── Learning Rate: 1.70e-05
2025-08-30 09:42:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:42:40 - pico-train - INFO - Step 63575 -- 🔄 Training Metrics
2025-08-30 09:42:40 - pico-train - INFO - ├── Loss: 5.8713
2025-08-30 09:42:40 - pico-train - INFO - ├── Learning Rate: 1.70e-05
2025-08-30 09:42:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:42:52 - pico-train - INFO - Step 63600 -- 🔄 Training Metrics
2025-08-30 09:42:52 - pico-train - INFO - ├── Loss: 5.9887
2025-08-30 09:42:52 - pico-train - INFO - ├── Learning Rate: 1.70e-05
2025-08-30 09:42:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:43:05 - pico-train - INFO - Step 63625 -- 🔄 Training Metrics
2025-08-30 09:43:05 - pico-train - INFO - ├── Loss: 5.7719
2025-08-30 09:43:05 - pico-train - INFO - ├── Learning Rate: 1.69e-05
2025-08-30 09:43:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:43:18 - pico-train - INFO - Step 63650 -- 🔄 Training Metrics
2025-08-30 09:43:18 - pico-train - INFO - ├── Loss: 5.9020
2025-08-30 09:43:18 - pico-train - INFO - ├── Learning Rate: 1.69e-05
2025-08-30 09:43:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:43:30 - pico-train - INFO - Step 63675 -- 🔄 Training Metrics
2025-08-30 09:43:30 - pico-train - INFO - ├── Loss: 5.7964
2025-08-30 09:43:30 - pico-train - INFO - ├── Learning Rate: 1.69e-05
2025-08-30 09:43:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:43:43 - pico-train - INFO - Step 63700 -- 🔄 Training Metrics
2025-08-30 09:43:43 - pico-train - INFO - ├── Loss: 5.7920
2025-08-30 09:43:43 - pico-train - INFO - ├── Learning Rate: 1.69e-05
2025-08-30 09:43:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:43:55 - pico-train - INFO - Step 63725 -- 🔄 Training Metrics
2025-08-30 09:43:55 - pico-train - INFO - ├── Loss: 5.7781
2025-08-30 09:43:55 - pico-train - INFO - ├── Learning Rate: 1.68e-05
2025-08-30 09:43:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:44:08 - pico-train - INFO - Step 63750 -- 🔄 Training Metrics
2025-08-30 09:44:08 - pico-train - INFO - ├── Loss: 5.8701
2025-08-30 09:44:08 - pico-train - INFO - ├── Learning Rate: 1.68e-05
2025-08-30 09:44:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:44:21 - pico-train - INFO - Step 63775 -- 🔄 Training Metrics
2025-08-30 09:44:21 - pico-train - INFO - ├── Loss: 5.7957
2025-08-30 09:44:21 - pico-train - INFO - ├── Learning Rate: 1.68e-05
2025-08-30 09:44:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:44:33 - pico-train - INFO - Step 63800 -- 🔄 Training Metrics
2025-08-30 09:44:33 - pico-train - INFO - ├── Loss: 5.8493
2025-08-30 09:44:33 - pico-train - INFO - ├── Learning Rate: 1.68e-05
2025-08-30 09:44:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:44:46 - pico-train - INFO - Step 63825 -- 🔄 Training Metrics
2025-08-30 09:44:46 - pico-train - INFO - ├── Loss: 5.8591
2025-08-30 09:44:46 - pico-train - INFO - ├── Learning Rate: 1.68e-05
2025-08-30 09:44:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:44:59 - pico-train - INFO - Step 63850 -- 🔄 Training Metrics
2025-08-30 09:44:59 - pico-train - INFO - ├── Loss: 5.9283
2025-08-30 09:44:59 - pico-train - INFO - ├── Learning Rate: 1.67e-05
2025-08-30 09:44:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:45:12 - pico-train - INFO - Step 63875 -- 🔄 Training Metrics
2025-08-30 09:45:12 - pico-train - INFO - ├── Loss: 5.8760
2025-08-30 09:45:12 - pico-train - INFO - ├── Learning Rate: 1.67e-05
2025-08-30 09:45:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:45:24 - pico-train - INFO - Step 63900 -- 🔄 Training Metrics
2025-08-30 09:45:24 - pico-train - INFO - ├── Loss: 5.8496
2025-08-30 09:45:24 - pico-train - INFO - ├── Learning Rate: 1.67e-05
2025-08-30 09:45:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:45:37 - pico-train - INFO - Step 63925 -- 🔄 Training Metrics
2025-08-30 09:45:37 - pico-train - INFO - ├── Loss: 5.7896
2025-08-30 09:45:37 - pico-train - INFO - ├── Learning Rate: 1.67e-05
2025-08-30 09:45:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:45:50 - pico-train - INFO - Step 63950 -- 🔄 Training Metrics
2025-08-30 09:45:50 - pico-train - INFO - ├── Loss: 5.8621
2025-08-30 09:45:50 - pico-train - INFO - ├── Learning Rate: 1.67e-05
2025-08-30 09:45:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:46:02 - pico-train - INFO - Step 63975 -- 🔄 Training Metrics
2025-08-30 09:46:02 - pico-train - INFO - ├── Loss: 5.8765
2025-08-30 09:46:02 - pico-train - INFO - ├── Learning Rate: 1.66e-05
2025-08-30 09:46:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:46:15 - pico-train - INFO - Step 64000 -- 💾 Saving Checkpoint
2025-08-30 09:48:17 - pico-train - INFO - Step 64000 -- 📊 Evaluation Results
2025-08-30 09:48:17 - pico-train - INFO - └── paloma: 1.5176259204672668e+31
2025-08-30 09:48:20 - pico-train - INFO - Step 64000 -- 🔄 Training Metrics
2025-08-30 09:48:20 - pico-train - INFO - ├── Loss: 5.9281
2025-08-30 09:48:20 - pico-train - INFO - ├── Learning Rate: 1.66e-05
2025-08-30 09:48:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:48:20 - pico-train - INFO - Step 64000 -- 📈 Saving Learning Dynamics
2025-08-30 09:48:35 - pico-train - INFO - Step 64025 -- 🔄 Training Metrics
2025-08-30 09:48:35 - pico-train - INFO - ├── Loss: 5.8790
2025-08-30 09:48:35 - pico-train - INFO - ├── Learning Rate: 1.66e-05
2025-08-30 09:48:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:48:47 - pico-train - INFO - Step 64050 -- 🔄 Training Metrics
2025-08-30 09:48:47 - pico-train - INFO - ├── Loss: 5.8652
2025-08-30 09:48:47 - pico-train - INFO - ├── Learning Rate: 1.66e-05
2025-08-30 09:48:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:49:00 - pico-train - INFO - Step 64075 -- 🔄 Training Metrics
2025-08-30 09:49:00 - pico-train - INFO - ├── Loss: 5.8631
2025-08-30 09:49:00 - pico-train - INFO - ├── Learning Rate: 1.66e-05
2025-08-30 09:49:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:49:12 - pico-train - INFO - Step 64100 -- 🔄 Training Metrics
2025-08-30 09:49:12 - pico-train - INFO - ├── Loss: 5.8123
2025-08-30 09:49:12 - pico-train - INFO - ├── Learning Rate: 1.65e-05
2025-08-30 09:49:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:49:25 - pico-train - INFO - Step 64125 -- 🔄 Training Metrics
2025-08-30 09:49:25 - pico-train - INFO - ├── Loss: 5.8136
2025-08-30 09:49:25 - pico-train - INFO - ├── Learning Rate: 1.65e-05
2025-08-30 09:49:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:49:38 - pico-train - INFO - Step 64150 -- 🔄 Training Metrics
2025-08-30 09:49:38 - pico-train - INFO - ├── Loss: 5.8727
2025-08-30 09:49:38 - pico-train - INFO - ├── Learning Rate: 1.65e-05
2025-08-30 09:49:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:49:50 - pico-train - INFO - Step 64175 -- 🔄 Training Metrics
2025-08-30 09:49:50 - pico-train - INFO - ├── Loss: 5.8386
2025-08-30 09:49:50 - pico-train - INFO - ├── Learning Rate: 1.65e-05
2025-08-30 09:49:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:50:03 - pico-train - INFO - Step 64200 -- 🔄 Training Metrics
2025-08-30 09:50:03 - pico-train - INFO - ├── Loss: 5.8189
2025-08-30 09:50:03 - pico-train - INFO - ├── Learning Rate: 1.65e-05
2025-08-30 09:50:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:50:16 - pico-train - INFO - Step 64225 -- 🔄 Training Metrics
2025-08-30 09:50:16 - pico-train - INFO - ├── Loss: 5.8936
2025-08-30 09:50:16 - pico-train - INFO - ├── Learning Rate: 1.64e-05
2025-08-30 09:50:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:50:29 - pico-train - INFO - Step 64250 -- 🔄 Training Metrics
2025-08-30 09:50:29 - pico-train - INFO - ├── Loss: 5.8517
2025-08-30 09:50:29 - pico-train - INFO - ├── Learning Rate: 1.64e-05
2025-08-30 09:50:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:50:41 - pico-train - INFO - Step 64275 -- 🔄 Training Metrics
2025-08-30 09:50:41 - pico-train - INFO - ├── Loss: 5.9134
2025-08-30 09:50:41 - pico-train - INFO - ├── Learning Rate: 1.64e-05
2025-08-30 09:50:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:50:54 - pico-train - INFO - Step 64300 -- 🔄 Training Metrics
2025-08-30 09:50:54 - pico-train - INFO - ├── Loss: 5.8338
2025-08-30 09:50:54 - pico-train - INFO - ├── Learning Rate: 1.64e-05
2025-08-30 09:50:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:51:07 - pico-train - INFO - Step 64325 -- 🔄 Training Metrics
2025-08-30 09:51:07 - pico-train - INFO - ├── Loss: 5.9309
2025-08-30 09:51:07 - pico-train - INFO - ├── Learning Rate: 1.64e-05
2025-08-30 09:51:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:51:19 - pico-train - INFO - Step 64350 -- 🔄 Training Metrics
2025-08-30 09:51:19 - pico-train - INFO - ├── Loss: 5.8091
2025-08-30 09:51:19 - pico-train - INFO - ├── Learning Rate: 1.63e-05
2025-08-30 09:51:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:51:32 - pico-train - INFO - Step 64375 -- 🔄 Training Metrics
2025-08-30 09:51:32 - pico-train - INFO - ├── Loss: 5.8666
2025-08-30 09:51:32 - pico-train - INFO - ├── Learning Rate: 1.63e-05
2025-08-30 09:51:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:51:44 - pico-train - INFO - Step 64400 -- 🔄 Training Metrics
2025-08-30 09:51:44 - pico-train - INFO - ├── Loss: 5.7732
2025-08-30 09:51:44 - pico-train - INFO - ├── Learning Rate: 1.63e-05
2025-08-30 09:51:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:51:57 - pico-train - INFO - Step 64425 -- 🔄 Training Metrics
2025-08-30 09:51:57 - pico-train - INFO - ├── Loss: 5.8354
2025-08-30 09:51:57 - pico-train - INFO - ├── Learning Rate: 1.63e-05
2025-08-30 09:51:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:52:10 - pico-train - INFO - Step 64450 -- 🔄 Training Metrics
2025-08-30 09:52:10 - pico-train - INFO - ├── Loss: 5.8674
2025-08-30 09:52:10 - pico-train - INFO - ├── Learning Rate: 1.63e-05
2025-08-30 09:52:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:52:23 - pico-train - INFO - Step 64475 -- 🔄 Training Metrics
2025-08-30 09:52:23 - pico-train - INFO - ├── Loss: 5.8365
2025-08-30 09:52:23 - pico-train - INFO - ├── Learning Rate: 1.62e-05
2025-08-30 09:52:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:52:36 - pico-train - INFO - Step 64500 -- 💾 Saving Checkpoint
2025-08-30 09:54:35 - pico-train - INFO - Step 64500 -- 📊 Evaluation Results
2025-08-30 09:54:35 - pico-train - INFO - └── paloma: 1.7413715227937596e+31
2025-08-30 09:54:37 - pico-train - INFO - Step 64500 -- 🔄 Training Metrics
2025-08-30 09:54:37 - pico-train - INFO - ├── Loss: 5.7904
2025-08-30 09:54:37 - pico-train - INFO - ├── Learning Rate: 1.62e-05
2025-08-30 09:54:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:54:37 - pico-train - INFO - Step 64500 -- 📈 Saving Learning Dynamics
2025-08-30 09:54:52 - pico-train - INFO - Step 64525 -- 🔄 Training Metrics
2025-08-30 09:54:52 - pico-train - INFO - ├── Loss: 5.7861
2025-08-30 09:54:52 - pico-train - INFO - ├── Learning Rate: 1.62e-05
2025-08-30 09:54:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:55:05 - pico-train - INFO - Step 64550 -- 🔄 Training Metrics
2025-08-30 09:55:05 - pico-train - INFO - ├── Loss: 5.7797
2025-08-30 09:55:05 - pico-train - INFO - ├── Learning Rate: 1.62e-05
2025-08-30 09:55:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:55:18 - pico-train - INFO - Step 64575 -- 🔄 Training Metrics
2025-08-30 09:55:18 - pico-train - INFO - ├── Loss: 5.7777
2025-08-30 09:55:18 - pico-train - INFO - ├── Learning Rate: 1.62e-05
2025-08-30 09:55:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:55:30 - pico-train - INFO - Step 64600 -- 🔄 Training Metrics
2025-08-30 09:55:30 - pico-train - INFO - ├── Loss: 5.8649
2025-08-30 09:55:30 - pico-train - INFO - ├── Learning Rate: 1.61e-05
2025-08-30 09:55:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:55:43 - pico-train - INFO - Step 64625 -- 🔄 Training Metrics
2025-08-30 09:55:43 - pico-train - INFO - ├── Loss: 5.8215
2025-08-30 09:55:43 - pico-train - INFO - ├── Learning Rate: 1.61e-05
2025-08-30 09:55:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:55:56 - pico-train - INFO - Step 64650 -- 🔄 Training Metrics
2025-08-30 09:55:56 - pico-train - INFO - ├── Loss: 5.8024
2025-08-30 09:55:56 - pico-train - INFO - ├── Learning Rate: 1.61e-05
2025-08-30 09:55:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:56:08 - pico-train - INFO - Step 64675 -- 🔄 Training Metrics
2025-08-30 09:56:08 - pico-train - INFO - ├── Loss: 5.8857
2025-08-30 09:56:08 - pico-train - INFO - ├── Learning Rate: 1.61e-05
2025-08-30 09:56:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:56:21 - pico-train - INFO - Step 64700 -- 🔄 Training Metrics
2025-08-30 09:56:21 - pico-train - INFO - ├── Loss: 5.7671
2025-08-30 09:56:21 - pico-train - INFO - ├── Learning Rate: 1.61e-05
2025-08-30 09:56:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:56:34 - pico-train - INFO - Step 64725 -- 🔄 Training Metrics
2025-08-30 09:56:34 - pico-train - INFO - ├── Loss: 5.8027
2025-08-30 09:56:34 - pico-train - INFO - ├── Learning Rate: 1.60e-05
2025-08-30 09:56:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:56:47 - pico-train - INFO - Step 64750 -- 🔄 Training Metrics
2025-08-30 09:56:47 - pico-train - INFO - ├── Loss: 5.8995
2025-08-30 09:56:47 - pico-train - INFO - ├── Learning Rate: 1.60e-05
2025-08-30 09:56:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:56:59 - pico-train - INFO - Step 64775 -- 🔄 Training Metrics
2025-08-30 09:56:59 - pico-train - INFO - ├── Loss: 5.7634
2025-08-30 09:56:59 - pico-train - INFO - ├── Learning Rate: 1.60e-05
2025-08-30 09:56:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:57:12 - pico-train - INFO - Step 64800 -- 🔄 Training Metrics
2025-08-30 09:57:12 - pico-train - INFO - ├── Loss: 5.8010
2025-08-30 09:57:12 - pico-train - INFO - ├── Learning Rate: 1.60e-05
2025-08-30 09:57:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:57:25 - pico-train - INFO - Step 64825 -- 🔄 Training Metrics
2025-08-30 09:57:25 - pico-train - INFO - ├── Loss: 5.7916
2025-08-30 09:57:25 - pico-train - INFO - ├── Learning Rate: 1.60e-05
2025-08-30 09:57:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:57:38 - pico-train - INFO - Step 64850 -- 🔄 Training Metrics
2025-08-30 09:57:38 - pico-train - INFO - ├── Loss: 5.7833
2025-08-30 09:57:38 - pico-train - INFO - ├── Learning Rate: 1.59e-05
2025-08-30 09:57:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:57:50 - pico-train - INFO - Step 64875 -- 🔄 Training Metrics
2025-08-30 09:57:50 - pico-train - INFO - ├── Loss: 5.8170
2025-08-30 09:57:50 - pico-train - INFO - ├── Learning Rate: 1.59e-05
2025-08-30 09:57:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:58:03 - pico-train - INFO - Step 64900 -- 🔄 Training Metrics
2025-08-30 09:58:03 - pico-train - INFO - ├── Loss: 5.8529
2025-08-30 09:58:03 - pico-train - INFO - ├── Learning Rate: 1.59e-05
2025-08-30 09:58:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:58:16 - pico-train - INFO - Step 64925 -- 🔄 Training Metrics
2025-08-30 09:58:16 - pico-train - INFO - ├── Loss: 5.8294
2025-08-30 09:58:16 - pico-train - INFO - ├── Learning Rate: 1.59e-05
2025-08-30 09:58:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:58:28 - pico-train - INFO - Step 64950 -- 🔄 Training Metrics
2025-08-30 09:58:28 - pico-train - INFO - ├── Loss: 5.8264
2025-08-30 09:58:28 - pico-train - INFO - ├── Learning Rate: 1.59e-05
2025-08-30 09:58:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:58:41 - pico-train - INFO - Step 64975 -- 🔄 Training Metrics
2025-08-30 09:58:41 - pico-train - INFO - ├── Loss: 5.7959
2025-08-30 09:58:41 - pico-train - INFO - ├── Learning Rate: 1.58e-05
2025-08-30 09:58:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 09:58:54 - pico-train - INFO - Step 65000 -- 💾 Saving Checkpoint
2025-08-30 10:00:52 - pico-train - INFO - Step 65000 -- 📊 Evaluation Results
2025-08-30 10:00:52 - pico-train - INFO - └── paloma: 1.9165716287373382e+31
2025-08-30 10:00:55 - pico-train - INFO - Step 65000 -- 🔄 Training Metrics
2025-08-30 10:00:55 - pico-train - INFO - ├── Loss: 5.8632
2025-08-30 10:00:55 - pico-train - INFO - ├── Learning Rate: 1.58e-05
2025-08-30 10:00:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:00:55 - pico-train - INFO - Step 65000 -- 📈 Saving Learning Dynamics
2025-08-30 10:01:10 - pico-train - INFO - Step 65025 -- 🔄 Training Metrics
2025-08-30 10:01:10 - pico-train - INFO - ├── Loss: 5.8177
2025-08-30 10:01:10 - pico-train - INFO - ├── Learning Rate: 1.58e-05
2025-08-30 10:01:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:01:23 - pico-train - INFO - Step 65050 -- 🔄 Training Metrics
2025-08-30 10:01:23 - pico-train - INFO - ├── Loss: 5.7954
2025-08-30 10:01:23 - pico-train - INFO - ├── Learning Rate: 1.58e-05
2025-08-30 10:01:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:01:35 - pico-train - INFO - Step 65075 -- 🔄 Training Metrics
2025-08-30 10:01:35 - pico-train - INFO - ├── Loss: 5.7900
2025-08-30 10:01:35 - pico-train - INFO - ├── Learning Rate: 1.58e-05
2025-08-30 10:01:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:01:48 - pico-train - INFO - Step 65100 -- 🔄 Training Metrics
2025-08-30 10:01:48 - pico-train - INFO - ├── Loss: 5.8748
2025-08-30 10:01:48 - pico-train - INFO - ├── Learning Rate: 1.57e-05
2025-08-30 10:01:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:02:00 - pico-train - INFO - Step 65125 -- 🔄 Training Metrics
2025-08-30 10:02:00 - pico-train - INFO - ├── Loss: 5.8848
2025-08-30 10:02:00 - pico-train - INFO - ├── Learning Rate: 1.57e-05
2025-08-30 10:02:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:02:13 - pico-train - INFO - Step 65150 -- 🔄 Training Metrics
2025-08-30 10:02:13 - pico-train - INFO - ├── Loss: 5.8230
2025-08-30 10:02:13 - pico-train - INFO - ├── Learning Rate: 1.57e-05
2025-08-30 10:02:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:02:26 - pico-train - INFO - Step 65175 -- 🔄 Training Metrics
2025-08-30 10:02:26 - pico-train - INFO - ├── Loss: 5.8187
2025-08-30 10:02:26 - pico-train - INFO - ├── Learning Rate: 1.57e-05
2025-08-30 10:02:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:02:38 - pico-train - INFO - Step 65200 -- 🔄 Training Metrics
2025-08-30 10:02:38 - pico-train - INFO - ├── Loss: 5.7594
2025-08-30 10:02:38 - pico-train - INFO - ├── Learning Rate: 1.57e-05
2025-08-30 10:02:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:02:51 - pico-train - INFO - Step 65225 -- 🔄 Training Metrics
2025-08-30 10:02:51 - pico-train - INFO - ├── Loss: 5.8269
2025-08-30 10:02:51 - pico-train - INFO - ├── Learning Rate: 1.57e-05
2025-08-30 10:02:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:03:03 - pico-train - INFO - Step 65250 -- 🔄 Training Metrics
2025-08-30 10:03:03 - pico-train - INFO - ├── Loss: 5.8085
2025-08-30 10:03:03 - pico-train - INFO - ├── Learning Rate: 1.56e-05
2025-08-30 10:03:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:03:16 - pico-train - INFO - Step 65275 -- 🔄 Training Metrics
2025-08-30 10:03:16 - pico-train - INFO - ├── Loss: 5.7563
2025-08-30 10:03:16 - pico-train - INFO - ├── Learning Rate: 1.56e-05
2025-08-30 10:03:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:03:29 - pico-train - INFO - Step 65300 -- 🔄 Training Metrics
2025-08-30 10:03:29 - pico-train - INFO - ├── Loss: 5.8133
2025-08-30 10:03:29 - pico-train - INFO - ├── Learning Rate: 1.56e-05
2025-08-30 10:03:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:03:41 - pico-train - INFO - Step 65325 -- 🔄 Training Metrics
2025-08-30 10:03:41 - pico-train - INFO - ├── Loss: 5.8193
2025-08-30 10:03:41 - pico-train - INFO - ├── Learning Rate: 1.56e-05
2025-08-30 10:03:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:03:54 - pico-train - INFO - Step 65350 -- 🔄 Training Metrics
2025-08-30 10:03:54 - pico-train - INFO - ├── Loss: 5.8060
2025-08-30 10:03:54 - pico-train - INFO - ├── Learning Rate: 1.56e-05
2025-08-30 10:03:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:04:06 - pico-train - INFO - Step 65375 -- 🔄 Training Metrics
2025-08-30 10:04:06 - pico-train - INFO - ├── Loss: 5.8249
2025-08-30 10:04:06 - pico-train - INFO - ├── Learning Rate: 1.55e-05
2025-08-30 10:04:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:04:19 - pico-train - INFO - Step 65400 -- 🔄 Training Metrics
2025-08-30 10:04:19 - pico-train - INFO - ├── Loss: 5.8455
2025-08-30 10:04:19 - pico-train - INFO - ├── Learning Rate: 1.55e-05
2025-08-30 10:04:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:04:32 - pico-train - INFO - Step 65425 -- 🔄 Training Metrics
2025-08-30 10:04:32 - pico-train - INFO - ├── Loss: 5.8625
2025-08-30 10:04:32 - pico-train - INFO - ├── Learning Rate: 1.55e-05
2025-08-30 10:04:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:04:44 - pico-train - INFO - Step 65450 -- 🔄 Training Metrics
2025-08-30 10:04:44 - pico-train - INFO - ├── Loss: 5.8366
2025-08-30 10:04:44 - pico-train - INFO - ├── Learning Rate: 1.55e-05
2025-08-30 10:04:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:04:57 - pico-train - INFO - Step 65475 -- 🔄 Training Metrics
2025-08-30 10:04:57 - pico-train - INFO - ├── Loss: 5.8005
2025-08-30 10:04:57 - pico-train - INFO - ├── Learning Rate: 1.55e-05
2025-08-30 10:04:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:05:09 - pico-train - INFO - Step 65500 -- 💾 Saving Checkpoint
2025-08-30 10:07:04 - pico-train - INFO - Step 65500 -- 📊 Evaluation Results
2025-08-30 10:07:04 - pico-train - INFO - └── paloma: 1.8707850216569984e+31
2025-08-30 10:07:06 - pico-train - INFO - Step 65500 -- 🔄 Training Metrics
2025-08-30 10:07:06 - pico-train - INFO - ├── Loss: 5.8969
2025-08-30 10:07:06 - pico-train - INFO - ├── Learning Rate: 1.54e-05
2025-08-30 10:07:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:07:06 - pico-train - INFO - Step 65500 -- 📈 Saving Learning Dynamics
2025-08-30 10:07:21 - pico-train - INFO - Step 65525 -- 🔄 Training Metrics
2025-08-30 10:07:21 - pico-train - INFO - ├── Loss: 5.8361
2025-08-30 10:07:21 - pico-train - INFO - ├── Learning Rate: 1.54e-05
2025-08-30 10:07:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:07:33 - pico-train - INFO - Step 65550 -- 🔄 Training Metrics
2025-08-30 10:07:33 - pico-train - INFO - ├── Loss: 5.8304
2025-08-30 10:07:33 - pico-train - INFO - ├── Learning Rate: 1.54e-05
2025-08-30 10:07:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:07:46 - pico-train - INFO - Step 65575 -- 🔄 Training Metrics
2025-08-30 10:07:46 - pico-train - INFO - ├── Loss: 5.8668
2025-08-30 10:07:46 - pico-train - INFO - ├── Learning Rate: 1.54e-05
2025-08-30 10:07:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:07:59 - pico-train - INFO - Step 65600 -- 🔄 Training Metrics
2025-08-30 10:07:59 - pico-train - INFO - ├── Loss: 5.8797
2025-08-30 10:07:59 - pico-train - INFO - ├── Learning Rate: 1.54e-05
2025-08-30 10:07:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:08:12 - pico-train - INFO - Step 65625 -- 🔄 Training Metrics
2025-08-30 10:08:12 - pico-train - INFO - ├── Loss: 5.8747
2025-08-30 10:08:12 - pico-train - INFO - ├── Learning Rate: 1.53e-05
2025-08-30 10:08:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:08:25 - pico-train - INFO - Step 65650 -- 🔄 Training Metrics
2025-08-30 10:08:25 - pico-train - INFO - ├── Loss: 5.8350
2025-08-30 10:08:25 - pico-train - INFO - ├── Learning Rate: 1.53e-05
2025-08-30 10:08:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:08:37 - pico-train - INFO - Step 65675 -- 🔄 Training Metrics
2025-08-30 10:08:37 - pico-train - INFO - ├── Loss: 5.8606
2025-08-30 10:08:37 - pico-train - INFO - ├── Learning Rate: 1.53e-05
2025-08-30 10:08:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:08:50 - pico-train - INFO - Step 65700 -- 🔄 Training Metrics
2025-08-30 10:08:50 - pico-train - INFO - ├── Loss: 5.8106
2025-08-30 10:08:50 - pico-train - INFO - ├── Learning Rate: 1.53e-05
2025-08-30 10:08:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:09:02 - pico-train - INFO - Step 65725 -- 🔄 Training Metrics
2025-08-30 10:09:02 - pico-train - INFO - ├── Loss: 5.9222
2025-08-30 10:09:02 - pico-train - INFO - ├── Learning Rate: 1.53e-05
2025-08-30 10:09:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:09:15 - pico-train - INFO - Step 65750 -- 🔄 Training Metrics
2025-08-30 10:09:15 - pico-train - INFO - ├── Loss: 5.8246
2025-08-30 10:09:15 - pico-train - INFO - ├── Learning Rate: 1.52e-05
2025-08-30 10:09:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:09:27 - pico-train - INFO - Step 65775 -- 🔄 Training Metrics
2025-08-30 10:09:27 - pico-train - INFO - ├── Loss: 5.8507
2025-08-30 10:09:27 - pico-train - INFO - ├── Learning Rate: 1.52e-05
2025-08-30 10:09:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:09:40 - pico-train - INFO - Step 65800 -- 🔄 Training Metrics
2025-08-30 10:09:40 - pico-train - INFO - ├── Loss: 5.8379
2025-08-30 10:09:40 - pico-train - INFO - ├── Learning Rate: 1.52e-05
2025-08-30 10:09:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:09:53 - pico-train - INFO - Step 65825 -- 🔄 Training Metrics
2025-08-30 10:09:53 - pico-train - INFO - ├── Loss: 5.8610
2025-08-30 10:09:53 - pico-train - INFO - ├── Learning Rate: 1.52e-05
2025-08-30 10:09:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:10:05 - pico-train - INFO - Step 65850 -- 🔄 Training Metrics
2025-08-30 10:10:05 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.8496
2025-08-30 10:10:05 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.52e-05
2025-08-30 10:10:05 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 10:10:18 - pico-train - INFO - Step 65875 -- ๐Ÿ”„ Training Metrics
2025-08-30 10:10:18 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.8066
2025-08-30 10:10:18 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.51e-05
2025-08-30 10:10:18 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 10:10:30 - pico-train - INFO - Step 65900 -- ๐Ÿ”„ Training Metrics
2025-08-30 10:10:30 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.8117
2025-08-30 10:10:30 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.51e-05
2025-08-30 10:10:30 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 10:10:43 - pico-train - INFO - Step 65925 -- ๐Ÿ”„ Training Metrics
2025-08-30 10:10:43 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.7019
2025-08-30 10:10:43 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.51e-05
2025-08-30 10:10:43 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 10:10:56 - pico-train - INFO - Step 65950 -- ๐Ÿ”„ Training Metrics
2025-08-30 10:10:56 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.8699
2025-08-30 10:10:56 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.51e-05
2025-08-30 10:10:56 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 10:11:08 - pico-train - INFO - Step 65975 -- ๐Ÿ”„ Training Metrics
2025-08-30 10:11:08 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.8359
2025-08-30 10:11:08 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 1.51e-05
2025-08-30 10:11:08 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 10:11:20 - pico-train - INFO - Step 66000 -- 💾 Saving Checkpoint
2025-08-30 10:13:16 - pico-train - INFO - Step 66000 -- 📊 Evaluation Results
2025-08-30 10:13:16 - pico-train - INFO - └── paloma: 2.5231045927678714e+31
2025-08-30 10:13:18 - pico-train - INFO - Step 66000 -- 🔄 Training Metrics
2025-08-30 10:13:18 - pico-train - INFO - ├── Loss: 5.8326
2025-08-30 10:13:18 - pico-train - INFO - ├── Learning Rate: 1.50e-05
2025-08-30 10:13:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:13:18 - pico-train - INFO - Step 66000 -- 📈 Saving Learning Dynamics
2025-08-30 10:13:33 - pico-train - INFO - Step 66025 -- 🔄 Training Metrics
2025-08-30 10:13:33 - pico-train - INFO - ├── Loss: 5.7993
2025-08-30 10:13:33 - pico-train - INFO - ├── Learning Rate: 1.50e-05
2025-08-30 10:13:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:13:45 - pico-train - INFO - Step 66050 -- 🔄 Training Metrics
2025-08-30 10:13:45 - pico-train - INFO - ├── Loss: 5.7906
2025-08-30 10:13:45 - pico-train - INFO - ├── Learning Rate: 1.50e-05
2025-08-30 10:13:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:13:58 - pico-train - INFO - Step 66075 -- 🔄 Training Metrics
2025-08-30 10:13:58 - pico-train - INFO - ├── Loss: 5.8668
2025-08-30 10:13:58 - pico-train - INFO - ├── Learning Rate: 1.50e-05
2025-08-30 10:13:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:14:11 - pico-train - INFO - Step 66100 -- 🔄 Training Metrics
2025-08-30 10:14:11 - pico-train - INFO - ├── Loss: 5.7929
2025-08-30 10:14:11 - pico-train - INFO - ├── Learning Rate: 1.50e-05
2025-08-30 10:14:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:14:24 - pico-train - INFO - Step 66125 -- 🔄 Training Metrics
2025-08-30 10:14:24 - pico-train - INFO - ├── Loss: 5.8483
2025-08-30 10:14:24 - pico-train - INFO - ├── Learning Rate: 1.49e-05
2025-08-30 10:14:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:14:37 - pico-train - INFO - Step 66150 -- 🔄 Training Metrics
2025-08-30 10:14:37 - pico-train - INFO - ├── Loss: 5.8747
2025-08-30 10:14:37 - pico-train - INFO - ├── Learning Rate: 1.49e-05
2025-08-30 10:14:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:14:49 - pico-train - INFO - Step 66175 -- 🔄 Training Metrics
2025-08-30 10:14:49 - pico-train - INFO - ├── Loss: 5.7636
2025-08-30 10:14:49 - pico-train - INFO - ├── Learning Rate: 1.49e-05
2025-08-30 10:14:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:15:02 - pico-train - INFO - Step 66200 -- 🔄 Training Metrics
2025-08-30 10:15:02 - pico-train - INFO - ├── Loss: 5.6910
2025-08-30 10:15:02 - pico-train - INFO - ├── Learning Rate: 1.49e-05
2025-08-30 10:15:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:15:14 - pico-train - INFO - Step 66225 -- 🔄 Training Metrics
2025-08-30 10:15:14 - pico-train - INFO - ├── Loss: 5.7696
2025-08-30 10:15:14 - pico-train - INFO - ├── Learning Rate: 1.49e-05
2025-08-30 10:15:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:15:27 - pico-train - INFO - Step 66250 -- 🔄 Training Metrics
2025-08-30 10:15:27 - pico-train - INFO - ├── Loss: 5.8958
2025-08-30 10:15:27 - pico-train - INFO - ├── Learning Rate: 1.48e-05
2025-08-30 10:15:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:15:40 - pico-train - INFO - Step 66275 -- 🔄 Training Metrics
2025-08-30 10:15:40 - pico-train - INFO - ├── Loss: 5.8720
2025-08-30 10:15:40 - pico-train - INFO - ├── Learning Rate: 1.48e-05
2025-08-30 10:15:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:15:53 - pico-train - INFO - Step 66300 -- 🔄 Training Metrics
2025-08-30 10:15:53 - pico-train - INFO - ├── Loss: 5.7927
2025-08-30 10:15:53 - pico-train - INFO - ├── Learning Rate: 1.48e-05
2025-08-30 10:15:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:16:05 - pico-train - INFO - Step 66325 -- 🔄 Training Metrics
2025-08-30 10:16:05 - pico-train - INFO - ├── Loss: 5.7417
2025-08-30 10:16:05 - pico-train - INFO - ├── Learning Rate: 1.48e-05
2025-08-30 10:16:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:16:18 - pico-train - INFO - Step 66350 -- 🔄 Training Metrics
2025-08-30 10:16:18 - pico-train - INFO - ├── Loss: 5.7908
2025-08-30 10:16:18 - pico-train - INFO - ├── Learning Rate: 1.48e-05
2025-08-30 10:16:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:16:31 - pico-train - INFO - Step 66375 -- 🔄 Training Metrics
2025-08-30 10:16:31 - pico-train - INFO - ├── Loss: 5.8609
2025-08-30 10:16:31 - pico-train - INFO - ├── Learning Rate: 1.47e-05
2025-08-30 10:16:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:16:43 - pico-train - INFO - Step 66400 -- 🔄 Training Metrics
2025-08-30 10:16:43 - pico-train - INFO - ├── Loss: 5.7846
2025-08-30 10:16:43 - pico-train - INFO - ├── Learning Rate: 1.47e-05
2025-08-30 10:16:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:16:56 - pico-train - INFO - Step 66425 -- 🔄 Training Metrics
2025-08-30 10:16:56 - pico-train - INFO - ├── Loss: 5.7744
2025-08-30 10:16:56 - pico-train - INFO - ├── Learning Rate: 1.47e-05
2025-08-30 10:16:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:17:09 - pico-train - INFO - Step 66450 -- 🔄 Training Metrics
2025-08-30 10:17:09 - pico-train - INFO - ├── Loss: 5.7639
2025-08-30 10:17:09 - pico-train - INFO - ├── Learning Rate: 1.47e-05
2025-08-30 10:17:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:17:22 - pico-train - INFO - Step 66475 -- 🔄 Training Metrics
2025-08-30 10:17:22 - pico-train - INFO - ├── Loss: 5.8572
2025-08-30 10:17:22 - pico-train - INFO - ├── Learning Rate: 1.47e-05
2025-08-30 10:17:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:17:34 - pico-train - INFO - Step 66500 -- 💾 Saving Checkpoint
2025-08-30 10:19:32 - pico-train - INFO - Step 66500 -- 📊 Evaluation Results
2025-08-30 10:19:32 - pico-train - INFO - └── paloma: 2.557649624835569e+31
2025-08-30 10:19:34 - pico-train - INFO - Step 66500 -- 🔄 Training Metrics
2025-08-30 10:19:34 - pico-train - INFO - ├── Loss: 5.7731
2025-08-30 10:19:34 - pico-train - INFO - ├── Learning Rate: 1.46e-05
2025-08-30 10:19:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:19:34 - pico-train - INFO - Step 66500 -- 📈 Saving Learning Dynamics
2025-08-30 10:19:49 - pico-train - INFO - Step 66525 -- 🔄 Training Metrics
2025-08-30 10:19:49 - pico-train - INFO - ├── Loss: 5.8698
2025-08-30 10:19:49 - pico-train - INFO - ├── Learning Rate: 1.46e-05
2025-08-30 10:19:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:20:01 - pico-train - INFO - Step 66550 -- 🔄 Training Metrics
2025-08-30 10:20:01 - pico-train - INFO - ├── Loss: 5.7763
2025-08-30 10:20:01 - pico-train - INFO - ├── Learning Rate: 1.46e-05
2025-08-30 10:20:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:20:14 - pico-train - INFO - Step 66575 -- 🔄 Training Metrics
2025-08-30 10:20:14 - pico-train - INFO - ├── Loss: 5.7793
2025-08-30 10:20:14 - pico-train - INFO - ├── Learning Rate: 1.46e-05
2025-08-30 10:20:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:20:27 - pico-train - INFO - Step 66600 -- 🔄 Training Metrics
2025-08-30 10:20:27 - pico-train - INFO - ├── Loss: 5.8998
2025-08-30 10:20:27 - pico-train - INFO - ├── Learning Rate: 1.46e-05
2025-08-30 10:20:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:20:40 - pico-train - INFO - Step 66625 -- 🔄 Training Metrics
2025-08-30 10:20:40 - pico-train - INFO - ├── Loss: 5.8772
2025-08-30 10:20:40 - pico-train - INFO - ├── Learning Rate: 1.46e-05
2025-08-30 10:20:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:20:53 - pico-train - INFO - Step 66650 -- 🔄 Training Metrics
2025-08-30 10:20:53 - pico-train - INFO - ├── Loss: 5.7580
2025-08-30 10:20:53 - pico-train - INFO - ├── Learning Rate: 1.45e-05
2025-08-30 10:20:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:21:05 - pico-train - INFO - Step 66675 -- 🔄 Training Metrics
2025-08-30 10:21:05 - pico-train - INFO - ├── Loss: 5.8102
2025-08-30 10:21:05 - pico-train - INFO - ├── Learning Rate: 1.45e-05
2025-08-30 10:21:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:21:18 - pico-train - INFO - Step 66700 -- 🔄 Training Metrics
2025-08-30 10:21:18 - pico-train - INFO - ├── Loss: 5.8321
2025-08-30 10:21:18 - pico-train - INFO - ├── Learning Rate: 1.45e-05
2025-08-30 10:21:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:21:30 - pico-train - INFO - Step 66725 -- 🔄 Training Metrics
2025-08-30 10:21:30 - pico-train - INFO - ├── Loss: 5.7792
2025-08-30 10:21:30 - pico-train - INFO - ├── Learning Rate: 1.45e-05
2025-08-30 10:21:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:21:43 - pico-train - INFO - Step 66750 -- 🔄 Training Metrics
2025-08-30 10:21:43 - pico-train - INFO - ├── Loss: 5.7811
2025-08-30 10:21:43 - pico-train - INFO - ├── Learning Rate: 1.45e-05
2025-08-30 10:21:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:21:56 - pico-train - INFO - Step 66775 -- 🔄 Training Metrics
2025-08-30 10:21:56 - pico-train - INFO - ├── Loss: 5.7789
2025-08-30 10:21:56 - pico-train - INFO - ├── Learning Rate: 1.44e-05
2025-08-30 10:21:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:22:08 - pico-train - INFO - Step 66800 -- 🔄 Training Metrics
2025-08-30 10:22:08 - pico-train - INFO - ├── Loss: 5.7714
2025-08-30 10:22:08 - pico-train - INFO - ├── Learning Rate: 1.44e-05
2025-08-30 10:22:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:22:21 - pico-train - INFO - Step 66825 -- 🔄 Training Metrics
2025-08-30 10:22:21 - pico-train - INFO - ├── Loss: 5.8399
2025-08-30 10:22:21 - pico-train - INFO - ├── Learning Rate: 1.44e-05
2025-08-30 10:22:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:22:34 - pico-train - INFO - Step 66850 -- 🔄 Training Metrics
2025-08-30 10:22:34 - pico-train - INFO - ├── Loss: 5.7693
2025-08-30 10:22:34 - pico-train - INFO - ├── Learning Rate: 1.44e-05
2025-08-30 10:22:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:22:46 - pico-train - INFO - Step 66875 -- 🔄 Training Metrics
2025-08-30 10:22:46 - pico-train - INFO - ├── Loss: 5.8165
2025-08-30 10:22:46 - pico-train - INFO - ├── Learning Rate: 1.44e-05
2025-08-30 10:22:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:22:59 - pico-train - INFO - Step 66900 -- 🔄 Training Metrics
2025-08-30 10:22:59 - pico-train - INFO - ├── Loss: 5.7763
2025-08-30 10:22:59 - pico-train - INFO - ├── Learning Rate: 1.43e-05
2025-08-30 10:22:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:23:11 - pico-train - INFO - Step 66925 -- 🔄 Training Metrics
2025-08-30 10:23:11 - pico-train - INFO - ├── Loss: 5.8683
2025-08-30 10:23:11 - pico-train - INFO - ├── Learning Rate: 1.43e-05
2025-08-30 10:23:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:23:24 - pico-train - INFO - Step 66950 -- 🔄 Training Metrics
2025-08-30 10:23:24 - pico-train - INFO - ├── Loss: 5.8662
2025-08-30 10:23:24 - pico-train - INFO - ├── Learning Rate: 1.43e-05
2025-08-30 10:23:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:23:37 - pico-train - INFO - Step 66975 -- 🔄 Training Metrics
2025-08-30 10:23:37 - pico-train - INFO - ├── Loss: 5.8864
2025-08-30 10:23:37 - pico-train - INFO - ├── Learning Rate: 1.43e-05
2025-08-30 10:23:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:23:49 - pico-train - INFO - Step 67000 -- 💾 Saving Checkpoint
2025-08-30 10:25:56 - pico-train - INFO - Step 67000 -- 📊 Evaluation Results
2025-08-30 10:25:56 - pico-train - INFO - └── paloma: 2.6865032383433974e+31
2025-08-30 10:25:58 - pico-train - INFO - Step 67000 -- 🔄 Training Metrics
2025-08-30 10:25:58 - pico-train - INFO - ├── Loss: 5.7555
2025-08-30 10:25:58 - pico-train - INFO - ├── Learning Rate: 1.43e-05
2025-08-30 10:25:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:25:58 - pico-train - INFO - Step 67000 -- 📈 Saving Learning Dynamics
2025-08-30 10:26:12 - pico-train - INFO - Step 67025 -- 🔄 Training Metrics
2025-08-30 10:26:12 - pico-train - INFO - ├── Loss: 5.8167
2025-08-30 10:26:12 - pico-train - INFO - ├── Learning Rate: 1.42e-05
2025-08-30 10:26:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:26:25 - pico-train - INFO - Step 67050 -- 🔄 Training Metrics
2025-08-30 10:26:25 - pico-train - INFO - ├── Loss: 5.8101
2025-08-30 10:26:25 - pico-train - INFO - ├── Learning Rate: 1.42e-05
2025-08-30 10:26:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:26:38 - pico-train - INFO - Step 67075 -- 🔄 Training Metrics
2025-08-30 10:26:38 - pico-train - INFO - ├── Loss: 5.8146
2025-08-30 10:26:38 - pico-train - INFO - ├── Learning Rate: 1.42e-05
2025-08-30 10:26:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:26:51 - pico-train - INFO - Step 67100 -- 🔄 Training Metrics
2025-08-30 10:26:51 - pico-train - INFO - ├── Loss: 5.9005
2025-08-30 10:26:51 - pico-train - INFO - ├── Learning Rate: 1.42e-05
2025-08-30 10:26:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:27:04 - pico-train - INFO - Step 67125 -- 🔄 Training Metrics
2025-08-30 10:27:04 - pico-train - INFO - ├── Loss: 5.7768
2025-08-30 10:27:04 - pico-train - INFO - ├── Learning Rate: 1.42e-05
2025-08-30 10:27:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:27:16 - pico-train - INFO - Step 67150 -- 🔄 Training Metrics
2025-08-30 10:27:16 - pico-train - INFO - ├── Loss: 5.7152
2025-08-30 10:27:16 - pico-train - INFO - ├── Learning Rate: 1.41e-05
2025-08-30 10:27:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:27:29 - pico-train - INFO - Step 67175 -- 🔄 Training Metrics
2025-08-30 10:27:29 - pico-train - INFO - ├── Loss: 5.8443
2025-08-30 10:27:29 - pico-train - INFO - ├── Learning Rate: 1.41e-05
2025-08-30 10:27:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:27:41 - pico-train - INFO - Step 67200 -- 🔄 Training Metrics
2025-08-30 10:27:41 - pico-train - INFO - ├── Loss: 5.7907
2025-08-30 10:27:41 - pico-train - INFO - ├── Learning Rate: 1.41e-05
2025-08-30 10:27:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:27:54 - pico-train - INFO - Step 67225 -- 🔄 Training Metrics
2025-08-30 10:27:54 - pico-train - INFO - ├── Loss: 5.8160
2025-08-30 10:27:54 - pico-train - INFO - ├── Learning Rate: 1.41e-05
2025-08-30 10:27:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:28:06 - pico-train - INFO - Step 67250 -- 🔄 Training Metrics
2025-08-30 10:28:06 - pico-train - INFO - ├── Loss: 5.8334
2025-08-30 10:28:06 - pico-train - INFO - ├── Learning Rate: 1.41e-05
2025-08-30 10:28:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:28:19 - pico-train - INFO - Step 67275 -- 🔄 Training Metrics
2025-08-30 10:28:19 - pico-train - INFO - ├── Loss: 5.8201
2025-08-30 10:28:19 - pico-train - INFO - ├── Learning Rate: 1.41e-05
2025-08-30 10:28:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:28:32 - pico-train - INFO - Step 67300 -- 🔄 Training Metrics
2025-08-30 10:28:32 - pico-train - INFO - ├── Loss: 5.8962
2025-08-30 10:28:32 - pico-train - INFO - ├── Learning Rate: 1.40e-05
2025-08-30 10:28:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:28:44 - pico-train - INFO - Step 67325 -- 🔄 Training Metrics
2025-08-30 10:28:44 - pico-train - INFO - ├── Loss: 5.7876
2025-08-30 10:28:44 - pico-train - INFO - ├── Learning Rate: 1.40e-05
2025-08-30 10:28:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:28:57 - pico-train - INFO - Step 67350 -- 🔄 Training Metrics
2025-08-30 10:28:57 - pico-train - INFO - ├── Loss: 5.8093
2025-08-30 10:28:57 - pico-train - INFO - ├── Learning Rate: 1.40e-05
2025-08-30 10:28:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:29:09 - pico-train - INFO - Step 67375 -- 🔄 Training Metrics
2025-08-30 10:29:09 - pico-train - INFO - ├── Loss: 5.7282
2025-08-30 10:29:09 - pico-train - INFO - ├── Learning Rate: 1.40e-05
2025-08-30 10:29:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:29:22 - pico-train - INFO - Step 67400 -- 🔄 Training Metrics
2025-08-30 10:29:22 - pico-train - INFO - ├── Loss: 5.7584
2025-08-30 10:29:22 - pico-train - INFO - ├── Learning Rate: 1.40e-05
2025-08-30 10:29:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:29:34 - pico-train - INFO - Step 67425 -- 🔄 Training Metrics
2025-08-30 10:29:34 - pico-train - INFO - ├── Loss: 5.7801
2025-08-30 10:29:34 - pico-train - INFO - ├── Learning Rate: 1.39e-05
2025-08-30 10:29:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:29:47 - pico-train - INFO - Step 67450 -- 🔄 Training Metrics
2025-08-30 10:29:47 - pico-train - INFO - ├── Loss: 5.7262
2025-08-30 10:29:47 - pico-train - INFO - ├── Learning Rate: 1.39e-05
2025-08-30 10:29:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:30:00 - pico-train - INFO - Step 67475 -- 🔄 Training Metrics
2025-08-30 10:30:00 - pico-train - INFO - ├── Loss: 5.7496
2025-08-30 10:30:00 - pico-train - INFO - ├── Learning Rate: 1.39e-05
2025-08-30 10:30:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:30:12 - pico-train - INFO - Step 67500 -- 💾 Saving Checkpoint
2025-08-30 10:32:10 - pico-train - INFO - Step 67500 -- 📊 Evaluation Results
2025-08-30 10:32:10 - pico-train - INFO - └── paloma: 3.1065040652754565e+31
2025-08-30 10:32:13 - pico-train - INFO - Step 67500 -- 🔄 Training Metrics
2025-08-30 10:32:13 - pico-train - INFO - ├── Loss: 5.7965
2025-08-30 10:32:13 - pico-train - INFO - ├── Learning Rate: 1.39e-05
2025-08-30 10:32:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:32:13 - pico-train - INFO - Step 67500 -- 📈 Saving Learning Dynamics
2025-08-30 10:32:27 - pico-train - INFO - Step 67525 -- 🔄 Training Metrics
2025-08-30 10:32:27 - pico-train - INFO - ├── Loss: 5.8326
2025-08-30 10:32:27 - pico-train - INFO - ├── Learning Rate: 1.39e-05
2025-08-30 10:32:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:32:40 - pico-train - INFO - Step 67550 -- 🔄 Training Metrics
2025-08-30 10:32:40 - pico-train - INFO - ├── Loss: 5.8544
2025-08-30 10:32:40 - pico-train - INFO - ├── Learning Rate: 1.38e-05
2025-08-30 10:32:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:32:53 - pico-train - INFO - Step 67575 -- 🔄 Training Metrics
2025-08-30 10:32:53 - pico-train - INFO - ├── Loss: 5.8529
2025-08-30 10:32:53 - pico-train - INFO - ├── Learning Rate: 1.38e-05
2025-08-30 10:32:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:33:06 - pico-train - INFO - Step 67600 -- 🔄 Training Metrics
2025-08-30 10:33:06 - pico-train - INFO - ├── Loss: 5.7630
2025-08-30 10:33:06 - pico-train - INFO - ├── Learning Rate: 1.38e-05
2025-08-30 10:33:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:33:19 - pico-train - INFO - Step 67625 -- 🔄 Training Metrics
2025-08-30 10:33:19 - pico-train - INFO - ├── Loss: 5.8400
2025-08-30 10:33:19 - pico-train - INFO - ├── Learning Rate: 1.38e-05
2025-08-30 10:33:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:33:31 - pico-train - INFO - Step 67650 -- 🔄 Training Metrics
2025-08-30 10:33:31 - pico-train - INFO - ├── Loss: 5.6921
2025-08-30 10:33:31 - pico-train - INFO - ├── Learning Rate: 1.38e-05
2025-08-30 10:33:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:33:44 - pico-train - INFO - Step 67675 -- 🔄 Training Metrics
2025-08-30 10:33:44 - pico-train - INFO - ├── Loss: 5.7714
2025-08-30 10:33:44 - pico-train - INFO - ├── Learning Rate: 1.37e-05
2025-08-30 10:33:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:33:57 - pico-train - INFO - Step 67700 -- 🔄 Training Metrics
2025-08-30 10:33:57 - pico-train - INFO - ├── Loss: 5.8415
2025-08-30 10:33:57 - pico-train - INFO - ├── Learning Rate: 1.37e-05
2025-08-30 10:33:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:34:09 - pico-train - INFO - Step 67725 -- 🔄 Training Metrics
2025-08-30 10:34:09 - pico-train - INFO - ├── Loss: 5.7966
2025-08-30 10:34:09 - pico-train - INFO - ├── Learning Rate: 1.37e-05
2025-08-30 10:34:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:34:22 - pico-train - INFO - Step 67750 -- 🔄 Training Metrics
2025-08-30 10:34:22 - pico-train - INFO - ├── Loss: 5.7681
2025-08-30 10:34:22 - pico-train - INFO - ├── Learning Rate: 1.37e-05
2025-08-30 10:34:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:34:34 - pico-train - INFO - Step 67775 -- 🔄 Training Metrics
2025-08-30 10:34:34 - pico-train - INFO - ├── Loss: 5.8142
2025-08-30 10:34:34 - pico-train - INFO - ├── Learning Rate: 1.37e-05
2025-08-30 10:34:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:34:47 - pico-train - INFO - Step 67800 -- 🔄 Training Metrics
2025-08-30 10:34:47 - pico-train - INFO - ├── Loss: 5.8364
2025-08-30 10:34:47 - pico-train - INFO - ├── Learning Rate: 1.37e-05
2025-08-30 10:34:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:35:00 - pico-train - INFO - Step 67825 -- 🔄 Training Metrics
2025-08-30 10:35:00 - pico-train - INFO - ├── Loss: 5.7471
2025-08-30 10:35:00 - pico-train - INFO - ├── Learning Rate: 1.36e-05
2025-08-30 10:35:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:35:12 - pico-train - INFO - Step 67850 -- 🔄 Training Metrics
2025-08-30 10:35:12 - pico-train - INFO - ├── Loss: 5.7829
2025-08-30 10:35:12 - pico-train - INFO - ├── Learning Rate: 1.36e-05
2025-08-30 10:35:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:35:25 - pico-train - INFO - Step 67875 -- 🔄 Training Metrics
2025-08-30 10:35:25 - pico-train - INFO - ├── Loss: 5.7502
2025-08-30 10:35:25 - pico-train - INFO - ├── Learning Rate: 1.36e-05
2025-08-30 10:35:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:35:38 - pico-train - INFO - Step 67900 -- 🔄 Training Metrics
2025-08-30 10:35:38 - pico-train - INFO - ├── Loss: 5.8291
2025-08-30 10:35:38 - pico-train - INFO - ├── Learning Rate: 1.36e-05
2025-08-30 10:35:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:35:50 - pico-train - INFO - Step 67925 -- 🔄 Training Metrics
2025-08-30 10:35:50 - pico-train - INFO - ├── Loss: 5.8411
2025-08-30 10:35:50 - pico-train - INFO - ├── Learning Rate: 1.36e-05
2025-08-30 10:35:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:36:03 - pico-train - INFO - Step 67950 -- 🔄 Training Metrics
2025-08-30 10:36:03 - pico-train - INFO - ├── Loss: 5.8542
2025-08-30 10:36:03 - pico-train - INFO - ├── Learning Rate: 1.35e-05
2025-08-30 10:36:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:36:16 - pico-train - INFO - Step 67975 -- 🔄 Training Metrics
2025-08-30 10:36:16 - pico-train - INFO - ├── Loss: 5.9065
2025-08-30 10:36:16 - pico-train - INFO - ├── Learning Rate: 1.35e-05
2025-08-30 10:36:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:36:28 - pico-train - INFO - Step 68000 -- 💾 Saving Checkpoint
2025-08-30 10:38:32 - pico-train - INFO - Step 68000 -- 📊 Evaluation Results
2025-08-30 10:38:32 - pico-train - INFO - └── paloma: 3.3702997728095594e+31
2025-08-30 10:38:35 - pico-train - INFO - Step 68000 -- 🔄 Training Metrics
2025-08-30 10:38:35 - pico-train - INFO - ├── Loss: 5.7845
2025-08-30 10:38:35 - pico-train - INFO - ├── Learning Rate: 1.35e-05
2025-08-30 10:38:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:38:35 - pico-train - INFO - Step 68000 -- 📈 Saving Learning Dynamics
2025-08-30 10:38:50 - pico-train - INFO - Step 68025 -- 🔄 Training Metrics
2025-08-30 10:38:50 - pico-train - INFO - ├── Loss: 5.6880
2025-08-30 10:38:50 - pico-train - INFO - ├── Learning Rate: 1.35e-05
2025-08-30 10:38:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:39:02 - pico-train - INFO - Step 68050 -- 🔄 Training Metrics
2025-08-30 10:39:02 - pico-train - INFO - ├── Loss: 5.7669
2025-08-30 10:39:02 - pico-train - INFO - ├── Learning Rate: 1.35e-05
2025-08-30 10:39:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:39:15 - pico-train - INFO - Step 68075 -- 🔄 Training Metrics
2025-08-30 10:39:15 - pico-train - INFO - ├── Loss: 5.7084
2025-08-30 10:39:15 - pico-train - INFO - ├── Learning Rate: 1.34e-05
2025-08-30 10:39:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:39:28 - pico-train - INFO - Step 68100 -- 🔄 Training Metrics
2025-08-30 10:39:28 - pico-train - INFO - ├── Loss: 5.8807
2025-08-30 10:39:28 - pico-train - INFO - ├── Learning Rate: 1.34e-05
2025-08-30 10:39:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:39:41 - pico-train - INFO - Step 68125 -- 🔄 Training Metrics
2025-08-30 10:39:41 - pico-train - INFO - ├── Loss: 5.8497
2025-08-30 10:39:41 - pico-train - INFO - ├── Learning Rate: 1.34e-05
2025-08-30 10:39:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:39:54 - pico-train - INFO - Step 68150 -- 🔄 Training Metrics
2025-08-30 10:39:54 - pico-train - INFO - ├── Loss: 5.7487
2025-08-30 10:39:54 - pico-train - INFO - ├── Learning Rate: 1.34e-05
2025-08-30 10:39:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:40:06 - pico-train - INFO - Step 68175 -- 🔄 Training Metrics
2025-08-30 10:40:06 - pico-train - INFO - ├── Loss: 5.7784
2025-08-30 10:40:06 - pico-train - INFO - ├── Learning Rate: 1.34e-05
2025-08-30 10:40:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:40:19 - pico-train - INFO - Step 68200 -- 🔄 Training Metrics
2025-08-30 10:40:19 - pico-train - INFO - ├── Loss: 5.7622
2025-08-30 10:40:19 - pico-train - INFO - ├── Learning Rate: 1.33e-05
2025-08-30 10:40:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:40:31 - pico-train - INFO - Step 68225 -- 🔄 Training Metrics
2025-08-30 10:40:31 - pico-train - INFO - ├── Loss: 5.7823
2025-08-30 10:40:31 - pico-train - INFO - ├── Learning Rate: 1.33e-05
2025-08-30 10:40:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:40:44 - pico-train - INFO - Step 68250 -- 🔄 Training Metrics
2025-08-30 10:40:44 - pico-train - INFO - ├── Loss: 5.7689
2025-08-30 10:40:44 - pico-train - INFO - ├── Learning Rate: 1.33e-05
2025-08-30 10:40:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:40:57 - pico-train - INFO - Step 68275 -- 🔄 Training Metrics
2025-08-30 10:40:57 - pico-train - INFO - ├── Loss: 5.7719
2025-08-30 10:40:57 - pico-train - INFO - ├── Learning Rate: 1.33e-05
2025-08-30 10:40:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:41:09 - pico-train - INFO - Step 68300 -- 🔄 Training Metrics
2025-08-30 10:41:09 - pico-train - INFO - ├── Loss: 5.7754
2025-08-30 10:41:09 - pico-train - INFO - ├── Learning Rate: 1.33e-05
2025-08-30 10:41:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:41:22 - pico-train - INFO - Step 68325 -- 🔄 Training Metrics
2025-08-30 10:41:22 - pico-train - INFO - ├── Loss: 5.8183
2025-08-30 10:41:22 - pico-train - INFO - ├── Learning Rate: 1.33e-05
2025-08-30 10:41:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:41:34 - pico-train - INFO - Step 68350 -- 🔄 Training Metrics
2025-08-30 10:41:34 - pico-train - INFO - ├── Loss: 5.8116
2025-08-30 10:41:34 - pico-train - INFO - ├── Learning Rate: 1.32e-05
2025-08-30 10:41:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:41:47 - pico-train - INFO - Step 68375 -- 🔄 Training Metrics
2025-08-30 10:41:47 - pico-train - INFO - ├── Loss: 5.6714
2025-08-30 10:41:47 - pico-train - INFO - ├── Learning Rate: 1.32e-05
2025-08-30 10:41:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:42:00 - pico-train - INFO - Step 68400 -- 🔄 Training Metrics
2025-08-30 10:42:00 - pico-train - INFO - ├── Loss: 5.7859
2025-08-30 10:42:00 - pico-train - INFO - ├── Learning Rate: 1.32e-05
2025-08-30 10:42:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:42:12 - pico-train - INFO - Step 68425 -- 🔄 Training Metrics
2025-08-30 10:42:12 - pico-train - INFO - ├── Loss: 5.8268
2025-08-30 10:42:12 - pico-train - INFO - ├── Learning Rate: 1.32e-05
2025-08-30 10:42:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:42:25 - pico-train - INFO - Step 68450 -- 🔄 Training Metrics
2025-08-30 10:42:25 - pico-train - INFO - ├── Loss: 5.8194
2025-08-30 10:42:25 - pico-train - INFO - ├── Learning Rate: 1.32e-05
2025-08-30 10:42:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:42:37 - pico-train - INFO - Step 68475 -- 🔄 Training Metrics
2025-08-30 10:42:37 - pico-train - INFO - ├── Loss: 5.8550
2025-08-30 10:42:37 - pico-train - INFO - ├── Learning Rate: 1.31e-05
2025-08-30 10:42:37 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 10:42:49 - pico-train - INFO - Step 68500 -- 💾 Saving Checkpoint
2025-08-30 10:45:00 - pico-train - INFO - Step 68500 -- 📊 Evaluation Results
2025-08-30 10:45:00 - pico-train - INFO - └── paloma: 3.3728195138741334e+31
2025-08-30 10:45:04 - pico-train - INFO - Step 68500 -- 🔄 Training Metrics
2025-08-30 10:45:04 - pico-train - INFO - ├── Loss: 5.9096
2025-08-30 10:45:04 - pico-train - INFO - ├── Learning Rate: 1.31e-05
2025-08-30 10:45:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:45:04 - pico-train - INFO - Step 68500 -- 📈 Saving Learning Dynamics
2025-08-30 10:45:19 - pico-train - INFO - Step 68525 -- 🔄 Training Metrics
2025-08-30 10:45:19 - pico-train - INFO - ├── Loss: 5.7826
2025-08-30 10:45:19 - pico-train - INFO - ├── Learning Rate: 1.31e-05
2025-08-30 10:45:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:45:32 - pico-train - INFO - Step 68550 -- 🔄 Training Metrics
2025-08-30 10:45:32 - pico-train - INFO - ├── Loss: 5.7860
2025-08-30 10:45:32 - pico-train - INFO - ├── Learning Rate: 1.31e-05
2025-08-30 10:45:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:45:45 - pico-train - INFO - Step 68575 -- 🔄 Training Metrics
2025-08-30 10:45:45 - pico-train - INFO - ├── Loss: 5.7932
2025-08-30 10:45:45 - pico-train - INFO - ├── Learning Rate: 1.31e-05
2025-08-30 10:45:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:45:58 - pico-train - INFO - Step 68600 -- 🔄 Training Metrics
2025-08-30 10:45:58 - pico-train - INFO - ├── Loss: 5.8207
2025-08-30 10:45:58 - pico-train - INFO - ├── Learning Rate: 1.30e-05
2025-08-30 10:45:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:46:10 - pico-train - INFO - Step 68625 -- 🔄 Training Metrics
2025-08-30 10:46:10 - pico-train - INFO - ├── Loss: 5.6706
2025-08-30 10:46:10 - pico-train - INFO - ├── Learning Rate: 1.30e-05
2025-08-30 10:46:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:46:23 - pico-train - INFO - Step 68650 -- 🔄 Training Metrics
2025-08-30 10:46:23 - pico-train - INFO - ├── Loss: 5.7751
2025-08-30 10:46:23 - pico-train - INFO - ├── Learning Rate: 1.30e-05
2025-08-30 10:46:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:46:36 - pico-train - INFO - Step 68675 -- 🔄 Training Metrics
2025-08-30 10:46:36 - pico-train - INFO - ├── Loss: 5.7419
2025-08-30 10:46:36 - pico-train - INFO - ├── Learning Rate: 1.30e-05
2025-08-30 10:46:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:46:48 - pico-train - INFO - Step 68700 -- 🔄 Training Metrics
2025-08-30 10:46:48 - pico-train - INFO - ├── Loss: 5.8879
2025-08-30 10:46:48 - pico-train - INFO - ├── Learning Rate: 1.30e-05
2025-08-30 10:46:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:47:01 - pico-train - INFO - Step 68725 -- 🔄 Training Metrics
2025-08-30 10:47:01 - pico-train - INFO - ├── Loss: 5.8349
2025-08-30 10:47:01 - pico-train - INFO - ├── Learning Rate: 1.30e-05
2025-08-30 10:47:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:47:13 - pico-train - INFO - Step 68750 -- 🔄 Training Metrics
2025-08-30 10:47:13 - pico-train - INFO - ├── Loss: 5.8237
2025-08-30 10:47:13 - pico-train - INFO - ├── Learning Rate: 1.29e-05
2025-08-30 10:47:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:47:26 - pico-train - INFO - Step 68775 -- 🔄 Training Metrics
2025-08-30 10:47:26 - pico-train - INFO - ├── Loss: 5.8724
2025-08-30 10:47:26 - pico-train - INFO - ├── Learning Rate: 1.29e-05
2025-08-30 10:47:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:47:39 - pico-train - INFO - Step 68800 -- 🔄 Training Metrics
2025-08-30 10:47:39 - pico-train - INFO - ├── Loss: 5.7777
2025-08-30 10:47:39 - pico-train - INFO - ├── Learning Rate: 1.29e-05
2025-08-30 10:47:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:47:52 - pico-train - INFO - Step 68825 -- 🔄 Training Metrics
2025-08-30 10:47:52 - pico-train - INFO - ├── Loss: 5.7775
2025-08-30 10:47:52 - pico-train - INFO - ├── Learning Rate: 1.29e-05
2025-08-30 10:47:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:48:04 - pico-train - INFO - Step 68850 -- 🔄 Training Metrics
2025-08-30 10:48:04 - pico-train - INFO - ├── Loss: 5.8112
2025-08-30 10:48:04 - pico-train - INFO - ├── Learning Rate: 1.29e-05
2025-08-30 10:48:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:48:17 - pico-train - INFO - Step 68875 -- 🔄 Training Metrics
2025-08-30 10:48:17 - pico-train - INFO - ├── Loss: 5.7673
2025-08-30 10:48:17 - pico-train - INFO - ├── Learning Rate: 1.28e-05
2025-08-30 10:48:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:48:30 - pico-train - INFO - Step 68900 -- 🔄 Training Metrics
2025-08-30 10:48:30 - pico-train - INFO - ├── Loss: 5.7477
2025-08-30 10:48:30 - pico-train - INFO - ├── Learning Rate: 1.28e-05
2025-08-30 10:48:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:48:43 - pico-train - INFO - Step 68925 -- 🔄 Training Metrics
2025-08-30 10:48:43 - pico-train - INFO - ├── Loss: 5.8516
2025-08-30 10:48:43 - pico-train - INFO - ├── Learning Rate: 1.28e-05
2025-08-30 10:48:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:48:56 - pico-train - INFO - Step 68950 -- 🔄 Training Metrics
2025-08-30 10:48:56 - pico-train - INFO - ├── Loss: 5.7671
2025-08-30 10:48:56 - pico-train - INFO - ├── Learning Rate: 1.28e-05
2025-08-30 10:48:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:49:08 - pico-train - INFO - Step 68975 -- 🔄 Training Metrics
2025-08-30 10:49:08 - pico-train - INFO - ├── Loss: 5.8476
2025-08-30 10:49:08 - pico-train - INFO - ├── Learning Rate: 1.28e-05
2025-08-30 10:49:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:49:20 - pico-train - INFO - Step 69000 -- 💾 Saving Checkpoint
2025-08-30 10:51:18 - pico-train - INFO - Step 69000 -- 📊 Evaluation Results
2025-08-30 10:51:18 - pico-train - INFO - └── paloma: 4.015441614691927e+31
2025-08-30 10:51:22 - pico-train - INFO - Step 69000 -- 🔄 Training Metrics
2025-08-30 10:51:22 - pico-train - INFO - ├── Loss: 5.7945
2025-08-30 10:51:22 - pico-train - INFO - ├── Learning Rate: 1.27e-05
2025-08-30 10:51:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:51:22 - pico-train - INFO - Step 69000 -- 📈 Saving Learning Dynamics
2025-08-30 10:51:38 - pico-train - INFO - Step 69025 -- 🔄 Training Metrics
2025-08-30 10:51:38 - pico-train - INFO - ├── Loss: 5.7222
2025-08-30 10:51:38 - pico-train - INFO - ├── Learning Rate: 1.27e-05
2025-08-30 10:51:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:51:50 - pico-train - INFO - Step 69050 -- 🔄 Training Metrics
2025-08-30 10:51:50 - pico-train - INFO - ├── Loss: 5.8469
2025-08-30 10:51:50 - pico-train - INFO - ├── Learning Rate: 1.27e-05
2025-08-30 10:51:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:52:03 - pico-train - INFO - Step 69075 -- 🔄 Training Metrics
2025-08-30 10:52:03 - pico-train - INFO - ├── Loss: 5.7888
2025-08-30 10:52:03 - pico-train - INFO - ├── Learning Rate: 1.27e-05
2025-08-30 10:52:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:52:16 - pico-train - INFO - Step 69100 -- 🔄 Training Metrics
2025-08-30 10:52:16 - pico-train - INFO - ├── Loss: 5.8239
2025-08-30 10:52:16 - pico-train - INFO - ├── Learning Rate: 1.27e-05
2025-08-30 10:52:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:52:29 - pico-train - INFO - Step 69125 -- 🔄 Training Metrics
2025-08-30 10:52:29 - pico-train - INFO - ├── Loss: 5.8123
2025-08-30 10:52:29 - pico-train - INFO - ├── Learning Rate: 1.27e-05
2025-08-30 10:52:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:52:42 - pico-train - INFO - Step 69150 -- 🔄 Training Metrics
2025-08-30 10:52:42 - pico-train - INFO - ├── Loss: 5.8655
2025-08-30 10:52:42 - pico-train - INFO - ├── Learning Rate: 1.26e-05
2025-08-30 10:52:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:52:54 - pico-train - INFO - Step 69175 -- 🔄 Training Metrics
2025-08-30 10:52:54 - pico-train - INFO - ├── Loss: 5.8294
2025-08-30 10:52:54 - pico-train - INFO - ├── Learning Rate: 1.26e-05
2025-08-30 10:52:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:53:07 - pico-train - INFO - Step 69200 -- 🔄 Training Metrics
2025-08-30 10:53:07 - pico-train - INFO - ├── Loss: 5.8492
2025-08-30 10:53:07 - pico-train - INFO - ├── Learning Rate: 1.26e-05
2025-08-30 10:53:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:53:19 - pico-train - INFO - Step 69225 -- 🔄 Training Metrics
2025-08-30 10:53:19 - pico-train - INFO - ├── Loss: 5.8203
2025-08-30 10:53:19 - pico-train - INFO - ├── Learning Rate: 1.26e-05
2025-08-30 10:53:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:53:32 - pico-train - INFO - Step 69250 -- 🔄 Training Metrics
2025-08-30 10:53:32 - pico-train - INFO - ├── Loss: 5.8163
2025-08-30 10:53:32 - pico-train - INFO - ├── Learning Rate: 1.26e-05
2025-08-30 10:53:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:53:45 - pico-train - INFO - Step 69275 -- 🔄 Training Metrics
2025-08-30 10:53:45 - pico-train - INFO - ├── Loss: 5.8982
2025-08-30 10:53:45 - pico-train - INFO - ├── Learning Rate: 1.25e-05
2025-08-30 10:53:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:53:57 - pico-train - INFO - Step 69300 -- 🔄 Training Metrics
2025-08-30 10:53:57 - pico-train - INFO - ├── Loss: 5.7549
2025-08-30 10:53:57 - pico-train - INFO - ├── Learning Rate: 1.25e-05
2025-08-30 10:53:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:54:10 - pico-train - INFO - Step 69325 -- 🔄 Training Metrics
2025-08-30 10:54:10 - pico-train - INFO - ├── Loss: 5.8212
2025-08-30 10:54:10 - pico-train - INFO - ├── Learning Rate: 1.25e-05
2025-08-30 10:54:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:54:23 - pico-train - INFO - Step 69350 -- 🔄 Training Metrics
2025-08-30 10:54:23 - pico-train - INFO - ├── Loss: 5.8512
2025-08-30 10:54:23 - pico-train - INFO - ├── Learning Rate: 1.25e-05
2025-08-30 10:54:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:54:35 - pico-train - INFO - Step 69375 -- 🔄 Training Metrics
2025-08-30 10:54:35 - pico-train - INFO - ├── Loss: 5.8506
2025-08-30 10:54:35 - pico-train - INFO - ├── Learning Rate: 1.25e-05
2025-08-30 10:54:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:54:48 - pico-train - INFO - Step 69400 -- 🔄 Training Metrics
2025-08-30 10:54:48 - pico-train - INFO - ├── Loss: 5.7973
2025-08-30 10:54:48 - pico-train - INFO - ├── Learning Rate: 1.25e-05
2025-08-30 10:54:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:55:00 - pico-train - INFO - Step 69425 -- 🔄 Training Metrics
2025-08-30 10:55:00 - pico-train - INFO - ├── Loss: 5.8587
2025-08-30 10:55:00 - pico-train - INFO - ├── Learning Rate: 1.24e-05
2025-08-30 10:55:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:55:13 - pico-train - INFO - Step 69450 -- 🔄 Training Metrics
2025-08-30 10:55:13 - pico-train - INFO - ├── Loss: 5.7108
2025-08-30 10:55:13 - pico-train - INFO - ├── Learning Rate: 1.24e-05
2025-08-30 10:55:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:55:26 - pico-train - INFO - Step 69475 -- 🔄 Training Metrics
2025-08-30 10:55:26 - pico-train - INFO - ├── Loss: 5.7860
2025-08-30 10:55:26 - pico-train - INFO - ├── Learning Rate: 1.24e-05
2025-08-30 10:55:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:55:38 - pico-train - INFO - Step 69500 -- 💾 Saving Checkpoint
2025-08-30 10:57:37 - pico-train - INFO - Step 69500 -- 📊 Evaluation Results
2025-08-30 10:57:37 - pico-train - INFO - └── paloma: 4.498437349495611e+31
2025-08-30 10:57:40 - pico-train - INFO - Step 69500 -- 🔄 Training Metrics
2025-08-30 10:57:40 - pico-train - INFO - ├── Loss: 5.8497
2025-08-30 10:57:40 - pico-train - INFO - ├── Learning Rate: 1.24e-05
2025-08-30 10:57:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:57:40 - pico-train - INFO - Step 69500 -- 📈 Saving Learning Dynamics
2025-08-30 10:57:55 - pico-train - INFO - Step 69525 -- 🔄 Training Metrics
2025-08-30 10:57:55 - pico-train - INFO - ├── Loss: 5.8320
2025-08-30 10:57:55 - pico-train - INFO - ├── Learning Rate: 1.24e-05
2025-08-30 10:57:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:58:08 - pico-train - INFO - Step 69550 -- 🔄 Training Metrics
2025-08-30 10:58:08 - pico-train - INFO - ├── Loss: 5.7277
2025-08-30 10:58:08 - pico-train - INFO - ├── Learning Rate: 1.23e-05
2025-08-30 10:58:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:58:20 - pico-train - INFO - Step 69575 -- 🔄 Training Metrics
2025-08-30 10:58:20 - pico-train - INFO - ├── Loss: 5.8119
2025-08-30 10:58:20 - pico-train - INFO - ├── Learning Rate: 1.23e-05
2025-08-30 10:58:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:58:34 - pico-train - INFO - Step 69600 -- 🔄 Training Metrics
2025-08-30 10:58:34 - pico-train - INFO - ├── Loss: 5.8142
2025-08-30 10:58:34 - pico-train - INFO - ├── Learning Rate: 1.23e-05
2025-08-30 10:58:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:58:47 - pico-train - INFO - Step 69625 -- 🔄 Training Metrics
2025-08-30 10:58:47 - pico-train - INFO - ├── Loss: 5.8271
2025-08-30 10:58:47 - pico-train - INFO - ├── Learning Rate: 1.23e-05
2025-08-30 10:58:47 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:58:59 - pico-train - INFO - Step 69650 -- 🔄 Training Metrics
2025-08-30 10:58:59 - pico-train - INFO - ├── Loss: 5.7488
2025-08-30 10:58:59 - pico-train - INFO - ├── Learning Rate: 1.23e-05
2025-08-30 10:58:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:59:12 - pico-train - INFO - Step 69675 -- 🔄 Training Metrics
2025-08-30 10:59:12 - pico-train - INFO - ├── Loss: 5.8036
2025-08-30 10:59:12 - pico-train - INFO - ├── Learning Rate: 1.22e-05
2025-08-30 10:59:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:59:25 - pico-train - INFO - Step 69700 -- 🔄 Training Metrics
2025-08-30 10:59:25 - pico-train - INFO - ├── Loss: 5.8718
2025-08-30 10:59:25 - pico-train - INFO - ├── Learning Rate: 1.22e-05
2025-08-30 10:59:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:59:37 - pico-train - INFO - Step 69725 -- 🔄 Training Metrics
2025-08-30 10:59:37 - pico-train - INFO - ├── Loss: 5.7624
2025-08-30 10:59:37 - pico-train - INFO - ├── Learning Rate: 1.22e-05
2025-08-30 10:59:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 10:59:50 - pico-train - INFO - Step 69750 -- 🔄 Training Metrics
2025-08-30 10:59:50 - pico-train - INFO - ├── Loss: 5.7221
2025-08-30 10:59:50 - pico-train - INFO - ├── Learning Rate: 1.22e-05
2025-08-30 10:59:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:00:03 - pico-train - INFO - Step 69775 -- 🔄 Training Metrics
2025-08-30 11:00:03 - pico-train - INFO - ├── Loss: 5.8421
2025-08-30 11:00:03 - pico-train - INFO - ├── Learning Rate: 1.22e-05
2025-08-30 11:00:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:00:16 - pico-train - INFO - Step 69800 -- 🔄 Training Metrics
2025-08-30 11:00:16 - pico-train - INFO - ├── Loss: 5.8152
2025-08-30 11:00:16 - pico-train - INFO - ├── Learning Rate: 1.22e-05
2025-08-30 11:00:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:00:28 - pico-train - INFO - Step 69825 -- 🔄 Training Metrics
2025-08-30 11:00:28 - pico-train - INFO - ├── Loss: 5.8357
2025-08-30 11:00:28 - pico-train - INFO - ├── Learning Rate: 1.21e-05
2025-08-30 11:00:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:00:41 - pico-train - INFO - Step 69850 -- 🔄 Training Metrics
2025-08-30 11:00:41 - pico-train - INFO - ├── Loss: 5.8124
2025-08-30 11:00:41 - pico-train - INFO - ├── Learning Rate: 1.21e-05
2025-08-30 11:00:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:00:53 - pico-train - INFO - Step 69875 -- 🔄 Training Metrics
2025-08-30 11:00:53 - pico-train - INFO - ├── Loss: 5.8160
2025-08-30 11:00:53 - pico-train - INFO - ├── Learning Rate: 1.21e-05
2025-08-30 11:00:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:01:06 - pico-train - INFO - Step 69900 -- 🔄 Training Metrics
2025-08-30 11:01:06 - pico-train - INFO - ├── Loss: 5.7780
2025-08-30 11:01:06 - pico-train - INFO - ├── Learning Rate: 1.21e-05
2025-08-30 11:01:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:01:19 - pico-train - INFO - Step 69925 -- 🔄 Training Metrics
2025-08-30 11:01:19 - pico-train - INFO - ├── Loss: 5.7680
2025-08-30 11:01:19 - pico-train - INFO - ├── Learning Rate: 1.21e-05
2025-08-30 11:01:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:01:31 - pico-train - INFO - Step 69950 -- 🔄 Training Metrics
2025-08-30 11:01:31 - pico-train - INFO - ├── Loss: 5.7678
2025-08-30 11:01:31 - pico-train - INFO - ├── Learning Rate: 1.20e-05
2025-08-30 11:01:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:01:44 - pico-train - INFO - Step 69975 -- 🔄 Training Metrics
2025-08-30 11:01:44 - pico-train - INFO - ├── Loss: 5.7694
2025-08-30 11:01:44 - pico-train - INFO - ├── Learning Rate: 1.20e-05
2025-08-30 11:01:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:01:56 - pico-train - INFO - Step 70000 -- 💾 Saving Checkpoint
2025-08-30 11:03:54 - pico-train - INFO - Step 70000 -- 📊 Evaluation Results
2025-08-30 11:03:54 - pico-train - INFO - └── paloma: 4.524086501230947e+31
2025-08-30 11:03:58 - pico-train - INFO - Step 70000 -- 🔄 Training Metrics
2025-08-30 11:03:58 - pico-train - INFO - ├── Loss: 5.7691
2025-08-30 11:03:58 - pico-train - INFO - ├── Learning Rate: 1.20e-05
2025-08-30 11:03:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:03:58 - pico-train - INFO - Step 70000 -- 📈 Saving Learning Dynamics
2025-08-30 11:04:13 - pico-train - INFO - Step 70025 -- 🔄 Training Metrics
2025-08-30 11:04:13 - pico-train - INFO - ├── Loss: 5.8459
2025-08-30 11:04:13 - pico-train - INFO - ├── Learning Rate: 1.20e-05
2025-08-30 11:04:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:04:26 - pico-train - INFO - Step 70050 -- 🔄 Training Metrics
2025-08-30 11:04:26 - pico-train - INFO - ├── Loss: 5.7648
2025-08-30 11:04:26 - pico-train - INFO - ├── Learning Rate: 1.20e-05
2025-08-30 11:04:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:04:38 - pico-train - INFO - Step 70075 -- 🔄 Training Metrics
2025-08-30 11:04:38 - pico-train - INFO - ├── Loss: 5.9146
2025-08-30 11:04:38 - pico-train - INFO - ├── Learning Rate: 1.20e-05
2025-08-30 11:04:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:04:51 - pico-train - INFO - Step 70100 -- 🔄 Training Metrics
2025-08-30 11:04:51 - pico-train - INFO - ├── Loss: 5.8547
2025-08-30 11:04:51 - pico-train - INFO - ├── Learning Rate: 1.19e-05
2025-08-30 11:04:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:05:04 - pico-train - INFO - Step 70125 -- 🔄 Training Metrics
2025-08-30 11:05:04 - pico-train - INFO - ├── Loss: 5.7720
2025-08-30 11:05:04 - pico-train - INFO - ├── Learning Rate: 1.19e-05
2025-08-30 11:05:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:05:17 - pico-train - INFO - Step 70150 -- 🔄 Training Metrics
2025-08-30 11:05:17 - pico-train - INFO - ├── Loss: 5.7761
2025-08-30 11:05:17 - pico-train - INFO - ├── Learning Rate: 1.19e-05
2025-08-30 11:05:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:05:29 - pico-train - INFO - Step 70175 -- 🔄 Training Metrics
2025-08-30 11:05:29 - pico-train - INFO - ├── Loss: 5.7980
2025-08-30 11:05:29 - pico-train - INFO - ├── Learning Rate: 1.19e-05
2025-08-30 11:05:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:05:42 - pico-train - INFO - Step 70200 -- 🔄 Training Metrics
2025-08-30 11:05:42 - pico-train - INFO - ├── Loss: 5.7824
2025-08-30 11:05:42 - pico-train - INFO - ├── Learning Rate: 1.19e-05
2025-08-30 11:05:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:05:55 - pico-train - INFO - Step 70225 -- 🔄 Training Metrics
2025-08-30 11:05:55 - pico-train - INFO - ├── Loss: 5.8025
2025-08-30 11:05:55 - pico-train - INFO - ├── Learning Rate: 1.18e-05
2025-08-30 11:05:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:06:07 - pico-train - INFO - Step 70250 -- 🔄 Training Metrics
2025-08-30 11:06:07 - pico-train - INFO - ├── Loss: 5.8501
2025-08-30 11:06:07 - pico-train - INFO - ├── Learning Rate: 1.18e-05
2025-08-30 11:06:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:06:20 - pico-train - INFO - Step 70275 -- 🔄 Training Metrics
2025-08-30 11:06:20 - pico-train - INFO - ├── Loss: 5.7877
2025-08-30 11:06:20 - pico-train - INFO - ├── Learning Rate: 1.18e-05
2025-08-30 11:06:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:06:32 - pico-train - INFO - Step 70300 -- 🔄 Training Metrics
2025-08-30 11:06:32 - pico-train - INFO - ├── Loss: 5.7537
2025-08-30 11:06:32 - pico-train - INFO - ├── Learning Rate: 1.18e-05
2025-08-30 11:06:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:06:45 - pico-train - INFO - Step 70325 -- 🔄 Training Metrics
2025-08-30 11:06:45 - pico-train - INFO - ├── Loss: 5.8530
2025-08-30 11:06:45 - pico-train - INFO - ├── Learning Rate: 1.18e-05
2025-08-30 11:06:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:06:58 - pico-train - INFO - Step 70350 -- 🔄 Training Metrics
2025-08-30 11:06:58 - pico-train - INFO - ├── Loss: 5.6919
2025-08-30 11:06:58 - pico-train - INFO - ├── Learning Rate: 1.18e-05
2025-08-30 11:06:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:07:10 - pico-train - INFO - Step 70375 -- 🔄 Training Metrics
2025-08-30 11:07:10 - pico-train - INFO - ├── Loss: 5.7595
2025-08-30 11:07:10 - pico-train - INFO - ├── Learning Rate: 1.17e-05
2025-08-30 11:07:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:07:23 - pico-train - INFO - Step 70400 -- 🔄 Training Metrics
2025-08-30 11:07:23 - pico-train - INFO - ├── Loss: 5.7637
2025-08-30 11:07:23 - pico-train - INFO - ├── Learning Rate: 1.17e-05
2025-08-30 11:07:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:07:36 - pico-train - INFO - Step 70425 -- 🔄 Training Metrics
2025-08-30 11:07:36 - pico-train - INFO - ├── Loss: 5.8013
2025-08-30 11:07:36 - pico-train - INFO - ├── Learning Rate: 1.17e-05
2025-08-30 11:07:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:07:48 - pico-train - INFO - Step 70450 -- 🔄 Training Metrics
2025-08-30 11:07:48 - pico-train - INFO - ├── Loss: 5.8487
2025-08-30 11:07:48 - pico-train - INFO - ├── Learning Rate: 1.17e-05
2025-08-30 11:07:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:08:01 - pico-train - INFO - Step 70475 -- 🔄 Training Metrics
2025-08-30 11:08:01 - pico-train - INFO - ├── Loss: 5.7931
2025-08-30 11:08:01 - pico-train - INFO - ├── Learning Rate: 1.17e-05
2025-08-30 11:08:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:08:13 - pico-train - INFO - Step 70500 -- 💾 Saving Checkpoint
2025-08-30 11:10:14 - pico-train - INFO - Step 70500 -- 📊 Evaluation Results
2025-08-30 11:10:14 - pico-train - INFO - └── paloma: 5.389143520871013e+31
2025-08-30 11:10:18 - pico-train - INFO - Step 70500 -- 🔄 Training Metrics
2025-08-30 11:10:18 - pico-train - INFO - ├── Loss: 5.8130
2025-08-30 11:10:18 - pico-train - INFO - ├── Learning Rate: 1.16e-05
2025-08-30 11:10:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:10:18 - pico-train - INFO - Step 70500 -- 📈 Saving Learning Dynamics
2025-08-30 11:10:34 - pico-train - INFO - Step 70525 -- 🔄 Training Metrics
2025-08-30 11:10:34 - pico-train - INFO - ├── Loss: 5.8003
2025-08-30 11:10:34 - pico-train - INFO - ├── Learning Rate: 1.16e-05
2025-08-30 11:10:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:10:46 - pico-train - INFO - Step 70550 -- 🔄 Training Metrics
2025-08-30 11:10:46 - pico-train - INFO - ├── Loss: 5.7638
2025-08-30 11:10:46 - pico-train - INFO - ├── Learning Rate: 1.16e-05
2025-08-30 11:10:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:10:59 - pico-train - INFO - Step 70575 -- 🔄 Training Metrics
2025-08-30 11:10:59 - pico-train - INFO - ├── Loss: 5.8081
2025-08-30 11:10:59 - pico-train - INFO - ├── Learning Rate: 1.16e-05
2025-08-30 11:10:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:11:12 - pico-train - INFO - Step 70600 -- 🔄 Training Metrics
2025-08-30 11:11:12 - pico-train - INFO - ├── Loss: 5.8433
2025-08-30 11:11:12 - pico-train - INFO - ├── Learning Rate: 1.16e-05
2025-08-30 11:11:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:11:24 - pico-train - INFO - Step 70625 -- 🔄 Training Metrics
2025-08-30 11:11:24 - pico-train - INFO - ├── Loss: 5.7845
2025-08-30 11:11:24 - pico-train - INFO - ├── Learning Rate: 1.16e-05
2025-08-30 11:11:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:11:37 - pico-train - INFO - Step 70650 -- 🔄 Training Metrics
2025-08-30 11:11:37 - pico-train - INFO - ├── Loss: 5.7766
2025-08-30 11:11:37 - pico-train - INFO - ├── Learning Rate: 1.15e-05
2025-08-30 11:11:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:11:50 - pico-train - INFO - Step 70675 -- 🔄 Training Metrics
2025-08-30 11:11:50 - pico-train - INFO - ├── Loss: 5.8443
2025-08-30 11:11:50 - pico-train - INFO - ├── Learning Rate: 1.15e-05
2025-08-30 11:11:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:12:02 - pico-train - INFO - Step 70700 -- 🔄 Training Metrics
2025-08-30 11:12:02 - pico-train - INFO - ├── Loss: 5.8557
2025-08-30 11:12:02 - pico-train - INFO - ├── Learning Rate: 1.15e-05
2025-08-30 11:12:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:12:16 - pico-train - INFO - Step 70725 -- 🔄 Training Metrics
2025-08-30 11:12:16 - pico-train - INFO - ├── Loss: 5.7753
2025-08-30 11:12:16 - pico-train - INFO - ├── Learning Rate: 1.15e-05
2025-08-30 11:12:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:12:28 - pico-train - INFO - Step 70750 -- 🔄 Training Metrics
2025-08-30 11:12:28 - pico-train - INFO - ├── Loss: 5.7036
2025-08-30 11:12:28 - pico-train - INFO - ├── Learning Rate: 1.15e-05
2025-08-30 11:12:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:12:41 - pico-train - INFO - Step 70775 -- 🔄 Training Metrics
2025-08-30 11:12:41 - pico-train - INFO - ├── Loss: 5.8355
2025-08-30 11:12:41 - pico-train - INFO - ├── Learning Rate: 1.14e-05
2025-08-30 11:12:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:12:54 - pico-train - INFO - Step 70800 -- 🔄 Training Metrics
2025-08-30 11:12:54 - pico-train - INFO - ├── Loss: 5.7925
2025-08-30 11:12:54 - pico-train - INFO - ├── Learning Rate: 1.14e-05
2025-08-30 11:12:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:13:06 - pico-train - INFO - Step 70825 -- 🔄 Training Metrics
2025-08-30 11:13:06 - pico-train - INFO - ├── Loss: 5.7594
2025-08-30 11:13:06 - pico-train - INFO - ├── Learning Rate: 1.14e-05
2025-08-30 11:13:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:13:19 - pico-train - INFO - Step 70850 -- 🔄 Training Metrics
2025-08-30 11:13:19 - pico-train - INFO - ├── Loss: 5.7899
2025-08-30 11:13:19 - pico-train - INFO - ├── Learning Rate: 1.14e-05
2025-08-30 11:13:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:13:31 - pico-train - INFO - Step 70875 -- 🔄 Training Metrics
2025-08-30 11:13:31 - pico-train - INFO - ├── Loss: 5.8210
2025-08-30 11:13:31 - pico-train - INFO - ├── Learning Rate: 1.14e-05
2025-08-30 11:13:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:13:44 - pico-train - INFO - Step 70900 -- 🔄 Training Metrics
2025-08-30 11:13:44 - pico-train - INFO - ├── Loss: 5.7877
2025-08-30 11:13:44 - pico-train - INFO - ├── Learning Rate: 1.14e-05
2025-08-30 11:13:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:13:57 - pico-train - INFO - Step 70925 -- 🔄 Training Metrics
2025-08-30 11:13:57 - pico-train - INFO - ├── Loss: 5.8528
2025-08-30 11:13:57 - pico-train - INFO - ├── Learning Rate: 1.13e-05
2025-08-30 11:13:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:14:09 - pico-train - INFO - Step 70950 -- 🔄 Training Metrics
2025-08-30 11:14:09 - pico-train - INFO - ├── Loss: 5.7071
2025-08-30 11:14:09 - pico-train - INFO - ├── Learning Rate: 1.13e-05
2025-08-30 11:14:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:14:22 - pico-train - INFO - Step 70975 -- 🔄 Training Metrics
2025-08-30 11:14:22 - pico-train - INFO - ├── Loss: 5.7500
2025-08-30 11:14:22 - pico-train - INFO - ├── Learning Rate: 1.13e-05
2025-08-30 11:14:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:14:34 - pico-train - INFO - Step 71000 -- 💾 Saving Checkpoint
2025-08-30 11:16:34 - pico-train - INFO - Step 71000 -- 📊 Evaluation Results
2025-08-30 11:16:34 - pico-train - INFO - └── paloma: 6.106796255447029e+31
2025-08-30 11:16:38 - pico-train - INFO - Step 71000 -- 🔄 Training Metrics
2025-08-30 11:16:38 - pico-train - INFO - ├── Loss: 5.8512
2025-08-30 11:16:38 - pico-train - INFO - ├── Learning Rate: 1.13e-05
2025-08-30 11:16:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:16:38 - pico-train - INFO - Step 71000 -- 📈 Saving Learning Dynamics
2025-08-30 11:16:53 - pico-train - INFO - Step 71025 -- 🔄 Training Metrics
2025-08-30 11:16:53 - pico-train - INFO - ├── Loss: 5.7849
2025-08-30 11:16:53 - pico-train - INFO - ├── Learning Rate: 1.13e-05
2025-08-30 11:16:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:17:06 - pico-train - INFO - Step 71050 -- 🔄 Training Metrics
2025-08-30 11:17:06 - pico-train - INFO - ├── Loss: 5.7794
2025-08-30 11:17:06 - pico-train - INFO - ├── Learning Rate: 1.13e-05
2025-08-30 11:17:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:17:19 - pico-train - INFO - Step 71075 -- 🔄 Training Metrics
2025-08-30 11:17:19 - pico-train - INFO - ├── Loss: 5.8584
2025-08-30 11:17:19 - pico-train - INFO - ├── Learning Rate: 1.12e-05
2025-08-30 11:17:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:17:32 - pico-train - INFO - Step 71100 -- 🔄 Training Metrics
2025-08-30 11:17:32 - pico-train - INFO - ├── Loss: 5.7866
2025-08-30 11:17:32 - pico-train - INFO - ├── Learning Rate: 1.12e-05
2025-08-30 11:17:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:17:44 - pico-train - INFO - Step 71125 -- 🔄 Training Metrics
2025-08-30 11:17:44 - pico-train - INFO - ├── Loss: 5.7744
2025-08-30 11:17:44 - pico-train - INFO - ├── Learning Rate: 1.12e-05
2025-08-30 11:17:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:17:57 - pico-train - INFO - Step 71150 -- 🔄 Training Metrics
2025-08-30 11:17:57 - pico-train - INFO - ├── Loss: 5.8179
2025-08-30 11:17:57 - pico-train - INFO - ├── Learning Rate: 1.12e-05
2025-08-30 11:17:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:18:09 - pico-train - INFO - Step 71175 -- 🔄 Training Metrics
2025-08-30 11:18:09 - pico-train - INFO - ├── Loss: 5.8349
2025-08-30 11:18:09 - pico-train - INFO - ├── Learning Rate: 1.12e-05
2025-08-30 11:18:09 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:18:22 - pico-train - INFO - Step 71200 -- 🔄 Training Metrics
2025-08-30 11:18:22 - pico-train - INFO - ├── Loss: 5.7446
2025-08-30 11:18:22 - pico-train - INFO - ├── Learning Rate: 1.11e-05
2025-08-30 11:18:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:18:35 - pico-train - INFO - Step 71225 -- 🔄 Training Metrics
2025-08-30 11:18:35 - pico-train - INFO - ├── Loss: 5.8961
2025-08-30 11:18:35 - pico-train - INFO - ├── Learning Rate: 1.11e-05
2025-08-30 11:18:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:18:48 - pico-train - INFO - Step 71250 -- 🔄 Training Metrics
2025-08-30 11:18:48 - pico-train - INFO - ├── Loss: 5.7719
2025-08-30 11:18:48 - pico-train - INFO - ├── Learning Rate: 1.11e-05
2025-08-30 11:18:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:19:01 - pico-train - INFO - Step 71275 -- 🔄 Training Metrics
2025-08-30 11:19:01 - pico-train - INFO - ├── Loss: 5.7171
2025-08-30 11:19:01 - pico-train - INFO - ├── Learning Rate: 1.11e-05
2025-08-30 11:19:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:19:13 - pico-train - INFO - Step 71300 -- 🔄 Training Metrics
2025-08-30 11:19:13 - pico-train - INFO - ├── Loss: 5.7381
2025-08-30 11:19:13 - pico-train - INFO - ├── Learning Rate: 1.11e-05
2025-08-30 11:19:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:19:26 - pico-train - INFO - Step 71325 -- 🔄 Training Metrics
2025-08-30 11:19:26 - pico-train - INFO - ├── Loss: 5.7906
2025-08-30 11:19:26 - pico-train - INFO - ├── Learning Rate: 1.11e-05
2025-08-30 11:19:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:19:39 - pico-train - INFO - Step 71350 -- 🔄 Training Metrics
2025-08-30 11:19:39 - pico-train - INFO - ├── Loss: 5.9247
2025-08-30 11:19:39 - pico-train - INFO - ├── Learning Rate: 1.10e-05
2025-08-30 11:19:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:19:51 - pico-train - INFO - Step 71375 -- 🔄 Training Metrics
2025-08-30 11:19:51 - pico-train - INFO - ├── Loss: 5.8136
2025-08-30 11:19:51 - pico-train - INFO - ├── Learning Rate: 1.10e-05
2025-08-30 11:19:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:20:04 - pico-train - INFO - Step 71400 -- 🔄 Training Metrics
2025-08-30 11:20:04 - pico-train - INFO - ├── Loss: 5.7196
2025-08-30 11:20:04 - pico-train - INFO - ├── Learning Rate: 1.10e-05
2025-08-30 11:20:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:20:16 - pico-train - INFO - Step 71425 -- 🔄 Training Metrics
2025-08-30 11:20:16 - pico-train - INFO - ├── Loss: 5.7807
2025-08-30 11:20:16 - pico-train - INFO - ├── Learning Rate: 1.10e-05
2025-08-30 11:20:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:20:29 - pico-train - INFO - Step 71450 -- 🔄 Training Metrics
2025-08-30 11:20:29 - pico-train - INFO - ├── Loss: 5.8609
2025-08-30 11:20:29 - pico-train - INFO - ├── Learning Rate: 1.10e-05
2025-08-30 11:20:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:20:42 - pico-train - INFO - Step 71475 -- 🔄 Training Metrics
2025-08-30 11:20:42 - pico-train - INFO - ├── Loss: 5.7683
2025-08-30 11:20:42 - pico-train - INFO - ├── Learning Rate: 1.10e-05
2025-08-30 11:20:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:20:54 - pico-train - INFO - Step 71500 -- 💾 Saving Checkpoint
2025-08-30 11:22:50 - pico-train - INFO - Step 71500 -- 📊 Evaluation Results
2025-08-30 11:22:50 - pico-train - INFO - └── paloma: 6.282048257805562e+31
2025-08-30 11:22:53 - pico-train - INFO - Step 71500 -- 🔄 Training Metrics
2025-08-30 11:22:53 - pico-train - INFO - ├── Loss: 5.8034
2025-08-30 11:22:53 - pico-train - INFO - ├── Learning Rate: 1.09e-05
2025-08-30 11:22:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:22:53 - pico-train - INFO - Step 71500 -- 📈 Saving Learning Dynamics
2025-08-30 11:23:08 - pico-train - INFO - Step 71525 -- 🔄 Training Metrics
2025-08-30 11:23:08 - pico-train - INFO - ├── Loss: 5.7923
2025-08-30 11:23:08 - pico-train - INFO - ├── Learning Rate: 1.09e-05
2025-08-30 11:23:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:23:21 - pico-train - INFO - Step 71550 -- 🔄 Training Metrics
2025-08-30 11:23:21 - pico-train - INFO - ├── Loss: 5.8365
2025-08-30 11:23:21 - pico-train - INFO - ├── Learning Rate: 1.09e-05
2025-08-30 11:23:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:23:34 - pico-train - INFO - Step 71575 -- 🔄 Training Metrics
2025-08-30 11:23:34 - pico-train - INFO - ├── Loss: 5.7924
2025-08-30 11:23:34 - pico-train - INFO - ├── Learning Rate: 1.09e-05
2025-08-30 11:23:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:23:46 - pico-train - INFO - Step 71600 -- 🔄 Training Metrics
2025-08-30 11:23:46 - pico-train - INFO - ├── Loss: 5.8132
2025-08-30 11:23:46 - pico-train - INFO - ├── Learning Rate: 1.09e-05
2025-08-30 11:23:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:23:59 - pico-train - INFO - Step 71625 -- 🔄 Training Metrics
2025-08-30 11:23:59 - pico-train - INFO - ├── Loss: 5.8109
2025-08-30 11:23:59 - pico-train - INFO - ├── Learning Rate: 1.08e-05
2025-08-30 11:23:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:24:11 - pico-train - INFO - Step 71650 -- 🔄 Training Metrics
2025-08-30 11:24:11 - pico-train - INFO - ├── Loss: 5.8357
2025-08-30 11:24:11 - pico-train - INFO - ├── Learning Rate: 1.08e-05
2025-08-30 11:24:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:24:24 - pico-train - INFO - Step 71675 -- 🔄 Training Metrics
2025-08-30 11:24:24 - pico-train - INFO - ├── Loss: 5.8117
2025-08-30 11:24:24 - pico-train - INFO - ├── Learning Rate: 1.08e-05
2025-08-30 11:24:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:24:37 - pico-train - INFO - Step 71700 -- 🔄 Training Metrics
2025-08-30 11:24:37 - pico-train - INFO - ├── Loss: 5.6849
2025-08-30 11:24:37 - pico-train - INFO - ├── Learning Rate: 1.08e-05
2025-08-30 11:24:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:24:50 - pico-train - INFO - Step 71725 -- 🔄 Training Metrics
2025-08-30 11:24:50 - pico-train - INFO - ├── Loss: 5.8316
2025-08-30 11:24:50 - pico-train - INFO - ├── Learning Rate: 1.08e-05
2025-08-30 11:24:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:25:03 - pico-train - INFO - Step 71750 -- 🔄 Training Metrics
2025-08-30 11:25:03 - pico-train - INFO - ├── Loss: 5.8852
2025-08-30 11:25:03 - pico-train - INFO - ├── Learning Rate: 1.08e-05
2025-08-30 11:25:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:25:15 - pico-train - INFO - Step 71775 -- 🔄 Training Metrics
2025-08-30 11:25:15 - pico-train - INFO - ├── Loss: 5.7825
2025-08-30 11:25:15 - pico-train - INFO - ├── Learning Rate: 1.07e-05
2025-08-30 11:25:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:25:28 - pico-train - INFO - Step 71800 -- 🔄 Training Metrics
2025-08-30 11:25:28 - pico-train - INFO - ├── Loss: 5.8405
2025-08-30 11:25:28 - pico-train - INFO - ├── Learning Rate: 1.07e-05
2025-08-30 11:25:28 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:25:41 - pico-train - INFO - Step 71825 -- 🔄 Training Metrics
2025-08-30 11:25:41 - pico-train - INFO - ├── Loss: 5.7973
2025-08-30 11:25:41 - pico-train - INFO - ├── Learning Rate: 1.07e-05
2025-08-30 11:25:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:25:53 - pico-train - INFO - Step 71850 -- 🔄 Training Metrics
2025-08-30 11:25:53 - pico-train - INFO - ├── Loss: 5.8016
2025-08-30 11:25:53 - pico-train - INFO - ├── Learning Rate: 1.07e-05
2025-08-30 11:25:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:26:06 - pico-train - INFO - Step 71875 -- 🔄 Training Metrics
2025-08-30 11:26:06 - pico-train - INFO - ├── Loss: 5.6851
2025-08-30 11:26:06 - pico-train - INFO - ├── Learning Rate: 1.07e-05
2025-08-30 11:26:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:26:19 - pico-train - INFO - Step 71900 -- 🔄 Training Metrics
2025-08-30 11:26:19 - pico-train - INFO - ├── Loss: 5.7568
2025-08-30 11:26:19 - pico-train - INFO - ├── Learning Rate: 1.07e-05
2025-08-30 11:26:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:26:31 - pico-train - INFO - Step 71925 -- 🔄 Training Metrics
2025-08-30 11:26:31 - pico-train - INFO - ├── Loss: 5.7542
2025-08-30 11:26:31 - pico-train - INFO - ├── Learning Rate: 1.06e-05
2025-08-30 11:26:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:26:44 - pico-train - INFO - Step 71950 -- 🔄 Training Metrics
2025-08-30 11:26:44 - pico-train - INFO - ├── Loss: 5.6807
2025-08-30 11:26:44 - pico-train - INFO - ├── Learning Rate: 1.06e-05
2025-08-30 11:26:44 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:26:56 - pico-train - INFO - Step 71975 -- 🔄 Training Metrics
2025-08-30 11:26:56 - pico-train - INFO - ├── Loss: 5.7309
2025-08-30 11:26:56 - pico-train - INFO - ├── Learning Rate: 1.06e-05
2025-08-30 11:26:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:27:09 - pico-train - INFO - Step 72000 -- 💾 Saving Checkpoint
2025-08-30 11:29:13 - pico-train - INFO - Step 72000 -- 📊 Evaluation Results
2025-08-30 11:29:13 - pico-train - INFO - └── paloma: 6.442465619967253e+31
2025-08-30 11:29:15 - pico-train - INFO - Step 72000 -- 🔄 Training Metrics
2025-08-30 11:29:15 - pico-train - INFO - ├── Loss: 5.7989
2025-08-30 11:29:15 - pico-train - INFO - ├── Learning Rate: 1.06e-05
2025-08-30 11:29:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:29:15 - pico-train - INFO - Step 72000 -- 📈 Saving Learning Dynamics
2025-08-30 11:29:30 - pico-train - INFO - Step 72025 -- 🔄 Training Metrics
2025-08-30 11:29:30 - pico-train - INFO - ├── Loss: 5.7701
2025-08-30 11:29:30 - pico-train - INFO - ├── Learning Rate: 1.06e-05
2025-08-30 11:29:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:29:42 - pico-train - INFO - Step 72050 -- 🔄 Training Metrics
2025-08-30 11:29:42 - pico-train - INFO - ├── Loss: 5.7553
2025-08-30 11:29:42 - pico-train - INFO - ├── Learning Rate: 1.05e-05
2025-08-30 11:29:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:29:55 - pico-train - INFO - Step 72075 -- 🔄 Training Metrics
2025-08-30 11:29:55 - pico-train - INFO - ├── Loss: 5.6550
2025-08-30 11:29:55 - pico-train - INFO - ├── Learning Rate: 1.05e-05
2025-08-30 11:29:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:30:08 - pico-train - INFO - Step 72100 -- 🔄 Training Metrics
2025-08-30 11:30:08 - pico-train - INFO - ├── Loss: 5.7120
2025-08-30 11:30:08 - pico-train - INFO - ├── Learning Rate: 1.05e-05
2025-08-30 11:30:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:30:20 - pico-train - INFO - Step 72125 -- 🔄 Training Metrics
2025-08-30 11:30:20 - pico-train - INFO - ├── Loss: 5.8457
2025-08-30 11:30:20 - pico-train - INFO - ├── Learning Rate: 1.05e-05
2025-08-30 11:30:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:30:33 - pico-train - INFO - Step 72150 -- 🔄 Training Metrics
2025-08-30 11:30:33 - pico-train - INFO - ├── Loss: 5.7710
2025-08-30 11:30:33 - pico-train - INFO - ├── Learning Rate: 1.05e-05
2025-08-30 11:30:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:30:46 - pico-train - INFO - Step 72175 -- 🔄 Training Metrics
2025-08-30 11:30:46 - pico-train - INFO - ├── Loss: 5.8311
2025-08-30 11:30:46 - pico-train - INFO - ├── Learning Rate: 1.05e-05
2025-08-30 11:30:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:30:58 - pico-train - INFO - Step 72200 -- 🔄 Training Metrics
2025-08-30 11:30:58 - pico-train - INFO - ├── Loss: 5.8419
2025-08-30 11:30:58 - pico-train - INFO - ├── Learning Rate: 1.04e-05
2025-08-30 11:30:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:31:12 - pico-train - INFO - Step 72225 -- 🔄 Training Metrics
2025-08-30 11:31:12 - pico-train - INFO - ├── Loss: 5.7954
2025-08-30 11:31:12 - pico-train - INFO - ├── Learning Rate: 1.04e-05
2025-08-30 11:31:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:31:24 - pico-train - INFO - Step 72250 -- 🔄 Training Metrics
2025-08-30 11:31:24 - pico-train - INFO - ├── Loss: 5.7894
2025-08-30 11:31:24 - pico-train - INFO - ├── Learning Rate: 1.04e-05
2025-08-30 11:31:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:31:37 - pico-train - INFO - Step 72275 -- 🔄 Training Metrics
2025-08-30 11:31:37 - pico-train - INFO - ├── Loss: 5.7746
2025-08-30 11:31:37 - pico-train - INFO - ├── Learning Rate: 1.04e-05
2025-08-30 11:31:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:31:50 - pico-train - INFO - Step 72300 -- 🔄 Training Metrics
2025-08-30 11:31:50 - pico-train - INFO - ├── Loss: 5.9178
2025-08-30 11:31:50 - pico-train - INFO - ├── Learning Rate: 1.04e-05
2025-08-30 11:31:50 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:32:02 - pico-train - INFO - Step 72325 -- 🔄 Training Metrics
2025-08-30 11:32:02 - pico-train - INFO - ├── Loss: 5.8326
2025-08-30 11:32:02 - pico-train - INFO - ├── Learning Rate: 1.04e-05
2025-08-30 11:32:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:32:15 - pico-train - INFO - Step 72350 -- 🔄 Training Metrics
2025-08-30 11:32:15 - pico-train - INFO - ├── Loss: 5.8099
2025-08-30 11:32:15 - pico-train - INFO - ├── Learning Rate: 1.03e-05
2025-08-30 11:32:15 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:32:27 - pico-train - INFO - Step 72375 -- 🔄 Training Metrics
2025-08-30 11:32:27 - pico-train - INFO - ├── Loss: 5.7497
2025-08-30 11:32:27 - pico-train - INFO - ├── Learning Rate: 1.03e-05
2025-08-30 11:32:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:32:40 - pico-train - INFO - Step 72400 -- 🔄 Training Metrics
2025-08-30 11:32:40 - pico-train - INFO - ├── Loss: 5.7700
2025-08-30 11:32:40 - pico-train - INFO - ├── Learning Rate: 1.03e-05
2025-08-30 11:32:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:32:53 - pico-train - INFO - Step 72425 -- 🔄 Training Metrics
2025-08-30 11:32:53 - pico-train - INFO - ├── Loss: 5.8295
2025-08-30 11:32:53 - pico-train - INFO - ├── Learning Rate: 1.03e-05
2025-08-30 11:32:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:33:06 - pico-train - INFO - Step 72450 -- 🔄 Training Metrics
2025-08-30 11:33:06 - pico-train - INFO - ├── Loss: 5.7635
2025-08-30 11:33:06 - pico-train - INFO - ├── Learning Rate: 1.03e-05
2025-08-30 11:33:06 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:33:18 - pico-train - INFO - Step 72475 -- 🔄 Training Metrics
2025-08-30 11:33:18 - pico-train - INFO - ├── Loss: 5.7644
2025-08-30 11:33:18 - pico-train - INFO - ├── Learning Rate: 1.03e-05
2025-08-30 11:33:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:33:31 - pico-train - INFO - Step 72500 -- 💾 Saving Checkpoint
2025-08-30 11:35:40 - pico-train - INFO - Step 72500 -- 📊 Evaluation Results
2025-08-30 11:35:40 - pico-train - INFO - └── paloma: 7.433151564209409e+31
2025-08-30 11:35:42 - pico-train - INFO - Step 72500 -- 🔄 Training Metrics
2025-08-30 11:35:42 - pico-train - INFO - ├── Loss: 5.7986
2025-08-30 11:35:42 - pico-train - INFO - ├── Learning Rate: 1.02e-05
2025-08-30 11:35:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:35:42 - pico-train - INFO - Step 72500 -- 📈 Saving Learning Dynamics
2025-08-30 11:35:57 - pico-train - INFO - Step 72525 -- 🔄 Training Metrics
2025-08-30 11:35:57 - pico-train - INFO - ├── Loss: 5.8320
2025-08-30 11:35:57 - pico-train - INFO - ├── Learning Rate: 1.02e-05
2025-08-30 11:35:57 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:36:10 - pico-train - INFO - Step 72550 -- 🔄 Training Metrics
2025-08-30 11:36:10 - pico-train - INFO - ├── Loss: 5.7602
2025-08-30 11:36:10 - pico-train - INFO - ├── Learning Rate: 1.02e-05
2025-08-30 11:36:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:36:22 - pico-train - INFO - Step 72575 -- 🔄 Training Metrics
2025-08-30 11:36:22 - pico-train - INFO - ├── Loss: 5.7627
2025-08-30 11:36:22 - pico-train - INFO - ├── Learning Rate: 1.02e-05
2025-08-30 11:36:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:36:35 - pico-train - INFO - Step 72600 -- 🔄 Training Metrics
2025-08-30 11:36:35 - pico-train - INFO - ├── Loss: 5.7779
2025-08-30 11:36:35 - pico-train - INFO - ├── Learning Rate: 1.02e-05
2025-08-30 11:36:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:36:48 - pico-train - INFO - Step 72625 -- 🔄 Training Metrics
2025-08-30 11:36:48 - pico-train - INFO - ├── Loss: 5.8076
2025-08-30 11:36:48 - pico-train - INFO - ├── Learning Rate: 1.02e-05
2025-08-30 11:36:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:37:00 - pico-train - INFO - Step 72650 -- 🔄 Training Metrics
2025-08-30 11:37:00 - pico-train - INFO - ├── Loss: 5.8050
2025-08-30 11:37:00 - pico-train - INFO - ├── Learning Rate: 1.01e-05
2025-08-30 11:37:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:37:13 - pico-train - INFO - Step 72675 -- 🔄 Training Metrics
2025-08-30 11:37:13 - pico-train - INFO - ├── Loss: 5.8470
2025-08-30 11:37:13 - pico-train - INFO - ├── Learning Rate: 1.01e-05
2025-08-30 11:37:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:37:26 - pico-train - INFO - Step 72700 -- 🔄 Training Metrics
2025-08-30 11:37:26 - pico-train - INFO - ├── Loss: 5.7896
2025-08-30 11:37:26 - pico-train - INFO - ├── Learning Rate: 1.01e-05
2025-08-30 11:37:26 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:37:39 - pico-train - INFO - Step 72725 -- 🔄 Training Metrics
2025-08-30 11:37:39 - pico-train - INFO - ├── Loss: 5.7821
2025-08-30 11:37:39 - pico-train - INFO - ├── Learning Rate: 1.01e-05
2025-08-30 11:37:39 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:37:52 - pico-train - INFO - Step 72750 -- 🔄 Training Metrics
2025-08-30 11:37:52 - pico-train - INFO - ├── Loss: 5.7733
2025-08-30 11:37:52 - pico-train - INFO - ├── Learning Rate: 1.01e-05
2025-08-30 11:37:52 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:38:04 - pico-train - INFO - Step 72775 -- 🔄 Training Metrics
2025-08-30 11:38:04 - pico-train - INFO - ├── Loss: 5.8627
2025-08-30 11:38:04 - pico-train - INFO - ├── Learning Rate: 1.00e-05
2025-08-30 11:38:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:38:17 - pico-train - INFO - Step 72800 -- 🔄 Training Metrics
2025-08-30 11:38:17 - pico-train - INFO - ├── Loss: 5.8219
2025-08-30 11:38:17 - pico-train - INFO - ├── Learning Rate: 1.00e-05
2025-08-30 11:38:17 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:38:30 - pico-train - INFO - Step 72825 -- 🔄 Training Metrics
2025-08-30 11:38:30 - pico-train - INFO - ├── Loss: 5.8448
2025-08-30 11:38:30 - pico-train - INFO - ├── Learning Rate: 1.00e-05
2025-08-30 11:38:30 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:38:42 - pico-train - INFO - Step 72850 -- 🔄 Training Metrics
2025-08-30 11:38:42 - pico-train - INFO - ├── Loss: 5.7459
2025-08-30 11:38:42 - pico-train - INFO - ├── Learning Rate: 1.00e-05
2025-08-30 11:38:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:38:55 - pico-train - INFO - Step 72875 -- 🔄 Training Metrics
2025-08-30 11:38:55 - pico-train - INFO - ├── Loss: 5.8400
2025-08-30 11:38:55 - pico-train - INFO - ├── Learning Rate: 9.98e-06
2025-08-30 11:38:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:39:07 - pico-train - INFO - Step 72900 -- 🔄 Training Metrics
2025-08-30 11:39:07 - pico-train - INFO - ├── Loss: 5.7810
2025-08-30 11:39:07 - pico-train - INFO - ├── Learning Rate: 9.96e-06
2025-08-30 11:39:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:39:20 - pico-train - INFO - Step 72925 -- 🔄 Training Metrics
2025-08-30 11:39:20 - pico-train - INFO - ├── Loss: 5.8001
2025-08-30 11:39:20 - pico-train - INFO - ├── Learning Rate: 9.95e-06
2025-08-30 11:39:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:39:33 - pico-train - INFO - Step 72950 -- 🔄 Training Metrics
2025-08-30 11:39:33 - pico-train - INFO - ├── Loss: 5.8616
2025-08-30 11:39:33 - pico-train - INFO - ├── Learning Rate: 9.93e-06
2025-08-30 11:39:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:39:45 - pico-train - INFO - Step 72975 -- 🔄 Training Metrics
2025-08-30 11:39:45 - pico-train - INFO - ├── Loss: 5.8884
2025-08-30 11:39:45 - pico-train - INFO - ├── Learning Rate: 9.91e-06
2025-08-30 11:39:45 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:39:57 - pico-train - INFO - Step 73000 -- 💾 Saving Checkpoint
2025-08-30 11:41:54 - pico-train - INFO - Step 73000 -- 📊 Evaluation Results
2025-08-30 11:41:54 - pico-train - INFO - └── paloma: 8.156828131388013e+31
2025-08-30 11:41:56 - pico-train - INFO - Step 73000 -- 🔄 Training Metrics
2025-08-30 11:41:56 - pico-train - INFO - ├── Loss: 5.7843
2025-08-30 11:41:56 - pico-train - INFO - ├── Learning Rate: 9.89e-06
2025-08-30 11:41:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:41:56 - pico-train - INFO - Step 73000 -- 📈 Saving Learning Dynamics
2025-08-30 11:42:11 - pico-train - INFO - Step 73025 -- 🔄 Training Metrics
2025-08-30 11:42:11 - pico-train - INFO - ├── Loss: 5.7129
2025-08-30 11:42:11 - pico-train - INFO - ├── Learning Rate: 9.88e-06
2025-08-30 11:42:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:42:24 - pico-train - INFO - Step 73050 -- 🔄 Training Metrics
2025-08-30 11:42:24 - pico-train - INFO - ├── Loss: 5.8605
2025-08-30 11:42:24 - pico-train - INFO - ├── Learning Rate: 9.86e-06
2025-08-30 11:42:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:42:37 - pico-train - INFO - Step 73075 -- 🔄 Training Metrics
2025-08-30 11:42:37 - pico-train - INFO - ├── Loss: 5.8538
2025-08-30 11:42:37 - pico-train - INFO - ├── Learning Rate: 9.84e-06
2025-08-30 11:42:37 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:42:49 - pico-train - INFO - Step 73100 -- 🔄 Training Metrics
2025-08-30 11:42:49 - pico-train - INFO - ├── Loss: 5.8061
2025-08-30 11:42:49 - pico-train - INFO - ├── Learning Rate: 9.83e-06
2025-08-30 11:42:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:43:02 - pico-train - INFO - Step 73125 -- 🔄 Training Metrics
2025-08-30 11:43:02 - pico-train - INFO - ├── Loss: 5.6467
2025-08-30 11:43:02 - pico-train - INFO - ├── Learning Rate: 9.81e-06
2025-08-30 11:43:02 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:43:14 - pico-train - INFO - Step 73150 -- 🔄 Training Metrics
2025-08-30 11:43:14 - pico-train - INFO - ├── Loss: 5.7946
2025-08-30 11:43:14 - pico-train - INFO - ├── Learning Rate: 9.79e-06
2025-08-30 11:43:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:43:27 - pico-train - INFO - Step 73175 -- 🔄 Training Metrics
2025-08-30 11:43:27 - pico-train - INFO - ├── Loss: 5.7591
2025-08-30 11:43:27 - pico-train - INFO - ├── Learning Rate: 9.78e-06
2025-08-30 11:43:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:43:40 - pico-train - INFO - Step 73200 -- 🔄 Training Metrics
2025-08-30 11:43:40 - pico-train - INFO - ├── Loss: 5.7435
2025-08-30 11:43:40 - pico-train - INFO - ├── Learning Rate: 9.76e-06
2025-08-30 11:43:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:43:55 - pico-train - INFO - Step 73225 -- 🔄 Training Metrics
2025-08-30 11:43:55 - pico-train - INFO - ├── Loss: 5.7541
2025-08-30 11:43:55 - pico-train - INFO - ├── Learning Rate: 9.74e-06
2025-08-30 11:43:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:44:08 - pico-train - INFO - Step 73250 -- 🔄 Training Metrics
2025-08-30 11:44:08 - pico-train - INFO - ├── Loss: 5.8107
2025-08-30 11:44:08 - pico-train - INFO - ├── Learning Rate: 9.72e-06
2025-08-30 11:44:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:44:20 - pico-train - INFO - Step 73275 -- 🔄 Training Metrics
2025-08-30 11:44:20 - pico-train - INFO - ├── Loss: 5.7636
2025-08-30 11:44:20 - pico-train - INFO - ├── Learning Rate: 9.71e-06
2025-08-30 11:44:20 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:44:33 - pico-train - INFO - Step 73300 -- 🔄 Training Metrics
2025-08-30 11:44:33 - pico-train - INFO - ├── Loss: 5.7746
2025-08-30 11:44:33 - pico-train - INFO - ├── Learning Rate: 9.69e-06
2025-08-30 11:44:33 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:44:46 - pico-train - INFO - Step 73325 -- 🔄 Training Metrics
2025-08-30 11:44:46 - pico-train - INFO - ├── Loss: 5.8366
2025-08-30 11:44:46 - pico-train - INFO - ├── Learning Rate: 9.67e-06
2025-08-30 11:44:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:44:58 - pico-train - INFO - Step 73350 -- 🔄 Training Metrics
2025-08-30 11:44:58 - pico-train - INFO - ├── Loss: 5.8148
2025-08-30 11:44:58 - pico-train - INFO - ├── Learning Rate: 9.66e-06
2025-08-30 11:44:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:45:11 - pico-train - INFO - Step 73375 -- 🔄 Training Metrics
2025-08-30 11:45:11 - pico-train - INFO - ├── Loss: 5.8216
2025-08-30 11:45:11 - pico-train - INFO - ├── Learning Rate: 9.64e-06
2025-08-30 11:45:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:45:24 - pico-train - INFO - Step 73400 -- 🔄 Training Metrics
2025-08-30 11:45:24 - pico-train - INFO - ├── Loss: 5.8380
2025-08-30 11:45:24 - pico-train - INFO - ├── Learning Rate: 9.62e-06
2025-08-30 11:45:24 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:45:36 - pico-train - INFO - Step 73425 -- 🔄 Training Metrics
2025-08-30 11:45:36 - pico-train - INFO - ├── Loss: 5.7821
2025-08-30 11:45:36 - pico-train - INFO - ├── Learning Rate: 9.61e-06
2025-08-30 11:45:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:45:49 - pico-train - INFO - Step 73450 -- 🔄 Training Metrics
2025-08-30 11:45:49 - pico-train - INFO - ├── Loss: 5.7886
2025-08-30 11:45:49 - pico-train - INFO - ├── Learning Rate: 9.59e-06
2025-08-30 11:45:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:46:01 - pico-train - INFO - Step 73475 -- 🔄 Training Metrics
2025-08-30 11:46:01 - pico-train - INFO - ├── Loss: 5.7748
2025-08-30 11:46:01 - pico-train - INFO - ├── Learning Rate: 9.57e-06
2025-08-30 11:46:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:46:13 - pico-train - INFO - Step 73500 -- ๐Ÿ’พ Saving Checkpoint
2025-08-30 11:48:25 - pico-train - INFO - Step 73500 -- ๐Ÿ“Š Evaluation Results
2025-08-30 11:48:25 - pico-train - INFO - โ””โ”€โ”€ paloma: 9.704589730778985e+31
2025-08-30 11:48:27 - pico-train - INFO - Step 73500 -- ๐Ÿ”„ Training Metrics
2025-08-30 11:48:27 - pico-train - INFO - โ”œโ”€โ”€ Loss: 5.6891
2025-08-30 11:48:27 - pico-train - INFO - โ”œโ”€โ”€ Learning Rate: 9.56e-06
2025-08-30 11:48:27 - pico-train - INFO - โ””โ”€โ”€ Inf/NaN count: 0
2025-08-30 11:48:27 - pico-train - INFO - Step 73500 -- ๐Ÿ“ˆ Saving Learning Dynamics
2025-08-30 11:48:43 - pico-train - INFO - Step 73525 -- 🔄 Training Metrics
2025-08-30 11:48:43 - pico-train - INFO - ├── Loss: 5.7314
2025-08-30 11:48:43 - pico-train - INFO - ├── Learning Rate: 9.54e-06
2025-08-30 11:48:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:48:56 - pico-train - INFO - Step 73550 -- 🔄 Training Metrics
2025-08-30 11:48:56 - pico-train - INFO - ├── Loss: 5.6260
2025-08-30 11:48:56 - pico-train - INFO - ├── Learning Rate: 9.52e-06
2025-08-30 11:48:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:49:08 - pico-train - INFO - Step 73575 -- 🔄 Training Metrics
2025-08-30 11:49:08 - pico-train - INFO - ├── Loss: 5.7770
2025-08-30 11:49:08 - pico-train - INFO - ├── Learning Rate: 9.51e-06
2025-08-30 11:49:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:49:21 - pico-train - INFO - Step 73600 -- 🔄 Training Metrics
2025-08-30 11:49:21 - pico-train - INFO - ├── Loss: 5.7806
2025-08-30 11:49:21 - pico-train - INFO - ├── Learning Rate: 9.49e-06
2025-08-30 11:49:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:49:34 - pico-train - INFO - Step 73625 -- 🔄 Training Metrics
2025-08-30 11:49:34 - pico-train - INFO - ├── Loss: 5.7744
2025-08-30 11:49:34 - pico-train - INFO - ├── Learning Rate: 9.47e-06
2025-08-30 11:49:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:49:46 - pico-train - INFO - Step 73650 -- 🔄 Training Metrics
2025-08-30 11:49:46 - pico-train - INFO - ├── Loss: 5.7460
2025-08-30 11:49:46 - pico-train - INFO - ├── Learning Rate: 9.46e-06
2025-08-30 11:49:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:49:59 - pico-train - INFO - Step 73675 -- 🔄 Training Metrics
2025-08-30 11:49:59 - pico-train - INFO - ├── Loss: 5.8272
2025-08-30 11:49:59 - pico-train - INFO - ├── Learning Rate: 9.44e-06
2025-08-30 11:49:59 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:50:12 - pico-train - INFO - Step 73700 -- 🔄 Training Metrics
2025-08-30 11:50:12 - pico-train - INFO - ├── Loss: 5.7866
2025-08-30 11:50:12 - pico-train - INFO - ├── Learning Rate: 9.42e-06
2025-08-30 11:50:12 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:50:25 - pico-train - INFO - Step 73725 -- 🔄 Training Metrics
2025-08-30 11:50:25 - pico-train - INFO - ├── Loss: 5.7838
2025-08-30 11:50:25 - pico-train - INFO - ├── Learning Rate: 9.41e-06
2025-08-30 11:50:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:50:38 - pico-train - INFO - Step 73750 -- 🔄 Training Metrics
2025-08-30 11:50:38 - pico-train - INFO - ├── Loss: 5.6949
2025-08-30 11:50:38 - pico-train - INFO - ├── Learning Rate: 9.39e-06
2025-08-30 11:50:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:50:51 - pico-train - INFO - Step 73775 -- 🔄 Training Metrics
2025-08-30 11:50:51 - pico-train - INFO - ├── Loss: 5.7301
2025-08-30 11:50:51 - pico-train - INFO - ├── Learning Rate: 9.37e-06
2025-08-30 11:50:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:51:03 - pico-train - INFO - Step 73800 -- 🔄 Training Metrics
2025-08-30 11:51:03 - pico-train - INFO - ├── Loss: 5.7987
2025-08-30 11:51:03 - pico-train - INFO - ├── Learning Rate: 9.36e-06
2025-08-30 11:51:03 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:51:16 - pico-train - INFO - Step 73825 -- 🔄 Training Metrics
2025-08-30 11:51:16 - pico-train - INFO - ├── Loss: 5.8495
2025-08-30 11:51:16 - pico-train - INFO - ├── Learning Rate: 9.34e-06
2025-08-30 11:51:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:51:29 - pico-train - INFO - Step 73850 -- 🔄 Training Metrics
2025-08-30 11:51:29 - pico-train - INFO - ├── Loss: 5.7411
2025-08-30 11:51:29 - pico-train - INFO - ├── Learning Rate: 9.32e-06
2025-08-30 11:51:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:51:41 - pico-train - INFO - Step 73875 -- 🔄 Training Metrics
2025-08-30 11:51:41 - pico-train - INFO - ├── Loss: 5.7792
2025-08-30 11:51:41 - pico-train - INFO - ├── Learning Rate: 9.31e-06
2025-08-30 11:51:41 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:51:54 - pico-train - INFO - Step 73900 -- 🔄 Training Metrics
2025-08-30 11:51:54 - pico-train - INFO - ├── Loss: 5.8225
2025-08-30 11:51:54 - pico-train - INFO - ├── Learning Rate: 9.29e-06
2025-08-30 11:51:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:52:07 - pico-train - INFO - Step 73925 -- 🔄 Training Metrics
2025-08-30 11:52:07 - pico-train - INFO - ├── Loss: 5.7823
2025-08-30 11:52:07 - pico-train - INFO - ├── Learning Rate: 9.27e-06
2025-08-30 11:52:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:52:19 - pico-train - INFO - Step 73950 -- 🔄 Training Metrics
2025-08-30 11:52:19 - pico-train - INFO - ├── Loss: 5.6970
2025-08-30 11:52:19 - pico-train - INFO - ├── Learning Rate: 9.26e-06
2025-08-30 11:52:19 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:52:32 - pico-train - INFO - Step 73975 -- 🔄 Training Metrics
2025-08-30 11:52:32 - pico-train - INFO - ├── Loss: 5.7531
2025-08-30 11:52:32 - pico-train - INFO - ├── Learning Rate: 9.24e-06
2025-08-30 11:52:32 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:52:45 - pico-train - INFO - Step 74000 -- 💾 Saving Checkpoint
2025-08-30 11:54:39 - pico-train - INFO - Step 74000 -- 📊 Evaluation Results
2025-08-30 11:54:39 - pico-train - INFO - └── paloma: 8.636477783625786e+31
2025-08-30 11:54:42 - pico-train - INFO - Step 74000 -- 🔄 Training Metrics
2025-08-30 11:54:42 - pico-train - INFO - ├── Loss: 5.7592
2025-08-30 11:54:42 - pico-train - INFO - ├── Learning Rate: 9.22e-06
2025-08-30 11:54:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:54:42 - pico-train - INFO - Step 74000 -- 📈 Saving Learning Dynamics
2025-08-30 11:54:58 - pico-train - INFO - Step 74025 -- 🔄 Training Metrics
2025-08-30 11:54:58 - pico-train - INFO - ├── Loss: 5.7057
2025-08-30 11:54:58 - pico-train - INFO - ├── Learning Rate: 9.21e-06
2025-08-30 11:54:58 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:55:11 - pico-train - INFO - Step 74050 -- 🔄 Training Metrics
2025-08-30 11:55:11 - pico-train - INFO - ├── Loss: 5.8112
2025-08-30 11:55:11 - pico-train - INFO - ├── Learning Rate: 9.19e-06
2025-08-30 11:55:11 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:55:23 - pico-train - INFO - Step 74075 -- 🔄 Training Metrics
2025-08-30 11:55:23 - pico-train - INFO - ├── Loss: 5.8551
2025-08-30 11:55:23 - pico-train - INFO - ├── Learning Rate: 9.17e-06
2025-08-30 11:55:23 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:55:36 - pico-train - INFO - Step 74100 -- 🔄 Training Metrics
2025-08-30 11:55:36 - pico-train - INFO - ├── Loss: 5.7881
2025-08-30 11:55:36 - pico-train - INFO - ├── Learning Rate: 9.16e-06
2025-08-30 11:55:36 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:55:49 - pico-train - INFO - Step 74125 -- 🔄 Training Metrics
2025-08-30 11:55:49 - pico-train - INFO - ├── Loss: 5.7239
2025-08-30 11:55:49 - pico-train - INFO - ├── Learning Rate: 9.14e-06
2025-08-30 11:55:49 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:56:01 - pico-train - INFO - Step 74150 -- 🔄 Training Metrics
2025-08-30 11:56:01 - pico-train - INFO - ├── Loss: 5.7491
2025-08-30 11:56:01 - pico-train - INFO - ├── Learning Rate: 9.12e-06
2025-08-30 11:56:01 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:56:14 - pico-train - INFO - Step 74175 -- 🔄 Training Metrics
2025-08-30 11:56:14 - pico-train - INFO - ├── Loss: 5.7418
2025-08-30 11:56:14 - pico-train - INFO - ├── Learning Rate: 9.11e-06
2025-08-30 11:56:14 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:56:27 - pico-train - INFO - Step 74200 -- 🔄 Training Metrics
2025-08-30 11:56:27 - pico-train - INFO - ├── Loss: 5.8195
2025-08-30 11:56:27 - pico-train - INFO - ├── Learning Rate: 9.09e-06
2025-08-30 11:56:27 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:56:40 - pico-train - INFO - Step 74225 -- 🔄 Training Metrics
2025-08-30 11:56:40 - pico-train - INFO - ├── Loss: 5.8008
2025-08-30 11:56:40 - pico-train - INFO - ├── Learning Rate: 9.07e-06
2025-08-30 11:56:40 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:56:53 - pico-train - INFO - Step 74250 -- 🔄 Training Metrics
2025-08-30 11:56:53 - pico-train - INFO - ├── Loss: 5.7900
2025-08-30 11:56:53 - pico-train - INFO - ├── Learning Rate: 9.06e-06
2025-08-30 11:56:53 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:57:05 - pico-train - INFO - Step 74275 -- 🔄 Training Metrics
2025-08-30 11:57:05 - pico-train - INFO - ├── Loss: 5.8471
2025-08-30 11:57:05 - pico-train - INFO - ├── Learning Rate: 9.04e-06
2025-08-30 11:57:05 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:57:18 - pico-train - INFO - Step 74300 -- 🔄 Training Metrics
2025-08-30 11:57:18 - pico-train - INFO - ├── Loss: 5.8221
2025-08-30 11:57:18 - pico-train - INFO - ├── Learning Rate: 9.02e-06
2025-08-30 11:57:18 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:57:31 - pico-train - INFO - Step 74325 -- 🔄 Training Metrics
2025-08-30 11:57:31 - pico-train - INFO - ├── Loss: 5.7390
2025-08-30 11:57:31 - pico-train - INFO - ├── Learning Rate: 9.01e-06
2025-08-30 11:57:31 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:57:43 - pico-train - INFO - Step 74350 -- 🔄 Training Metrics
2025-08-30 11:57:43 - pico-train - INFO - ├── Loss: 5.7864
2025-08-30 11:57:43 - pico-train - INFO - ├── Learning Rate: 8.99e-06
2025-08-30 11:57:43 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:57:56 - pico-train - INFO - Step 74375 -- 🔄 Training Metrics
2025-08-30 11:57:56 - pico-train - INFO - ├── Loss: 5.8961
2025-08-30 11:57:56 - pico-train - INFO - ├── Learning Rate: 8.98e-06
2025-08-30 11:57:56 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:58:08 - pico-train - INFO - Step 74400 -- 🔄 Training Metrics
2025-08-30 11:58:08 - pico-train - INFO - ├── Loss: 5.7558
2025-08-30 11:58:08 - pico-train - INFO - ├── Learning Rate: 8.96e-06
2025-08-30 11:58:08 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:58:21 - pico-train - INFO - Step 74425 -- 🔄 Training Metrics
2025-08-30 11:58:21 - pico-train - INFO - ├── Loss: 5.7641
2025-08-30 11:58:21 - pico-train - INFO - ├── Learning Rate: 8.94e-06
2025-08-30 11:58:21 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:58:34 - pico-train - INFO - Step 74450 -- 🔄 Training Metrics
2025-08-30 11:58:34 - pico-train - INFO - ├── Loss: 5.7386
2025-08-30 11:58:34 - pico-train - INFO - ├── Learning Rate: 8.93e-06
2025-08-30 11:58:34 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:58:46 - pico-train - INFO - Step 74475 -- 🔄 Training Metrics
2025-08-30 11:58:46 - pico-train - INFO - ├── Loss: 5.7682
2025-08-30 11:58:46 - pico-train - INFO - ├── Learning Rate: 8.91e-06
2025-08-30 11:58:46 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 11:58:59 - pico-train - INFO - Step 74500 -- 💾 Saving Checkpoint
2025-08-30 12:00:53 - pico-train - INFO - Step 74500 -- 📊 Evaluation Results
2025-08-30 12:00:53 - pico-train - INFO - └── paloma: 9.875388203359053e+31
2025-08-30 12:00:55 - pico-train - INFO - Step 74500 -- 🔄 Training Metrics
2025-08-30 12:00:55 - pico-train - INFO - ├── Loss: 5.7399
2025-08-30 12:00:55 - pico-train - INFO - ├── Learning Rate: 8.89e-06
2025-08-30 12:00:55 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:00:55 - pico-train - INFO - Step 74500 -- 📈 Saving Learning Dynamics
2025-08-30 12:01:10 - pico-train - INFO - Step 74525 -- 🔄 Training Metrics
2025-08-30 12:01:10 - pico-train - INFO - ├── Loss: 5.7499
2025-08-30 12:01:10 - pico-train - INFO - ├── Learning Rate: 8.88e-06
2025-08-30 12:01:10 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:01:22 - pico-train - INFO - Step 74550 -- 🔄 Training Metrics
2025-08-30 12:01:22 - pico-train - INFO - ├── Loss: 5.8008
2025-08-30 12:01:22 - pico-train - INFO - ├── Learning Rate: 8.86e-06
2025-08-30 12:01:22 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:01:35 - pico-train - INFO - Step 74575 -- 🔄 Training Metrics
2025-08-30 12:01:35 - pico-train - INFO - ├── Loss: 5.8048
2025-08-30 12:01:35 - pico-train - INFO - ├── Learning Rate: 8.85e-06
2025-08-30 12:01:35 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:01:48 - pico-train - INFO - Step 74600 -- 🔄 Training Metrics
2025-08-30 12:01:48 - pico-train - INFO - ├── Loss: 5.7352
2025-08-30 12:01:48 - pico-train - INFO - ├── Learning Rate: 8.83e-06
2025-08-30 12:01:48 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:02:00 - pico-train - INFO - Step 74625 -- 🔄 Training Metrics
2025-08-30 12:02:00 - pico-train - INFO - ├── Loss: 5.7900
2025-08-30 12:02:00 - pico-train - INFO - ├── Learning Rate: 8.81e-06
2025-08-30 12:02:00 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:02:13 - pico-train - INFO - Step 74650 -- 🔄 Training Metrics
2025-08-30 12:02:13 - pico-train - INFO - ├── Loss: 5.8181
2025-08-30 12:02:13 - pico-train - INFO - ├── Learning Rate: 8.80e-06
2025-08-30 12:02:13 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:02:25 - pico-train - INFO - Step 74675 -- 🔄 Training Metrics
2025-08-30 12:02:25 - pico-train - INFO - ├── Loss: 5.8068
2025-08-30 12:02:25 - pico-train - INFO - ├── Learning Rate: 8.78e-06
2025-08-30 12:02:25 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:02:38 - pico-train - INFO - Step 74700 -- 🔄 Training Metrics
2025-08-30 12:02:38 - pico-train - INFO - ├── Loss: 5.7906
2025-08-30 12:02:38 - pico-train - INFO - ├── Learning Rate: 8.76e-06
2025-08-30 12:02:38 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:02:51 - pico-train - INFO - Step 74725 -- 🔄 Training Metrics
2025-08-30 12:02:51 - pico-train - INFO - ├── Loss: 5.7719
2025-08-30 12:02:51 - pico-train - INFO - ├── Learning Rate: 8.75e-06
2025-08-30 12:02:51 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:03:04 - pico-train - INFO - Step 74750 -- 🔄 Training Metrics
2025-08-30 12:03:04 - pico-train - INFO - ├── Loss: 5.7901
2025-08-30 12:03:04 - pico-train - INFO - ├── Learning Rate: 8.73e-06
2025-08-30 12:03:04 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:03:16 - pico-train - INFO - Step 74775 -- 🔄 Training Metrics
2025-08-30 12:03:16 - pico-train - INFO - ├── Loss: 5.7765
2025-08-30 12:03:16 - pico-train - INFO - ├── Learning Rate: 8.72e-06
2025-08-30 12:03:16 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:03:29 - pico-train - INFO - Step 74800 -- 🔄 Training Metrics
2025-08-30 12:03:29 - pico-train - INFO - ├── Loss: 5.7052
2025-08-30 12:03:29 - pico-train - INFO - ├── Learning Rate: 8.70e-06
2025-08-30 12:03:29 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:03:42 - pico-train - INFO - Step 74825 -- 🔄 Training Metrics
2025-08-30 12:03:42 - pico-train - INFO - ├── Loss: 5.7863
2025-08-30 12:03:42 - pico-train - INFO - ├── Learning Rate: 8.68e-06
2025-08-30 12:03:42 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:03:54 - pico-train - INFO - Step 74850 -- 🔄 Training Metrics
2025-08-30 12:03:54 - pico-train - INFO - ├── Loss: 5.7816
2025-08-30 12:03:54 - pico-train - INFO - ├── Learning Rate: 8.67e-06
2025-08-30 12:03:54 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:04:07 - pico-train - INFO - Step 74875 -- 🔄 Training Metrics
2025-08-30 12:04:07 - pico-train - INFO - ├── Loss: 5.7777
2025-08-30 12:04:07 - pico-train - INFO - ├── Learning Rate: 8.65e-06
2025-08-30 12:04:07 - pico-train - INFO - └── Inf/NaN count: 0
2025-08-30 12:04:21 - pico-train - INFO - Step 74900 -- 🔄 Training Metrics
2025-08-30 12:04:21 - pico-train - INFO - ├── Loss: 5.8692
2025-08-30 12:04:21 - pico-train - INFO - ├── Learning Rate: 8.63e-06
2025-08-30 12:04:21 - pico-train - INFO - └── Inf/NaN count: 0