| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_setup.py:_flush():68] Current SDK version is 0.19.10 | |
| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_setup.py:_flush():68] Configure stats pid to 1306315 | |
| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_setup.py:_flush():68] Loading settings from /home/panda/.config/wandb/settings | |
| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_setup.py:_flush():68] Loading settings from /home/panda/pda-llm/scripts/wandb/settings | |
| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_setup.py:_flush():68] Loading settings from environment variables | |
| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_init.py:setup_run_log_directory():724] Logging user logs to /home/panda/pda-llm/output/sft-tools/run-false-1-100-16-4096/wandb/run-20250511_173701-jpa5uws1/logs/debug.log | |
| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_init.py:setup_run_log_directory():725] Logging internal logs to /home/panda/pda-llm/output/sft-tools/run-false-1-100-16-4096/wandb/run-20250511_173701-jpa5uws1/logs/debug-internal.log | |
| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_init.py:init():852] calling init triggers | |
| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_init.py:init():857] wandb.init called with sweep_config: {} | |
| config: {'model_name_or_path': 'meta-llama/Llama-3.1-8B-Instruct', 'recompute_baseline': False, 'cache_dir': '/home/panda/pda-llm/cache/sft-tools', 'max_length': 4096, 'trust_remote_code': True, 'train_datasets': [('tools', {'proportion': 1.0})], 'eval_datasets': None, 'safety_ratio_tol': 100.0, 'important_sft': False, 'resilient_coeff': 1.0, 'epochs': 3, 'per_device_train_batch_size': 1, 'per_device_eval_batch_size': 1, 'gradient_accumulation_steps': 48, 'gradient_checkpointing': True, 'lr': 0.0001, 'lr_scheduler_type': <SchedulerType.COSINE: 'cosine'>, 'lr_warmup_ratio': 0.1, 'weight_decay': 0.0, 'seed': 42, 'fp16': False, 'bf16': True, 'tf32': False, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'eval_strategy': 'epoch', 'eval_interval': 1000000, 'need_eval': True, 'eval_split_ratio': None, 'output_dir': '/home/panda/pda-llm/output/sft-tools/run-false-1-100-16-4096', 'log_type': 'wandb', 'log_dir': '/home/panda/pda-llm/output/sft-tools/run-false-1-100-16-4096', 'log_project': 'TOOLS-SFT', 'log_run_name': 'tools-sft-2025-05-11-17-37-01', 'save_16bit': False, 'save_interval': 1000000, 'local_rank': 0, 'zero_stage': 0, 'offload': 'none', 'deepspeed': False, 'deepspeed_config': None, 'deepscale': False, 'deepscale_config': None, 'global_rank': 0, 'device': device(type='cuda', index=0), 'num_update_steps_per_epoch': 112, 'total_training_steps': 336, '_wandb': {}} | |
| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_init.py:init():893] starting backend | |
| 2025-05-11 17:37:01,876 INFO MainThread:1306315 [wandb_init.py:init():897] sending inform_init request | |
| 2025-05-11 17:37:01,878 INFO MainThread:1306315 [backend.py:_multiprocessing_setup():101] multiprocessing start_methods=fork,spawn,forkserver, using: spawn | |
| 2025-05-11 17:37:01,878 INFO MainThread:1306315 [wandb_init.py:init():907] backend started and connected | |
| 2025-05-11 17:37:01,880 INFO MainThread:1306315 [wandb_init.py:init():1002] updated telemetry | |
| 2025-05-11 17:37:01,886 INFO MainThread:1306315 [wandb_init.py:init():1026] communicating run to backend with 90.0 second timeout | |
| 2025-05-11 17:37:02,249 INFO MainThread:1306315 [wandb_init.py:init():1101] starting run threads in backend | |
| 2025-05-11 17:37:02,317 INFO MainThread:1306315 [wandb_run.py:_console_start():2566] atexit reg | |
| 2025-05-11 17:37:02,317 INFO MainThread:1306315 [wandb_run.py:_redirect():2414] redirect: wrap_raw | |
| 2025-05-11 17:37:02,317 INFO MainThread:1306315 [wandb_run.py:_redirect():2483] Wrapping output streams. | |
| 2025-05-11 17:37:02,317 INFO MainThread:1306315 [wandb_run.py:_redirect():2506] Redirects installed. | |
| 2025-05-11 17:37:02,319 INFO MainThread:1306315 [wandb_init.py:init():1147] run started, returning control to user process | |
| 2025-05-11 20:33:54,991 INFO MainThread:1306315 [wandb_run.py:_finish():2314] finishing run alelab/TOOLS-SFT/jpa5uws1 | |
| 2025-05-11 20:33:54,992 INFO MainThread:1306315 [wandb_run.py:_atexit_cleanup():2531] got exitcode: 0 | |
| 2025-05-11 20:33:54,993 INFO MainThread:1306315 [wandb_run.py:_restore():2513] restore | |
| 2025-05-11 20:33:54,993 INFO MainThread:1306315 [wandb_run.py:_restore():2519] restore done | |
| 2025-05-11 20:33:55,703 INFO MainThread:1306315 [wandb_run.py:_footer_history_summary_info():4160] rendering history | |
| 2025-05-11 20:33:55,706 INFO MainThread:1306315 [wandb_run.py:_footer_history_summary_info():4192] rendering summary | |
| 2025-05-11 20:33:55,712 INFO MainThread:1306315 [wandb_run.py:_footer_sync_info():4121] logging synced files | |