Upload ProtT3/all_checkpoints/stage2_07301646_2datasets_construct/wandb/run-20250730_175623-pbf2bxo6/logs/debug.log with huggingface_hub
ProtT3/all_checkpoints/stage2_07301646_2datasets_construct/wandb/run-20250730_175623-pbf2bxo6/logs/debug.log
ADDED
@@ -0,0 +1,24 @@
+2025-07-30 17:56:23,983 INFO MainThread:2783937 [wandb_setup.py:_flush():70] Current SDK version is 0.19.11
+2025-07-30 17:56:23,983 INFO MainThread:2783937 [wandb_setup.py:_flush():70] Configure stats pid to 2783937
+2025-07-30 17:56:23,983 INFO MainThread:2783937 [wandb_setup.py:_flush():70] Loading settings from /root/.config/wandb/settings
+2025-07-30 17:56:23,983 INFO MainThread:2783937 [wandb_setup.py:_flush():70] Loading settings from /nas/shared/kilab/wangyujia/ProtT3/wandb/settings
+2025-07-30 17:56:23,983 INFO MainThread:2783937 [wandb_setup.py:_flush():70] Loading settings from environment variables
+2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:setup_run_log_directory():724] Logging user logs to ./all_checkpoints/stage2_07301646_2datasets_construct/wandb/run-20250730_175623-pbf2bxo6/logs/debug.log
+2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:setup_run_log_directory():725] Logging internal logs to ./all_checkpoints/stage2_07301646_2datasets_construct/wandb/run-20250730_175623-pbf2bxo6/logs/debug-internal.log
+2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:init():852] calling init triggers
+2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:init():857] wandb.init called with sweep_config: {}
+config: {'_wandb': {}}
+2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:init():893] starting backend
+2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:init():897] sending inform_init request
+2025-07-30 17:56:23,985 INFO MainThread:2783937 [backend.py:_multiprocessing_setup():101] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+2025-07-30 17:56:23,985 INFO MainThread:2783937 [wandb_init.py:init():907] backend started and connected
+2025-07-30 17:56:23,987 INFO MainThread:2783937 [wandb_init.py:init():1005] updated telemetry
+2025-07-30 17:56:24,138 INFO MainThread:2783937 [wandb_init.py:init():1029] communicating run to backend with 90.0 second timeout
+2025-07-30 17:56:27,565 INFO MainThread:2783937 [wandb_init.py:init():1104] starting run threads in backend
+2025-07-30 17:56:27,773 INFO MainThread:2783937 [wandb_run.py:_console_start():2573] atexit reg
+2025-07-30 17:56:27,773 INFO MainThread:2783937 [wandb_run.py:_redirect():2421] redirect: wrap_raw
+2025-07-30 17:56:27,774 INFO MainThread:2783937 [wandb_run.py:_redirect():2490] Wrapping output streams.
+2025-07-30 17:56:27,774 INFO MainThread:2783937 [wandb_run.py:_redirect():2513] Redirects installed.
+2025-07-30 17:56:27,775 INFO MainThread:2783937 [wandb_init.py:init():1150] run started, returning control to user process
+2025-07-30 17:56:36,656 INFO MainThread:2783937 [wandb_run.py:_config_callback():1436] config_cb None None {'filename': 'stage2_07301646_2datasets_construct', 'seed': 42, 'mode': 'train', 'strategy': 'deepspeed', 'accelerator': 'gpu', 'devices': '0,1,2,3,4,5,6,7', 'precision': 'bf16-mixed', 'max_epochs': 10, 'accumulate_grad_batches': 1, 'check_val_every_n_epoch': 1, 'enable_flash': False, 'use_wandb_logger': True, 'mix_dataset': False, 'dataset': 'swiss-prot', 'save_every_n_epochs': 2, 'bert_name': '/nas/shared/kilab/wangyujia/ProtT3/plm_model/microsoft', 'cross_attention_freq': 2, 'num_query_token': 8, 'qformer_tune': 'train', 'llm_name': '/oss/wangyujia/BIO/construction_finetuning/alpaca/v1-20250609-141541/checkpoint-50-merged', 'num_beams': 5, 'do_sample': False, 'max_inference_len': 128, 'min_inference_len': 1, 'llm_tune': 'mid_lora', 'peft_config': '', 'peft_dir': '', 'plm_model': '/nas/shared/kilab/wangyujia/ProtT3/plm_model/esm2-150m', 'plm_tune': 'freeze', 'lora_r': 8, 'lora_alpha': 16, 'lora_dropout': 0.1, 'enbale_gradient_checkpointing': False, 'weight_decay': 0.05, 'init_lr': 0.0001, 'min_lr': 1e-05, 'warmup_lr': 1e-06, 'warmup_steps': 1000, 'lr_decay_rate': 0.9, 'scheduler': 'linear_warmup_cosine_lr', 'stage1_path': '/nas/shared/kilab/wangyujia/ProtT3/all_checkpoints/stage1_07041727_2dataset/epoch=29.ckpt/converted.ckpt', 'stage2_path': '', 'init_checkpoint': '/nas/shared/kilab/wangyujia/ProtT3/all_checkpoints/stage2_07070513_2datasets_construct/epoch=09.ckpt/converted.ckpt', 'caption_eval_epoch': 5, 'num_workers': 8, 'batch_size': 4, 'inference_batch_size': 4, 'root': 'data', 'text_max_len': 2048, 'q_max_len': 29, 'a_max_len': 36, 'prot_max_len': 1024, 'prompt': 'The protein has the following properties:', 'filter_side_qa': False}
+2025-07-31 10:58:52,637 INFO MsgRouterThr:2783937 [mailbox.py:close():129] [no run ID] Closing mailbox, abandoning 1 handles.
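The records in this log share one visible layout: timestamp, level, thread:pid, a bracketed file:function():line tag, then the message. The following is a minimal sketch, assuming that layout holds for every record, of how such lines could be split into fields and how the time between two records could be measured. The names `LINE_RE`, `parse_line`, and `elapsed` are hypothetical helpers written for this illustration; they are not part of wandb.

```python
import re
from datetime import datetime

# Pattern inferred from the log lines shown here (an assumption, not a wandb spec):
# "<date> <time>,<ms> <LEVEL> <thread>:<pid> [<file>:<func>():<line>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<thread>[^:\s]+):(?P<pid>\d+) "
    r"\[(?P<file>[^:\]]+):(?P<func>[^\]]*?)\(\):(?P<lineno>\d+)\] "
    r"(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one record, or None for continuation lines.

    A leading '+' (the diff marker) is stripped before matching.
    """
    match = LINE_RE.match(line.lstrip("+"))
    if match is None:
        return None
    record = match.groupdict()
    # %f accepts the three-digit millisecond field after the comma.
    record["ts"] = datetime.strptime(record["ts"], "%Y-%m-%d %H:%M:%S,%f")
    record["pid"] = int(record["pid"])
    record["lineno"] = int(record["lineno"])
    return record


def elapsed(first_line, last_line):
    """Time between two records as a datetime.timedelta."""
    return parse_line(last_line)["ts"] - parse_line(first_line)["ts"]


FIRST = ("2025-07-30 17:56:23,983 INFO MainThread:2783937 "
         "[wandb_setup.py:_flush():70] Current SDK version is 0.19.11")
LAST = ("2025-07-31 10:58:52,637 INFO MsgRouterThr:2783937 "
        "[mailbox.py:close():129] [no run ID] Closing mailbox, abandoning 1 handles.")

if __name__ == "__main__":
    print(parse_line(FIRST)["func"])
    print(elapsed(FIRST, LAST))
```

A line that does not start with a timestamp, such as the `config: {'_wandb': {}}` continuation, yields `None`, so a caller can append it to the previous record's message if needed.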