yuccaaa committed on
Commit 248abcb · verified · 1 Parent(s): 164c4bf

Upload ProtT3/all_checkpoints/stage2_07301646_2datasets_construct/wandb/run-20250730_175623-pbf2bxo6/logs/debug.log with huggingface_hub

ProtT3/all_checkpoints/stage2_07301646_2datasets_construct/wandb/run-20250730_175623-pbf2bxo6/logs/debug.log ADDED
@@ -0,0 +1,24 @@
+ 2025-07-30 17:56:23,983 INFO MainThread:2783937 [wandb_setup.py:_flush():70] Current SDK version is 0.19.11
+ 2025-07-30 17:56:23,983 INFO MainThread:2783937 [wandb_setup.py:_flush():70] Configure stats pid to 2783937
+ 2025-07-30 17:56:23,983 INFO MainThread:2783937 [wandb_setup.py:_flush():70] Loading settings from /root/.config/wandb/settings
+ 2025-07-30 17:56:23,983 INFO MainThread:2783937 [wandb_setup.py:_flush():70] Loading settings from /nas/shared/kilab/wangyujia/ProtT3/wandb/settings
+ 2025-07-30 17:56:23,983 INFO MainThread:2783937 [wandb_setup.py:_flush():70] Loading settings from environment variables
+ 2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:setup_run_log_directory():724] Logging user logs to ./all_checkpoints/stage2_07301646_2datasets_construct/wandb/run-20250730_175623-pbf2bxo6/logs/debug.log
+ 2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:setup_run_log_directory():725] Logging internal logs to ./all_checkpoints/stage2_07301646_2datasets_construct/wandb/run-20250730_175623-pbf2bxo6/logs/debug-internal.log
+ 2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:init():852] calling init triggers
+ 2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:init():857] wandb.init called with sweep_config: {}
+ config: {'_wandb': {}}
+ 2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:init():893] starting backend
+ 2025-07-30 17:56:23,984 INFO MainThread:2783937 [wandb_init.py:init():897] sending inform_init request
+ 2025-07-30 17:56:23,985 INFO MainThread:2783937 [backend.py:_multiprocessing_setup():101] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2025-07-30 17:56:23,985 INFO MainThread:2783937 [wandb_init.py:init():907] backend started and connected
+ 2025-07-30 17:56:23,987 INFO MainThread:2783937 [wandb_init.py:init():1005] updated telemetry
+ 2025-07-30 17:56:24,138 INFO MainThread:2783937 [wandb_init.py:init():1029] communicating run to backend with 90.0 second timeout
+ 2025-07-30 17:56:27,565 INFO MainThread:2783937 [wandb_init.py:init():1104] starting run threads in backend
+ 2025-07-30 17:56:27,773 INFO MainThread:2783937 [wandb_run.py:_console_start():2573] atexit reg
+ 2025-07-30 17:56:27,773 INFO MainThread:2783937 [wandb_run.py:_redirect():2421] redirect: wrap_raw
+ 2025-07-30 17:56:27,774 INFO MainThread:2783937 [wandb_run.py:_redirect():2490] Wrapping output streams.
+ 2025-07-30 17:56:27,774 INFO MainThread:2783937 [wandb_run.py:_redirect():2513] Redirects installed.
+ 2025-07-30 17:56:27,775 INFO MainThread:2783937 [wandb_init.py:init():1150] run started, returning control to user process
+ 2025-07-30 17:56:36,656 INFO MainThread:2783937 [wandb_run.py:_config_callback():1436] config_cb None None {'filename': 'stage2_07301646_2datasets_construct', 'seed': 42, 'mode': 'train', 'strategy': 'deepspeed', 'accelerator': 'gpu', 'devices': '0,1,2,3,4,5,6,7', 'precision': 'bf16-mixed', 'max_epochs': 10, 'accumulate_grad_batches': 1, 'check_val_every_n_epoch': 1, 'enable_flash': False, 'use_wandb_logger': True, 'mix_dataset': False, 'dataset': 'swiss-prot', 'save_every_n_epochs': 2, 'bert_name': '/nas/shared/kilab/wangyujia/ProtT3/plm_model/microsoft', 'cross_attention_freq': 2, 'num_query_token': 8, 'qformer_tune': 'train', 'llm_name': '/oss/wangyujia/BIO/construction_finetuning/alpaca/v1-20250609-141541/checkpoint-50-merged', 'num_beams': 5, 'do_sample': False, 'max_inference_len': 128, 'min_inference_len': 1, 'llm_tune': 'mid_lora', 'peft_config': '', 'peft_dir': '', 'plm_model': '/nas/shared/kilab/wangyujia/ProtT3/plm_model/esm2-150m', 'plm_tune': 'freeze', 'lora_r': 8, 'lora_alpha': 16, 'lora_dropout': 0.1, 'enbale_gradient_checkpointing': False, 'weight_decay': 0.05, 'init_lr': 0.0001, 'min_lr': 1e-05, 'warmup_lr': 1e-06, 'warmup_steps': 1000, 'lr_decay_rate': 0.9, 'scheduler': 'linear_warmup_cosine_lr', 'stage1_path': '/nas/shared/kilab/wangyujia/ProtT3/all_checkpoints/stage1_07041727_2dataset/epoch=29.ckpt/converted.ckpt', 'stage2_path': '', 'init_checkpoint': '/nas/shared/kilab/wangyujia/ProtT3/all_checkpoints/stage2_07070513_2datasets_construct/epoch=09.ckpt/converted.ckpt', 'caption_eval_epoch': 5, 'num_workers': 8, 'batch_size': 4, 'inference_batch_size': 4, 'root': 'data', 'text_max_len': 2048, 'q_max_len': 29, 'a_max_len': 36, 'prot_max_len': 1024, 'prompt': 'The protein has the following properties:', 'filter_side_qa': False}
+ 2025-07-31 10:58:52,637 INFO MsgRouterThr:2783937 [mailbox.py:close():129] [no run ID] Closing mailbox, abandoning 1 handles.
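
The records above share one layout: timestamp, level, thread and pid, a bracketed `file:function():line` source, then the message. A minimal sketch of reading them back in Python follows, assuming that layout holds for every record and that the `config_cb None None` message carries a Python-literal dict, as it does in this log. The names `LOG_LINE`, `parse_record`, and `extract_config` are hypothetical helpers written for illustration and are not part of wandb.

```python
import ast
import re
from datetime import datetime

# One debug.log record as it appears in this file (an assumption drawn from
# the lines above, not a documented wandb format):
# "<date> <time>,<ms> <LEVEL> <thread>:<pid> [<file>:<func>():<line>] <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<thread>\w+):(?P<pid>\d+) "
    r"\[(?P<file>[\w.]+):(?P<func>\w+)\(\):(?P<lineno>\d+)\] "
    r"(?P<msg>.*)$"
)


def parse_record(line):
    """Parse one log line into a dict, or return None if it does not match.

    A leading "+ " diff marker is stripped first. Continuation lines such as
    "config: {'_wandb': {}}" carry no timestamp and therefore return None.
    """
    m = LOG_LINE.match(line.lstrip("+ ").rstrip("\n"))
    if m is None:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["ts"], "%Y-%m-%d %H:%M:%S,%f")
    rec["pid"] = int(rec["pid"])
    rec["lineno"] = int(rec["lineno"])
    return rec


def extract_config(msg):
    """Return the run config dict from a 'config_cb None None {...}' message."""
    prefix = "config_cb None None "
    if not msg.startswith(prefix):
        return None
    # literal_eval only evaluates Python literals, so it is safe on log text.
    return ast.literal_eval(msg[len(prefix):])
```

Applied to the `_config_callback` record, `extract_config(parse_record(line)["msg"])` yields the hyperparameters as a dict, so values such as `init_lr` or `batch_size` can be compared across runs without reading the raw line.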