19 / wandb /debug-internal.log
2024-07-17 16:22:04,040 INFO StreamThr :2116314 [internal.py:wandb_internal():85] W&B internal server running at pid: 2116314, started at: 2024-07-17 16:22:04.039368
2024-07-17 16:22:04,043 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: status
2024-07-17 16:22:04,046 INFO WriterThread:2116314 [datastore.py:open_for_write():87] open: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/run-wglu07sk.wandb
2024-07-17 16:22:04,050 DEBUG SenderThread:2116314 [sender.py:send():379] send: header
2024-07-17 16:22:04,051 DEBUG SenderThread:2116314 [sender.py:send():379] send: run
2024-07-17 16:22:04,205 INFO SenderThread:2116314 [dir_watcher.py:__init__():211] watching files in: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files
2024-07-17 16:22:04,205 INFO SenderThread:2116314 [sender.py:_start_run_threads():1188] run started: wglu07sk with start time 1721233324.03891
2024-07-17 16:22:04,218 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: check_version
2024-07-17 16:22:04,219 DEBUG SenderThread:2116314 [sender.py:send_request():406] send_request: check_version
2024-07-17 16:22:04,281 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: run_start
2024-07-17 16:22:04,332 DEBUG HandlerThread:2116314 [system_info.py:__init__():26] System info init
2024-07-17 16:22:04,332 DEBUG HandlerThread:2116314 [system_info.py:__init__():41] System info init done
2024-07-17 16:22:04,332 INFO HandlerThread:2116314 [system_monitor.py:start():194] Starting system monitor
2024-07-17 16:22:04,332 INFO SystemMonitor:2116314 [system_monitor.py:_start():158] Starting system asset monitoring threads
2024-07-17 16:22:04,332 INFO HandlerThread:2116314 [system_monitor.py:probe():214] Collecting system info
2024-07-17 16:22:04,333 INFO SystemMonitor:2116314 [interfaces.py:start():188] Started cpu monitoring
2024-07-17 16:22:04,335 INFO SystemMonitor:2116314 [interfaces.py:start():188] Started disk monitoring
2024-07-17 16:22:04,336 INFO SystemMonitor:2116314 [interfaces.py:start():188] Started gpu monitoring
2024-07-17 16:22:04,337 INFO SystemMonitor:2116314 [interfaces.py:start():188] Started memory monitoring
2024-07-17 16:22:04,338 INFO SystemMonitor:2116314 [interfaces.py:start():188] Started network monitoring
2024-07-17 16:22:04,400 DEBUG HandlerThread:2116314 [system_info.py:probe():152] Probing system
2024-07-17 16:22:04,405 DEBUG HandlerThread:2116314 [system_info.py:_probe_git():137] Probing git
2024-07-17 16:22:04,416 DEBUG HandlerThread:2116314 [system_info.py:_probe_git():145] Probing git done
2024-07-17 16:22:04,416 DEBUG HandlerThread:2116314 [system_info.py:probe():200] Probing system done
2024-07-17 16:22:04,416 DEBUG HandlerThread:2116314 [system_monitor.py:probe():223] {'os': 'Linux-5.15.0-101-generic-x86_64-with-glibc2.35', 'python': '3.11.9', 'heartbeatAt': '2024-07-17T16:22:04.400954', 'startedAt': '2024-07-17T16:22:04.032523', 'docker': None, 'cuda': None, 'args': (), 'state': 'running', 'program': '/home/cc/polymorph/fine-tuning/main-lora-train.py', 'codePathLocal': 'main-lora-train.py', 'codePath': 'fine-tuning/main-lora-train.py', 'git': {'remote': 'https://github.com/inference-serving/polymorph.git', 'commit': 'e84189a37f0838a7e4ac1496b2345fe84c6a7683'}, 'email': 's.ghafouri@qub.ac.uk', 'root': '/home/cc/polymorph', 'host': 'gpu', 'username': 'cc', 'executable': '/home/cc/miniconda3/envs/vision/bin/python', 'cpu_count': 24, 'cpu_count_logical': 48, 'cpu_freq': {'current': 2576.3446041666666, 'min': 1000.0, 'max': 3700.0}, 'cpu_freq_per_core': [{'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 
3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 1401.746, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}, {'current': 2600.0, 'min': 1000.0, 'max': 3700.0}], 'disk': {'/': {'total': 208.95753479003906, 'used': 157.59302139282227}}, 'gpu': 'Quadro RTX 6000', 'gpu_count': 1, 'gpu_devices': [{'name': 'Quadro RTX 6000', 'memory_total': 25769803776}], 'memory': {'total': 187.4629783630371}}
2024-07-17 16:22:04,417 INFO HandlerThread:2116314 [system_monitor.py:probe():224] Finished collecting system info
2024-07-17 16:22:04,417 INFO HandlerThread:2116314 [system_monitor.py:probe():227] Publishing system info
2024-07-17 16:22:04,417 DEBUG HandlerThread:2116314 [system_info.py:_save_conda():209] Saving list of conda packages installed into the current environment
2024-07-17 16:22:05,208 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_created():271] file/dir created: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/conda-environment.yaml
2024-07-17 16:22:07,942 DEBUG HandlerThread:2116314 [system_info.py:_save_conda():224] Saving conda packages done
2024-07-17 16:22:07,943 INFO HandlerThread:2116314 [system_monitor.py:probe():229] Finished publishing system info
2024-07-17 16:22:07,953 DEBUG SenderThread:2116314 [sender.py:send():379] send: files
2024-07-17 16:22:07,953 INFO SenderThread:2116314 [sender.py:_save_file():1454] saving file wandb-metadata.json with policy now
2024-07-17 16:22:08,106 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: python_packages
2024-07-17 16:22:08,106 DEBUG SenderThread:2116314 [sender.py:send_request():406] send_request: python_packages
2024-07-17 16:22:08,107 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: stop_status
2024-07-17 16:22:08,108 DEBUG SenderThread:2116314 [sender.py:send_request():406] send_request: stop_status
2024-07-17 16:22:08,112 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:08,159 DEBUG SenderThread:2116314 [sender.py:send():379] send: telemetry
2024-07-17 16:22:08,159 DEBUG SenderThread:2116314 [sender.py:send():379] send: config
2024-07-17 16:22:08,161 DEBUG SenderThread:2116314 [sender.py:send():379] send: telemetry
2024-07-17 16:22:08,162 DEBUG SenderThread:2116314 [sender.py:send():379] send: metric
2024-07-17 16:22:08,162 DEBUG SenderThread:2116314 [sender.py:send():379] send: telemetry
2024-07-17 16:22:08,162 DEBUG SenderThread:2116314 [sender.py:send():379] send: metric
2024-07-17 16:22:08,162 WARNING SenderThread:2116314 [sender.py:send_metric():1405] Seen metric with glob (shouldn't happen)
2024-07-17 16:22:08,162 DEBUG SenderThread:2116314 [sender.py:send():379] send: telemetry
2024-07-17 16:22:08,163 DEBUG SenderThread:2116314 [sender.py:send():379] send: telemetry
2024-07-17 16:22:08,163 DEBUG SenderThread:2116314 [sender.py:send():379] send: config
2024-07-17 16:22:08,206 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/conda-environment.yaml
2024-07-17 16:22:08,207 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_created():271] file/dir created: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/requirements.txt
2024-07-17 16:22:08,207 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_created():271] file/dir created: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/wandb-metadata.json
2024-07-17 16:22:08,216 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: log_artifact
2024-07-17 16:22:08,216 DEBUG SenderThread:2116314 [sender.py:send_request():406] send_request: log_artifact
2024-07-17 16:22:08,259 INFO wandb-upload_0:2116314 [upload_job.py:push():130] Uploaded file /tmp/tmp6xd4bkanwandb/g8dj0aj4-wandb-metadata.json
2024-07-17 16:22:08,735 INFO wandb-upload_0:2116314 [upload_job.py:push():88] Uploaded file /tmp/tmpyvr0ja9x/model_architecture.txt
2024-07-17 16:22:09,108 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:09,159 INFO SenderThread:2116314 [sender.py:send_request_log_artifact():1518] logged artifact model-wglu07sk - {'id': 'QXJ0aWZhY3Q6OTkxMzgwOTg0', 'state': 'PENDING', 'artifactSequence': {'id': 'QXJ0aWZhY3RDb2xsZWN0aW9uOjI4MjA3NTAwNA==', 'latestArtifact': None}}
2024-07-17 16:22:09,160 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: status_report
2024-07-17 16:22:09,207 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_created():271] file/dir created: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/output.log
2024-07-17 16:22:10,107 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:11,108 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:11,208 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/output.log
2024-07-17 16:22:12,108 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:13,108 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:13,209 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/output.log
2024-07-17 16:22:13,550 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: partial_history
2024-07-17 16:22:13,554 DEBUG SenderThread:2116314 [sender.py:send():379] send: metric
2024-07-17 16:22:13,556 DEBUG SenderThread:2116314 [sender.py:send():379] send: metric
2024-07-17 16:22:13,556 DEBUG SenderThread:2116314 [sender.py:send():379] send: metric
2024-07-17 16:22:13,557 DEBUG SenderThread:2116314 [sender.py:send():379] send: metric
2024-07-17 16:22:13,557 DEBUG SenderThread:2116314 [sender.py:send():379] send: metric
2024-07-17 16:22:13,557 DEBUG SenderThread:2116314 [sender.py:send():379] send: metric
2024-07-17 16:22:13,557 DEBUG SenderThread:2116314 [sender.py:send():379] send: history
2024-07-17 16:22:13,557 DEBUG SenderThread:2116314 [sender.py:send_request():406] send_request: summary_record
2024-07-17 16:22:13,558 INFO SenderThread:2116314 [sender.py:_save_file():1454] saving file wandb-summary.json with policy end
2024-07-17 16:22:14,108 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:14,209 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_created():271] file/dir created: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/wandb-summary.json
2024-07-17 16:22:14,560 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: status_report
2024-07-17 16:22:15,108 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:15,210 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/output.log
2024-07-17 16:22:16,109 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:16,210 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/output.log
2024-07-17 16:22:17,109 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:17,212 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/output.log
2024-07-17 16:22:18,109 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:18,774 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: partial_history
2024-07-17 16:22:18,776 DEBUG SenderThread:2116314 [sender.py:send():379] send: history
2024-07-17 16:22:18,777 DEBUG SenderThread:2116314 [sender.py:send_request():406] send_request: summary_record
2024-07-17 16:22:18,780 INFO SenderThread:2116314 [sender.py:_save_file():1454] saving file wandb-summary.json with policy end
2024-07-17 16:22:19,109 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:19,212 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/wandb-summary.json
2024-07-17 16:22:19,212 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/output.log
2024-07-17 16:22:19,781 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: status_report
2024-07-17 16:22:20,109 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:20,213 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/output.log
2024-07-17 16:22:21,109 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:21,213 INFO Thread-12 :2116314 [dir_watcher.py:_on_file_modified():288] file/dir modified: /home/cc/polymorph/fine-tuning/results/train-lora/5/loras/19/wandb/run-20240717_162204-wglu07sk/files/output.log
2024-07-17 16:22:22,109 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages
2024-07-17 16:22:23,106 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: stop_status
2024-07-17 16:22:23,107 DEBUG SenderThread:2116314 [sender.py:send_request():406] send_request: stop_status
2024-07-17 16:22:23,110 DEBUG HandlerThread:2116314 [handler.py:handle_request():158] handle_request: internal_messages