NUM_GPUS=1 MASTER_ADDR=ip-10-0-136-246 MASTER_PORT=16668 WORLD_SIZE=1
PID of this process = 565724
------ ARGS -------
Namespace(model_suffix='beta', hcp_flat_path='/weka/proj-medarc/shared/HCP-Flat', batch_size=128, wandb_log=True, num_epochs=20, lr_scheduler_type='cycle', save_ckpt=False, seed=42, max_lr=0.1, target='sex', num_workers=15, weight_decay=1e-05)
Input dimension: 737280
total_steps 17400
wandb_config: {'model_name': 'HCPflat_raw_sex', 'batch_size': 128, 'weight_decay': 1e-05, 'num_epochs': 20, 'seed': 42, 'lr_scheduler_type': 'cycle', 'save_ckpt': False, 'max_lr': 0.1, 'target': 'sex', 'num_workers': 15}
wandb_id: HCPflat_raw_beta_sex_83810
Step [100/870] - Training Loss: 24.6087 - Training Accuracy: 52.39%
Step [200/870] - Training Loss: 28.9030 - Training Accuracy: 52.85%
Step [300/870] - Training Loss: 50.2400 - Training Accuracy: 53.05%
Step [400/870] - Training Loss: 74.8264 - Training Accuracy: 53.27%
Step [500/870] - Training Loss: 91.1850 - Training Accuracy: 53.40%
Step [600/870] - Training Loss: 165.0463 - Training Accuracy: 53.63%
Step [700/870] - Training Loss: 201.5126 - Training Accuracy: 53.66%
Step [800/870] - Training Loss: 209.5273 - Training Accuracy: 53.74%
Epoch [1/20] - Training Loss: 110.8142, Training Accuracy: 53.88% - Validation Loss: 296.1424, Validation Accuracy: 53.70%
Step [100/870] - Training Loss: 298.0217 - Training Accuracy: 66.52%
Step [200/870] - Training Loss: 265.3320 - Training Accuracy: 65.68%
Step [300/870] - Training Loss: 293.1298 - Training Accuracy: 64.77%
Step [400/870] - Training Loss: 553.0426 - Training Accuracy: 64.16%
Step [500/870] - Training Loss: 594.5250 - Training Accuracy: 63.51%
Step [600/870] - Training Loss: 708.0252 - Training Accuracy: 62.87%
Step [700/870] - Training Loss: 722.7825 - Training Accuracy: 62.39%
Step [800/870] - Training Loss: 798.4144 - Training Accuracy: 61.97%
Epoch [2/20] - Training Loss: 449.1676, Training Accuracy: 61.70% - Validation Loss: 808.8942, Validation Accuracy: 54.41%
Step [100/870] - Training Loss: 396.8062 - Training Accuracy: 75.48%
Step [200/870] - Training Loss: 465.8516 - Training Accuracy: 75.21%
Step [300/870] - Training Loss: 334.3605 - Training Accuracy: 75.27%
Step [400/870] - Training Loss: 362.5482 - Training Accuracy: 74.79%
Step [500/870] - Training Loss: 458.4806 - Training Accuracy: 74.32%
Step [600/870] - Training Loss: 336.7921 - Training Accuracy: 73.79%
Step [700/870] - Training Loss: 595.4280 - Training Accuracy: 73.40%
Step [800/870] - Training Loss: 591.4528 - Training Accuracy: 73.04%
Epoch [3/20] - Training Loss: 434.5445, Training Accuracy: 72.76% - Validation Loss: 1042.0977, Validation Accuracy: 54.73%
Step [100/870] - Training Loss: 270.8356 - Training Accuracy: 83.11%
Step [200/870] - Training Loss: 361.4072 - Training Accuracy: 83.15%
Step [300/870] - Training Loss: 275.5848 - Training Accuracy: 82.82%
Step [400/870] - Training Loss: 307.0319 - Training Accuracy: 82.44%
Step [500/870] - Training Loss: 326.9714 - Training Accuracy: 82.14%
Step [600/870] - Training Loss: 271.0794 - Training Accuracy: 81.70%
Step [700/870] - Training Loss: 260.8827 - Training Accuracy: 81.37%
Step [800/870] - Training Loss: 419.5749 - Training Accuracy: 81.08%
Epoch [4/20] - Training Loss: 296.3454, Training Accuracy: 80.79% - Validation Loss: 1187.6379, Validation Accuracy: 55.25%
Step [100/870] - Training Loss: 214.9724 - Training Accuracy: 87.92%
Step [200/870] - Training Loss: 77.4744 - Training Accuracy: 87.77%
Step [300/870] - Training Loss: 149.2222 - Training Accuracy: 87.47%
Step [400/870] - Training Loss: 141.0663 - Training Accuracy: 87.16%
Step [500/870] - Training Loss: 231.0289 - Training Accuracy: 86.83%
Step [600/870] - Training Loss: 186.0840 - Training Accuracy: 86.38%
Step [700/870] - Training Loss: 163.8004 - Training Accuracy: 85.99%
Step [800/870] - Training Loss: 304.4012 - Training Accuracy: 85.70%
Epoch [5/20] - Training Loss: 211.2076, Training Accuracy: 85.50% - Validation Loss: 1311.0324, Validation Accuracy: 55.02%
Step [100/870] - Training Loss: 112.8653 - Training Accuracy: 90.42%
Step [200/870] - Training Loss: 182.0056 - Training Accuracy: 90.31%
Step [300/870] - Training Loss: 151.2417 - Training Accuracy: 90.17%
Step [400/870] - Training Loss: 174.8410 - Training Accuracy: 89.76%
Step [500/870] - Training Loss: 164.9281 - Training Accuracy: 89.42%
Step [600/870] - Training Loss: 176.1206 - Training Accuracy: 89.19%
Step [700/870] - Training Loss: 189.1104 - Training Accuracy: 88.92%
Step [800/870] - Training Loss: 164.0583 - Training Accuracy: 88.64%
Epoch [6/20] - Training Loss: 161.3509, Training Accuracy: 88.49% - Validation Loss: 1458.8603, Validation Accuracy: 54.97%
Step [100/870] - Training Loss: 96.9147 - Training Accuracy: 91.70%
Step [200/870] - Training Loss: 89.6436 - Training Accuracy: 91.68%
Step [300/870] - Training Loss: 88.9899 - Training Accuracy: 91.55%
Step [400/870] - Training Loss: 90.7214 - Training Accuracy: 91.36%
Step [500/870] - Training Loss: 246.9420 - Training Accuracy: 91.10%
Step [600/870] - Training Loss: 143.6372 - Training Accuracy: 90.95%
Step [700/870] - Training Loss: 132.4662 - Training Accuracy: 90.78%
Step [800/870] - Training Loss: 199.1868 - Training Accuracy: 90.55%
Epoch [7/20] - Training Loss: 130.6820, Training Accuracy: 90.41% - Validation Loss: 1515.6320, Validation Accuracy: 55.45%
Step [100/870] - Training Loss: 40.2281 - Training Accuracy: 93.41%
Step [200/870] - Training Loss: 31.9451 - Training Accuracy: 93.21%
Step [300/870] - Training Loss: 51.2280 - Training Accuracy: 93.23%
Step [400/870] - Training Loss: 100.9511 - Training Accuracy: 93.02%
Step [500/870] - Training Loss: 103.3127 - Training Accuracy: 92.90%
Step [600/870] - Training Loss: 152.1203 - Training Accuracy: 92.78%
Step [700/870] - Training Loss: 108.2650 - Training Accuracy: 92.67%
Step [800/870] - Training Loss: 93.6054 - Training Accuracy: 92.52%
Epoch [8/20] - Training Loss: 96.2420, Training Accuracy: 92.41% - Validation Loss: 1629.8896, Validation Accuracy: 55.26%
Step [100/870] - Training Loss: 48.9615 - Training Accuracy: 95.19%
Step [200/870] - Training Loss: 37.2198 - Training Accuracy: 95.09%
Step [300/870] - Training Loss: 57.5891 - Training Accuracy: 94.78%
Step [400/870] - Training Loss: 116.6951 - Training Accuracy: 94.65%
Step [500/870] - Training Loss: 106.4395 - Training Accuracy: 94.49%
Step [600/870] - Training Loss: 67.2050 - Training Accuracy: 94.33%
Step [700/870] - Training Loss: 29.4207 - Training Accuracy: 94.26%
Step [800/870] - Training Loss: 19.4606 - Training Accuracy: 94.18%
Epoch [9/20] - Training Loss: 70.5700, Training Accuracy: 94.06% - Validation Loss: 1667.6055, Validation Accuracy: 54.94%
Step [100/870] - Training Loss: 133.2186 - Training Accuracy: 95.77%
Step [200/870] - Training Loss: 39.1579 - Training Accuracy: 96.05%
Step [300/870] - Training Loss: 19.6516 - Training Accuracy: 95.85%
Step [400/870] - Training Loss: 14.1961 - Training Accuracy: 95.73%
Step [500/870] - Training Loss: 69.0657 - Training Accuracy: 95.66%
Step [600/870] - Training Loss: 86.5776 - Training Accuracy: 95.54%
Step [700/870] - Training Loss: 40.0283 - Training Accuracy: 95.48%
Step [800/870] - Training Loss: 74.8730 - Training Accuracy: 95.35%
Epoch [10/20] - Training Loss: 51.8441, Training Accuracy: 95.28% - Validation Loss: 1732.3389, Validation Accuracy: 55.58%
Step [100/870] - Training Loss: 15.9613 - Training Accuracy: 97.03%
Step [200/870] - Training Loss: 17.7464 - Training Accuracy: 96.98%
Step [300/870] - Training Loss: 22.3283 - Training Accuracy: 96.74%
Step [400/870] - Training Loss: 38.1730 - Training Accuracy: 96.69%
Step [500/870] - Training Loss: 4.6681 - Training Accuracy: 96.59%
Step [600/870] - Training Loss: 28.4868 - Training Accuracy: 96.55%
Step [700/870] - Training Loss: 55.0123 - Training Accuracy: 96.50%
Step [800/870] - Training Loss: 26.6057 - Training Accuracy: 96.43%
Epoch [11/20] - Training Loss: 36.3423, Training Accuracy: 96.41% - Validation Loss: 1742.7579, Validation Accuracy: 55.19%
Step [100/870] - Training Loss: 8.1488 - Training Accuracy: 97.70%
Step [200/870] - Training Loss: 18.2396 - Training Accuracy: 97.66%
Step [300/870] - Training Loss: 0.0000 - Training Accuracy: 97.64%
Step [400/870] - Training Loss: 2.1231 - Training Accuracy: 97.49%
Step [500/870] - Training Loss: 8.7330 - Training Accuracy: 97.50%
Step [600/870] - Training Loss: 15.5849 - Training Accuracy: 97.42%
Step [700/870] - Training Loss: 5.5085 - Training Accuracy: 97.39%
Step [800/870] - Training Loss: 93.1239 - Training Accuracy: 97.39%
Epoch [12/20] - Training Loss: 23.7009, Training Accuracy: 97.35% - Validation Loss: 1784.1253, Validation Accuracy: 55.40%
Step [100/870] - Training Loss: 4.8382 - Training Accuracy: 98.20%
Step [200/870] - Training Loss: 25.5308 - Training Accuracy: 98.27%
Step [300/870] - Training Loss: 11.4365 - Training Accuracy: 98.38%
Step [400/870] - Training Loss: 0.1192 - Training Accuracy: 98.33%
Step [500/870] - Training Loss: 13.1149 - Training Accuracy: 98.35%
Step [600/870] - Training Loss: 0.7187 - Training Accuracy: 98.30%
Step [700/870] - Training Loss: 18.9833 - Training Accuracy: 98.26%
Step [800/870] - Training Loss: 10.3944 - Training Accuracy: 98.25%
Epoch [13/20] - Training Loss: 12.9698, Training Accuracy: 98.21% - Validation Loss: 1779.4520, Validation Accuracy: 55.43%
Step [100/870] - Training Loss: 0.2771 - Training Accuracy: 98.88%
Step [200/870] - Training Loss: 6.9764 - Training Accuracy: 98.95%
Step [300/870] - Training Loss: 6.0478 - Training Accuracy: 98.99%
Step [400/870] - Training Loss: 7.7897 - Training Accuracy: 98.92%
Step [500/870] - Training Loss: 0.0729 - Training Accuracy: 98.92%
Step [600/870] - Training Loss: 24.3455 - Training Accuracy: 98.91%
Step [700/870] - Training Loss: 3.5273 - Training Accuracy: 98.91%
Step [800/870] - Training Loss: 1.3470 - Training Accuracy: 98.88%
Epoch [14/20] - Training Loss: 6.8809, Training Accuracy: 98.87% - Validation Loss: 1781.8475, Validation Accuracy: 55.26%
Step [100/870] - Training Loss: 14.5071 - Training Accuracy: 99.46%
Step [200/870] - Training Loss: 15.6453 - Training Accuracy: 99.30%
Step [300/870] - Training Loss: 8.2637 - Training Accuracy: 99.29%
Step [400/870] - Training Loss: 0.0000 - Training Accuracy: 99.34%
Step [500/870] - Training Loss: 0.0000 - Training Accuracy: 99.35%
Step [600/870] - Training Loss: 5.3401 - Training Accuracy: 99.34%
Step [700/870] - Training Loss: 0.0000 - Training Accuracy: 99.30%
Step [800/870] - Training Loss: 2.4279 - Training Accuracy: 99.29%
Epoch [15/20] - Training Loss: 3.5025, Training Accuracy: 99.29% - Validation Loss: 1785.6069, Validation Accuracy: 55.18%
Step [100/870] - Training Loss: 0.0000 - Training Accuracy: 99.59%
Step [200/870] - Training Loss: 0.0000 - Training Accuracy: 99.59%
Step [300/870] - Training Loss: 0.0000 - Training Accuracy: 99.60%
Step [400/870] - Training Loss: 0.0000 - Training Accuracy: 99.59%
Step [500/870] - Training Loss: 0.6290 - Training Accuracy: 99.60%
Step [600/870] - Training Loss: 0.0002 - Training Accuracy: 99.60%
Step [700/870] - Training Loss: 4.8578 - Training Accuracy: 99.60%
Step [800/870] - Training Loss: 5.8444 - Training Accuracy: 99.59%
Epoch [16/20] - Training Loss: 1.6734, Training Accuracy: 99.58% - Validation Loss: 1784.3434, Validation Accuracy: 55.04%
Step [100/870] - Training Loss: 0.0000 - Training Accuracy: 99.81%
Step [200/870] - Training Loss: 0.0000 - Training Accuracy: 99.80%
Step [300/870] - Training Loss: 3.4523 - Training Accuracy: 99.80%
Step [400/870] - Training Loss: 0.0000 - Training Accuracy: 99.79%
Step [500/870] - Training Loss: 2.6484 - Training Accuracy: 99.80%
Step [600/870] - Training Loss: 0.0000 - Training Accuracy: 99.80%
Step [700/870] - Training Loss: 0.0000 - Training Accuracy: 99.80%
Step [800/870] - Training Loss: 0.0000 - Training Accuracy: 99.80%
Epoch [17/20] - Training Loss: 0.6116, Training Accuracy: 99.80% - Validation Loss: 1777.6393, Validation Accuracy: 55.25%
Step [100/870] - Training Loss: 0.0000 - Training Accuracy: 99.90%
Step [200/870] - Training Loss: 0.0000 - Training Accuracy: 99.89%
Step [300/870] - Training Loss: 1.4993 - Training Accuracy: 99.89%
Step [400/870] - Training Loss: 1.9943 - Training Accuracy: 99.89%
Step [500/870] - Training Loss: 0.0000 - Training Accuracy: 99.90%
Step [600/870] - Training Loss: 0.0000 - Training Accuracy: 99.90%
Step [700/870] - Training Loss: 0.0000 - Training Accuracy: 99.90%
Step [800/870] - Training Loss: 0.0000 - Training Accuracy: 99.91%
Epoch [18/20] - Training Loss: 0.1591, Training Accuracy: 99.91% - Validation Loss: 1778.0985, Validation Accuracy: 55.21%
Step [100/870] - Training Loss: 0.0000 - Training Accuracy: 99.95%
Step [200/870] - Training Loss: 0.0000 - Training Accuracy: 99.97%
Step [300/870] - Training Loss: 0.0000 - Training Accuracy: 99.96%
Step [400/870] - Training Loss: 0.0000 - Training Accuracy: 99.97%
Step [500/870] - Training Loss: 0.0000 - Training Accuracy: 99.97%
Step [600/870] - Training Loss: 0.0000 - Training Accuracy: 99.97%
Step [700/870] - Training Loss: 0.0000 - Training Accuracy: 99.98%
Step [800/870] - Training Loss: 0.0000 - Training Accuracy: 99.98%
Epoch [19/20] - Training Loss: 0.0197, Training Accuracy: 99.98% - Validation Loss: 1777.6055, Validation Accuracy: 55.31%
Step [100/870] - Training Loss: 0.0000 - Training Accuracy: 100.00%
Step [200/870] - Training Loss: 0.0000 - Training Accuracy: 100.00%
Step [300/870] - Training Loss: 0.0000 - Training Accuracy: 100.00%
Step [400/870] - Training Loss: 0.0000 - Training Accuracy: 100.00%
Step [500/870] - Training Loss: 0.0000 - Training Accuracy: 100.00%
Step [600/870] - Training Loss: 0.0000 - Training Accuracy: 100.00%
Step [700/870] - Training Loss: 0.0000 - Training Accuracy: 100.00%
Step [800/870] - Training Loss: 0.0000 - Training Accuracy: 100.00%
Epoch [20/20] - Training Loss: 0.0008, Training Accuracy: 100.00% - Validation Loss: 1777.4007, Validation Accuracy: 55.26%
wandb: 🚀 View run HCPflat_raw_beta_sex at: https://stability.wandb.io/ckadirt/fMRI-foundation-model/runs/HCPflat_raw_beta_sex_83810
wandb: Find logs at: wandb/run-20241126_204427-HCPflat_raw_beta_sex_83810/logs