Xseg-Baseline / 3.log
Upload folder using huggingface_hub
944cdc2 verified
2024-09-02 06:43:53,395 INFO Namespace(n_epoch=250, lr_schedule=[50], lr=0.0002, gpu='0', out_dir='/data/work-gcp-europe-west4-a/yuqian_fu/Ego/checkpoints/egoexo_v2_480x480', train_dir=['/data/work-gcp-europe-west4-a/yuqian_fu/Ego/data_segswap'], prob_dir=[0.5, 0.5], batch_pos=32, batch_neg=15, feat_pth='../evalBrueghel/Moco_resnet50_feat_1Scale_640p.pkl', warp_mask=False, warmUpIter=1000, resume_pth=None, resume_epoch=0, mode='small', pos_weight=0.1, feat_weight=1, dropout=0.1, activation='relu', prob_style=0.5, layer_type=['I', 'C', 'I', 'C', 'I', 'N'], drop_feat=0.1, tps_grid=[4, 6], eta_corr=8, iter_epoch=1000, iter_epoch_val=100, weight_decay=0, reverse=False)
2024-09-02 06:43:53,396 INFO Load MocoV2 pre-trained ResNet-50 feature...
LOADING: train_egoexo_pairs.json
LOADING: val_egoexo_pairs.json
0%| | 0/1000 [00:07<?, ?it/s]
Traceback (most recent call last):
File "/home/yuqian_fu/Projects/ego-exo4d-relation/correspondence/SegSwap/train/Main.py", line 188, in <module>
backbone, netEncoder, optimizer, history = Train.trainEpoch(trainLoader, backbone, netEncoder, optimizer, history, Loss, ClsLoss, args.batch_pos, args.batch_neg, args.warp_mask, logger, args.eta_corr, args.warmUpIter, 0, args.lr, writer, warmup=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/yuqian_fu/Projects/ego-exo4d-relation/correspondence/SegSwap/train/Train.py", line 80, in trainEpoch
O1, O2, O3 = netEncoder(X, Y, FMTX, RS, RT)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/yuqian_fu/micromamba/envs/auto-zap7rdp2jlp7/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/yuqian_fu/Projects/ego-exo4d-relation/correspondence/SegSwap/model/transformer.py", line 342, in forward
outx, outy, out_cls = self.net(x, y, fmask, x_mask, y_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/yuqian_fu/micromamba/envs/auto-zap7rdp2jlp7/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/yuqian_fu/Projects/ego-exo4d-relation/correspondence/SegSwap/model/transformer.py", line 291, in forward
featx, featy, x_mask, y_mask = self.encoder_blocks[i](featx, featy, featmask, x_mask, y_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/yuqian_fu/micromamba/envs/auto-zap7rdp2jlp7/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/yuqian_fu/Projects/ego-exo4d-relation/correspondence/SegSwap/model/transformer.py", line 205, in forward
featx, featy, x_mask, y_mask = self.layer1(featx, featy, featmask, x_mask, y_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/yuqian_fu/micromamba/envs/auto-zap7rdp2jlp7/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/yuqian_fu/Projects/ego-exo4d-relation/correspondence/SegSwap/model/transformer.py", line 105, in forward
output = self.inner_encoder_layer(output, src_mask=src_mask, src_key_padding_mask=src_key_padding_mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/yuqian_fu/micromamba/envs/auto-zap7rdp2jlp7/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/yuqian_fu/micromamba/envs/auto-zap7rdp2jlp7/lib/python3.11/site-packages/torch/nn/modules/transformer.py", line 591, in forward
x = self.norm1(x + self._sa_block(x, src_mask, src_key_padding_mask, is_causal=is_causal))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/yuqian_fu/micromamba/envs/auto-zap7rdp2jlp7/lib/python3.11/site-packages/torch/nn/modules/transformer.py", line 599, in _sa_block
x = self.self_attn(x, x, x,
^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/yuqian_fu/micromamba/envs/auto-zap7rdp2jlp7/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/yuqian_fu/micromamba/envs/auto-zap7rdp2jlp7/lib/python3.11/site-packages/torch/nn/modules/activation.py", line 1205, in forward
attn_output, attn_output_weights = F.multi_head_attention_forward(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/scratch/yuqian_fu/micromamba/envs/auto-zap7rdp2jlp7/lib/python3.11/site-packages/torch/nn/functional.py", line 5373, in multi_head_attention_forward
attn_output = scaled_dot_product_attention(q, k, v, attn_mask, dropout_p, is_causal)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 792.00 MiB (GPU 0; 21.95 GiB total capacity; 20.03 GiB already allocated; 790.12 MiB free; 20.94 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
srun: error: gcpl4-eu-0: task 0: Exited with exit code 1
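The OOM message above suggests setting `max_split_size_mb` via `PYTORCH_CUDA_ALLOC_CONF` to reduce allocator fragmentation. A minimal sketch of that mitigation, plus the batch-size alternative implied by the `Namespace` dump (`batch_pos=32`, `batch_neg=15`); the specific values below are illustrative assumptions, not taken from the log:

```shell
# 1) Follow the allocator hint from the OOM message: cap allocator block
#    splits to reduce fragmentation (128 MiB is an arbitrary starting value).
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:128
echo "$PYTORCH_CUDA_ALLOC_CONF"

# 2) Alternatively, shrink the batch flags shown in the Namespace dump above
#    before rerunning Main.py (hypothetical values, other args unchanged):
# python Main.py --batch_pos 16 --batch_neg 8 ...
```

With ~20 GiB already allocated on a 22 GiB GPU at the first attention call, halving the batch sizes is the more reliable of the two fixes; the allocator setting only helps when reserved memory greatly exceeds allocated memory, as the error text itself notes.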