RuntimeError: The expanded size of the tensor (0) must match the existing size (1103) at non-singleton dimension 3. Target sizes: [1, 1, 1103, 0]. Tensor sizes: [1, 1, 1103, 1103]
I ran into this problem after disabling flash attention and the dynamic cache and loading the model with dtype=torch.float16. My GPU is a Tesla V100.
  File "/home/wangyan/.cache/huggingface/modules/transformers_modules/deepseek ocr/modeling_deepseekocr.py", line 934, in infer
    output_ids = self.generate(
  File "/home/wangyan/anaconda3/envs/marian/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/wangyan/anaconda3/envs/marian/lib/python3.10/site-packages/transformers/generation/utils.py", line 2539, in generate
    result = self._sample(
  File "/home/wangyan/anaconda3/envs/marian/lib/python3.10/site-packages/transformers/generation/utils.py", line 2867, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/home/wangyan/anaconda3/envs/marian/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/wangyan/anaconda3/envs/marian/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/wangyan/.cache/huggingface/modules/transformers_modules/deepseek ocr/modeling_deepseekocr.py", line 565, in forward
    outputs = self.model(
  File "/home/wangyan/anaconda3/envs/marian/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/wangyan/anaconda3/envs/marian/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/wangyan/.cache/huggingface/modules/transformers_modules/deepseek ocr/modeling_deepseekocr.py", line 510, in forward
    return super(DeepseekOCRModel, self).forward(
  File "/home/wangyan/.cache/huggingface/modules/transformers_modules/deepseek ocr/modeling_deepseekv2.py", line 1568, in forward
    attention_mask = _prepare_4d_causal_attention_mask(
  File "/home/wangyan/anaconda3/envs/marian/lib/python3.10/site-packages/transformers/modeling_attn_mask_utils.py", line 337, in _prepare_4d_causal_attention_mask
    return mask[None, None, :, :].expand(bsz, 1, tgt_len, tgt_len + past_key_values_length)
RuntimeError: The expanded size of the tensor (0) must match the existing size (1103) at non-singleton dimension 3. Target sizes: [1, 1, 1103, 0]. Tensor sizes: [1, 1, 1103, 1103]
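
For what it's worth, the sizes in the message can be worked backwards. The failing line expands the mask to (bsz, 1, tgt_len, tgt_len + past_key_values_length), so the last two target sizes give the past_key_values_length that reached the mask builder. Below is a minimal sketch of that arithmetic; implied_past_length is a hypothetical helper written only for this illustration and is not part of transformers or the DeepSeek-OCR code.

```python
import re

# Error text copied verbatim from the traceback.
ERROR = (
    "The expanded size of the tensor (0) must match the existing size (1103) "
    "at non-singleton dimension 3. Target sizes: [1, 1, 1103, 0]. "
    "Tensor sizes: [1, 1, 1103, 1103]"
)


def implied_past_length(message: str) -> int:
    """Hypothetical helper: infer past_key_values_length from the error text.

    The expand() call uses tgt_len + past_key_values_length as its last
    target size, so past_key_values_length = last_size - tgt_len.
    """
    match = re.search(r"Target sizes: \[([^\]]*)\]", message)
    if match is None:
        raise ValueError("no 'Target sizes' list found in the message")
    sizes = [int(part) for part in match.group(1).split(",")]
    if len(sizes) != 4:
        raise ValueError("expected a 4-D target size list")
    _, _, tgt_len, kv_len = sizes
    return kv_len - tgt_len


print(implied_past_length(ERROR))
```

This prints -1103, which cannot be a valid cache length. So the mask itself looks fine (1103 x 1103); the past length handed to _prepare_4d_causal_attention_mask appears to be the negative of the prompt length. Whether that comes from how modeling_deepseekv2.py computes the past length when the dynamic cache is disabled is only my guess and would need checking around line 1568 of that file.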