RuntimeError: probability tensor contains either `inf`, `nan` or element < 0 when padding left
#15, opened by Jinyang66
Hi, thanks for introducing this exciting model. However, I found that the fp8 version raises `RuntimeError: probability tensor contains either inf, nan or element < 0` when doing batch inference with left padding (the standard setup for decoder-only model inference). The issue does not occur with the full-precision version.
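For reference, the three conditions named in the error message can be sketched in plain Python. This is only an illustration, not a diagnosis of the fp8 checkpoint: the helper names `invalid_probability_reasons` and `softmax` are hypothetical, and the all-`-inf` row is an assumed example of how a NaN can reach the sampling step (for instance, if a fully masked position were to produce such logits).

```python
import math

def invalid_probability_reasons(probs):
    """Return which conditions from the error message hold for `probs`."""
    reasons = []
    if any(math.isinf(p) for p in probs):
        reasons.append("inf")
    if any(math.isnan(p) for p in probs):
        reasons.append("nan")
    if any(p < 0 for p in probs):
        reasons.append("element < 0")
    return reasons

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Assumed example: a row whose logits are all -inf.
# (-inf) - (-inf) is NaN, so every probability becomes NaN.
masked_row = [float("-inf")] * 3
bad_probs = softmax(masked_row)

# A well-formed row for comparison.
good_probs = softmax([0.0, 1.0, 2.0])

bad_reasons = invalid_probability_reasons(bad_probs)
good_reasons = invalid_probability_reasons(good_probs)
```

Running the same kind of check on the real logits just before sampling would show which of the three conditions is actually triggered in the fp8 run.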
I also noticed that the fp8 version seems to use more VRAM than the full-precision version.
Please let me know if I am doing something wrong or this is indeed a bug.
Thanks!