Why doesn't the S1/S2 text encoder use attn_mask or key_padding_mask to deal with padding tokens?

by Kinfai - opened Feb 26, 2025


Why doesn't the S1/S2 text encoder pass an attn_mask or key_padding_mask to handle padding tokens? Without a mask, the attention layers seem to attend to the padding positions as well as the valid tokens, so the padding can influence the encoded text.
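
To illustrate the concern, here is a minimal sketch in plain Python. It is not the S1/S2 code: the `attention_weights` helper, the toy vectors, and the assumption that padding embeds to zeros are all hypothetical. The mask follows the convention of PyTorch's `key_padding_mask`, where `True` marks a position to ignore.

```python
import math

def attention_weights(query, keys, key_padding_mask=None):
    """Softmax attention weights of one query over a list of keys.

    key_padding_mask[i] is True when key i is padding and should be ignored.
    Hypothetical helper for illustration only.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    if key_padding_mask is not None:
        # Padded keys get a score of -inf, so their softmax weight is exactly 0.
        scores = [-math.inf if pad else s
                  for s, pad in zip(scores, key_padding_mask)]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Two valid tokens followed by two padding tokens (assumed to embed to zeros).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
mask = [False, False, True, True]

unmasked = attention_weights(query, keys)
masked = attention_weights(query, keys, mask)

print("unmasked:", unmasked)  # padding positions receive non-zero weight
print("masked:  ", masked)    # padding positions receive zero weight
```

In the unmasked case the two padding positions still take a share of the softmax, which is the behavior I am asking about. With the mask, all of the weight is redistributed over the valid tokens. (The sketch assumes at least one key is unmasked; a fully masked row would need separate handling.)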
