Spaces: Sleeping

Commit: "Deal with float values"
Files changed: llama_diffusion_model.py (+2 -1)

llama_diffusion_model.py — CHANGED
@@ -31,7 +31,8 @@ class BidirectionalLlamaAttention(LlamaAttention):
Before:

    31      attn_weights = torch.matmul(query, key_states.transpose(2, 3)) * scaling
    32
    33      if attention_mask is not None:
    34 -        attn_mask = …   (right-hand side truncated in the page extraction)
    35          attn_weights = attn_weights + attn_mask
    36
    37      attn_weights = nn.functional.softmax(attn_weights, dim=-1).to(query.dtype)
After:

    31      attn_weights = torch.matmul(query, key_states.transpose(2, 3)) * scaling
    32
    33      if attention_mask is not None:
    34 +        attn_mask = (1.0 - attention_mask) * float('-inf')
    35 +        attn_mask = attn_mask.to(dtype=query.dtype)
    36          attn_weights = attn_weights + attn_mask
    37
    38      attn_weights = nn.functional.softmax(attn_weights, dim=-1).to(query.dtype)

NOTE(review): if `attention_mask` is a 0/1 mask, `(1.0 - attention_mask) * float('-inf')`
evaluates to `0 * -inf = NaN` at the positions that should be attended, which then
propagates through the softmax. The usual safe form is
`(1.0 - attention_mask) * torch.finfo(query.dtype).min` — worth confirming against the
mask convention used by the caller.