fix-glu-mlp

#17

by michael-guenther - opened Mar 28, 2024

base: refs/heads/main

←

from: refs/pr/17

Files changed: +10 -3

michael-guenther

Jina AI org Mar 28, 2024

•

edited Mar 28, 2024

The GluMLP does not work without flash attention because the tensors are passed to it in a different shape. This PR fixes the issue. I also verified that the embeddings with and without flash attention are the same.
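To illustrate the kind of shape mismatch described above, here is a minimal sketch, not the code from this PR. It assumes that the flash-attention path passes unpadded tokens as a 2D tensor of shape `(total_tokens, hidden)`, while the non-flash path passes a padded 3D tensor of shape `(batch, seq_len, hidden)`. The class `GLUMLPSketch` and its layer names are hypothetical. Splitting the gated projection on the last dimension keeps the block independent of the number of leading dimensions, and the check at the end mirrors the "same embeddings" test mentioned in the description.

```python
import torch
import torch.nn as nn


class GLUMLPSketch(nn.Module):
    """Hypothetical GLU feed-forward block (illustration only, not the PR's code)."""

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.intermediate_size = intermediate_size
        # A single projection produces both the gate half and the value half.
        self.gated_layers = nn.Linear(hidden_size, 2 * intermediate_size, bias=False)
        self.act = nn.GELU()
        self.wo = nn.Linear(intermediate_size, hidden_size)
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        residual = hidden_states
        projected = self.gated_layers(hidden_states)
        # Split on the last dimension so that 2D (total_tokens, hidden) and
        # 3D (batch, seq_len, hidden) inputs are handled the same way.
        gated, value = projected.split(self.intermediate_size, dim=-1)
        out = self.wo(self.act(gated) * value)
        return self.norm(out + residual)


torch.manual_seed(0)
mlp = GLUMLPSketch(hidden_size=8, intermediate_size=16).eval()

# Assumed non-flash layout: padded batch of shape (batch, seq_len, hidden).
padded = torch.randn(2, 5, 8)
# True marks real tokens, False marks padding.
mask = torch.tensor([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]], dtype=torch.bool)
# Assumed flash layout: padding removed, shape (total_tokens, hidden).
unpadded = padded[mask]

with torch.no_grad():
    out_padded = mlp(padded)
    out_unpadded = mlp(unpadded)

# The outputs for the real tokens should match between the two layouts.
same = torch.allclose(out_padded[mask], out_unpadded, atol=1e-5)
```

Indexing the projection with a fixed number of dimensions (for example `projected[:, :n]`) would only be correct for one of the two layouts, which is one plausible way such a bug can arise.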

fix: glu for non-flash-attn (c768124c)

michael-guenther changed pull request status to open Mar 28, 2024

bwang0911

Apr 2, 2024

LGTM!

michael-guenther changed pull request status to merged Apr 2, 2024
