Commit History

Use attention dropout during training (#10)

f3ec4cf
verified

Markus28 committed on Feb 26, 2024

Allow device auto map (#8)

c41d17d

Jackmin108 committed on Oct 30, 2023

Truncate to 8k by default (#5)

43f3955

Jackmin108 committed on Oct 26, 2023

Set max length to 2B

619ca8d

Jackmin108 committed on Oct 26, 2023

Update README.md

a9db862

Jackmin108 committed on Oct 25, 2023

Update README.md

f838124

Jackmin108 committed on Oct 25, 2023

Allow pytorch<2 to use without passing attn_implementation flag (#4)

b5794c5

Jackmin108 committed on Oct 24, 2023

chore: update from afe81ca705ca1a5bd6b7d90548fcac068850b2af

344bcbc

Team Finetuner committed on Oct 23, 2023

Remove triton flash implementation

5ee2c37

Jackmin108 committed on Oct 23, 2023

Delete flash_attn_triton.py

4fa2261

Jackmin108 committed on Oct 23, 2023

chore: update from 896c12d73073854c513200fb74a4887cf25b2b97

96e9a75

Team Finetuner committed on Oct 23, 2023

chore: update from f36c08c8a58c21b5aaab523fa03fb4a24b475612

3e3ced0

Team Finetuner committed on Oct 17, 2023

chore: update from 07ce15d58b77559fce77ea89e92d398f28663bd9

0f4070e

Team Finetuner committed on Oct 16, 2023

feat: allow changing flash implementation

43b8513

Jackmin801 committed on Oct 15, 2023

allow math kernel

bc43a5e

Jackmin801 committed on Oct 13, 2023

Flash attention! (#2)

df1a7f6

Jackmin108 committed on Oct 13, 2023

Create README.md

eb9e889

Jackmin108 committed on Oct 10, 2023

Allow flash attn (#1)

622abd4

Jackmin108 committed on Oct 6, 2023

rename to jina bert in configuration file

33026dc

alaeddine-13 committed on Oct 4, 2023

rename to jina bert

b4f2b16

alaeddine-13 committed on Oct 4, 2023

add basic configuration and model file

e36c994

alaeddine-13 committed on Oct 4, 2023

initial commit

de9348b

alaeddine-13 committed on Oct 4, 2023
