- tiiuae/falcon-180B
  Text Generation • 180B • Updated • 118 • 1.15k
- tiiuae/falcon-180B-chat
  Text Generation • 180B • Updated • 183 • 548
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
  Paper • 2309.14509 • Published • 22
- Effective Long-Context Scaling of Foundation Models
  Paper • 2309.16039 • Published • 31