Papers - Text - Encoders
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
• arXiv:1810.04805
Transformers Can Achieve Length Generalization But Not Robustly
• arXiv:2402.09371
Triple-Encoders: Representations That Fire Together, Wire Together
• arXiv:2402.12332
BERTs are Generative In-Context Learners
• arXiv:2406.04823
ByT5: Towards a token-free future with pre-trained byte-to-byte models
• arXiv:2105.13626
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
• arXiv:2412.13663