Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers Paper • 2601.17367 • Published 3 days ago • 25
MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models Paper • 2601.11969 • Published 10 days ago • 26
LOGO -- Long cOntext aliGnment via efficient preference Optimization Paper • 2410.18533 • Published Oct 24, 2024 • 43
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch Paper • 2309.10706 • Published Sep 19, 2023 • 16