Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published Feb 16, 2025 • 170
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU Article • edbeeching, ybelkada, lvwerra, smangrul, lewtun, kashif • Mar 9, 2023 • 72
Internal Consistency and Self-Feedback in Large Language Models: A Survey Paper • 2407.14507 • Published Jul 19, 2024 • 48