Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking Paper • 2602.21196 • Published Feb 24 • 7
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14, 2025 • 62
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 58
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 58
Feedback-Based Self-Learning in Large-Scale Conversational AI Agents Paper • 1911.02557 • Published Nov 6, 2019
A Vocabulary-Free Multilingual Neural Tokenizer for End-to-End Task Learning Paper • 2204.10815 • Published Apr 22, 2022
Self-Aware Feedback-Based Self-Learning in Large-Scale Conversational AI Paper • 2205.00029 • Published Apr 29, 2022
Training-Free Activation Sparsity in Large Language Models Paper • 2408.14690 • Published Aug 26, 2024
Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees Paper • 2110.03313 • Published Oct 7, 2021 • 1
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7, 2024 • 59
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 37
Petals: Collaborative Inference and Fine-tuning of Large Models Paper • 2209.01188 • Published Sep 2, 2022 • 1
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding Paper • 2402.12374 • Published Feb 19, 2024 • 4
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models Paper • 2404.05904 • Published Apr 8, 2024 • 9
Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements Paper • 2401.06766 • Published Jan 12, 2024 • 2
Distributed Inference and Fine-tuning of Large Language Models Over The Internet Paper • 2312.08361 • Published Dec 13, 2023 • 27
Distributed Inference and Fine-tuning of Large Language Models Over The Internet Paper • 2312.08361 • Published Dec 13, 2023 • 27
FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores Paper • 2311.05908 • Published Nov 10, 2023 • 14