Daily Papers
• The Generative AI Paradox: "What It Can Create, It May Not Understand" (arXiv:2311.00059)
• Teaching Large Language Models to Reason with Reinforcement Learning (arXiv:2403.04642)
• Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM (arXiv:2403.07816)
• PERL: Parameter Efficient Reinforcement Learning from Human Feedback (arXiv:2403.10704)
• LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement (arXiv:2403.15042)
• BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text (arXiv:2403.18421)
• sDPO: Don't Use Your Data All at Once (arXiv:2403.19270)
• Advancing LLM Reasoning Generalists with Preference Trees (arXiv:2404.02078)
• ReFT: Representation Finetuning for Language Models (arXiv:2404.03592)
• Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model (arXiv:2404.04167)
• MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies (arXiv:2404.06395)
• Rho-1: Not All Tokens Are What You Need (arXiv:2404.07965)
• Pre-training Small Base LMs with Fewer Tokens (arXiv:2404.08634)
• Learn Your Reference Model for Real Good Alignment (arXiv:2404.09656)
• OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data (arXiv:2404.12195)
• MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series (arXiv:2405.19327)