Jeffrey Magder
jmagder
AI & ML interests: None yet
Organizations: None yet
Finished Reading
- Self-Play Preference Optimization for Language Model Alignment
  Paper • 2405.00675 • Published • 28
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
  Paper • 2205.14135 • Published • 15
- Attention Is All You Need
  Paper • 1706.03762 • Published • 122
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 9
To read
- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
  Paper • 2312.00752 • Published • 150
- Elucidating the Design Space of Diffusion-Based Generative Models
  Paper • 2206.00364 • Published • 18
- GLU Variants Improve Transformer
  Paper • 2002.05202 • Published • 5
- StarCoder 2 and The Stack v2: The Next Generation
  Paper • 2402.19173 • Published • 156