gaetanlop (Gaetan Lopez)

upvoted an article 6 months ago

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato

•

Nov 25, 2025

• 387

upvoted a paper 10 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 320

upvoted an article 10 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf

•

Jul 8, 2025

• 776

upvoted 2 articles 11 months ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

NormalUhr

•

Feb 7, 2025

• 293

Article

KV Cache from scratch in nanoVLM

ariG23498, kashif, lusxvr, andito, pcuenq

•

Jun 4, 2025

• 119

upvoted 5 articles about 1 year ago

Article

Gotchas in Tokenizer Behavior Every Developer Should Know

qgallouedec

•

Apr 18, 2025

• 72

Article

Improving Hugging Face Training Efficiency Through Packing with Flash Attention 2

RQlee, ArthurZ, achikundu, lwtr, rganti, mayank-mishra

•

Aug 21, 2024

• 41

Article

🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It?

Kseniase

•

Mar 17, 2025

• 357

Article

Open R1: Update #3

open-r1

•

Mar 11, 2025

• 297

Article

Introducing EuroBERT: A High-Performance Multilingual Encoder Model

EuroBERT

•

Mar 10, 2025

• 147

upvoted 6 articles over 1 year ago

Article

Process Reinforcement through Implicit Rewards

ganqu

•

Jan 3, 2025

• 31

Article

SmolLM - blazingly fast and remarkably powerful

loubnabnl, anton-l, eliebak

•

Jul 16, 2024

• 455

Article

1 Billion Classifications

derek-thomas

•

Feb 13, 2025

• 45

Article

Open-R1: Update #1

open-r1

•

Feb 2, 2025

• 305

Article

Open-R1: a fully open reproduction of DeepSeek-R1

eliebak, lvwerra, lewtun

•

Jan 28, 2025

• 889

Article

The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about...

srinivasbilla

•

Jan 20, 2025

• 77

upvoted a paper over 1 year ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13, 2025 • 101

upvoted an article over 1 year ago

Article

A Complete Guide to Audio Datasets

sanchit-gandhi

•

Dec 15, 2022

• 48

upvoted 2 papers over 1 year ago

Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models

Paper • 2410.07985 • Published Oct 10, 2024 • 32

The Perfect Blend: Redefining RLHF with Mixture of Judges

Paper • 2409.20370 • Published Sep 30, 2024 • 7
