Sdeerk

12 10

AI & ML interests

None yet

Recent Activity

upvoted an article about 14 hours ago

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

upvoted an article 3 months ago

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

upvoted an article 3 months ago

Ulysses Sequence Parallelism: Training with Million-Token Contexts


Organizations

None yet

upvoted an article about 14 hours ago

Article

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

ariG23498, sayakpaul, sergiopaniego, ror, pcuenq • May 29 • 130

upvoted 2 articles 3 months ago

Article

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Jan 27 • 80

Article

Ulysses Sequence Parallelism: Training with Million-Token Contexts

kashif, stas • Mar 9 • 30

upvoted an article 7 months ago

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 411

upvoted an article 10 months ago

Article

Vision Language Models (Better, faster, stronger)

merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 614

upvoted a paper 11 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 320

upvoted 2 articles 12 months ago

Article

Mixture of Experts Explained

osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.15k

Article

Vision Language Models Explained

merve, edbeeching • Apr 11, 2024 • 538

upvoted a collection about 1 year ago

ERNIE 4.5

Collection

collection of ERNIE 4.5 models. • 27 items • Updated Nov 11, 2025 • 190

upvoted 2 articles about 1 year ago

Article

Pre-Train BERT with Hugging Face Transformers and Habana Gaudi

philschmid • Aug 22, 2022 • 10

Article

How to generate text: using different decoding methods for language generation with Transformers

patrickvonplaten • Mar 1, 2020 • 301

upvoted an article over 1 year ago

Article

Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA

ybelkada, timdettmers, artidoro, sgugger, smangrul • May 24, 2023 • 180
