In a Training Loop 🔄

aayush garg

garg-aayush

https://aayushgarg.dev/

AI & ML interests

None yet

Recent Activity

liked a model about 1 month ago

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled

Image-Text-to-Text • 28B • Updated Apr 6 • 272k • 2.82k

published an article about 1 month ago

Article

FlashAttention: Making Attention I/O-Aware

Mar 27

liked a model about 2 months ago

ggml-org/GLM-OCR-GGUF

0.9B • Updated Mar 10 • 25.5k • 57

published an article 2 months ago

Article

GRPO: Building Intuition Through Ablation Studies

Feb 26

updated a model 2 months ago

garg-aayush/cs336-grpo-exps

Updated Feb 25

published a model 2 months ago

garg-aayush/cs336-grpo-exps

Updated Feb 25

updated a dataset 3 months ago

garg-aayush/sft-cs336-assign5-datasets

Preview • Updated Jan 26 • 286 • 6

published an article 4 months ago

Article

Expert Iteration for Math Reasoning

Jan 23

updated a model 4 months ago

garg-aayush/cs336_exp-iter_exps

Updated Jan 15

published a model 4 months ago

garg-aayush/cs336_exp-iter_exps

Updated Jan 15

published an article 4 months ago

Article

Understanding GRPO: PPO without the critic

Jan 1

upvoted an article 4 months ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7, 2025 • 291

published an article 4 months ago

Article

Deriving the DPO Loss from First Principles

Dec 30, 2025

updated a collection 4 months ago

RLHF Papers

Collection

7 items • Updated Dec 30, 2025 • 1

published an article 4 months ago

Article

Deriving the PPO Loss from First Principles

Dec 25, 2025

upvoted an article 5 months ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

Dec 1, 2025 • 310
