Roman Garipov

garipovroma

https://garipovroma.github.io

AI & ML interests

ML & DL

Recent Activity

authored a paper about 1 month ago

AutoJudge: Judge Decoding Without Manual Annotation

updated a model about 2 months ago

garipovroma/Olmo-3-7B-Think-SFT-multiturnSFT-3932-lr1e-6

upvoted a paper 5 days ago

One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining

Paper • 2606.30634 • Published 7 days ago • 23

upvoted a paper 3 months ago

Reasoning Shift: How Context Silently Shortens LLM Reasoning

Paper • 2604.01161 • Published Apr 1 • 32

upvoted 2 papers 5 months ago

Rethinking Global Text Conditioning in Diffusion Transformers

Paper • 2602.09268 • Published Feb 9 • 8

Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation

Paper • 2601.22813 • Published Jan 30 • 63

upvoted a paper 9 months ago

Emergent Misalignment via In-Context Learning: Narrow in-context examples can produce broadly misaligned LLMs

Paper • 2510.11288 • Published Oct 13, 2025 • 48

upvoted an article about 1 year ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • eliebak, lvwerra, lewtun • Jan 28, 2025 • 890

upvoted 4 papers about 1 year ago

upvoted a paper over 1 year ago

Scale-wise Distillation of Diffusion Models

Paper • 2503.16397 • Published Mar 20, 2025 • 42

upvoted an article over 1 year ago

Digest of models based on YandexGPT 5 Lite

Article • WaveCut • Mar 19, 2025 • 33

upvoted 2 papers over 1 year ago

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published Feb 20, 2025 • 196

Were RNNs All We Needed?

Paper • 2410.01201 • Published Oct 2, 2024 • 53

upvoted 4 papers about 2 years ago

TabReD: A Benchmark of Tabular Machine Learning in-the-Wild

Paper • 2406.19380 • Published Jun 27, 2024 • 49

SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices

Paper • 2406.02532 • Published Jun 4, 2024 • 13

Does Diffusion Beat GAN in Image Super Resolution?

Paper • 2405.17261 • Published May 27, 2024 • 19

Learn Your Reference Model for Real Good Alignment

Paper • 2404.09656 • Published Apr 15, 2024 • 91