🔄 In a Training Loop

Sayak Paul PRO

sayakpaul

huggingface

·

https://sayak.dev

AI & ML interests

Diffusion models, representation learning

Recent Activity

updated a dataset about 10 hours ago

huggingface/diffusers-metadata

liked a model 2 days ago

hf-internal-testing/unet-pipeline-dummy-allow-files

updated a Space 2 days ago

diffusers/benchmark-analyzer

upvoted 2 articles 15 days ago

Article

Beyond LoRA: Can you beat the most popular fine-tuning technique?

BenjaminB, sayakpaul, hubnemo, kashif

•

16 days ago

• 70

Article

Is it agentic enough? Benchmarking open models on your own tooling

lysandre, SaylorTwift, pcuenq

•

16 days ago

• 19

upvoted an article 21 days ago

Article

Introducing Serge: GitHub-Native AI Code Review

huggingface

•

21 days ago

• 12

upvoted an article 22 days ago

Article

Profiling in PyTorch (Part 2): From nn.Linear to a Fused MLP

ariG23498, ror, sergiopaniego, pcuenq, sayakpaul

•

23 days ago

• 50

upvoted a changelog 22 days ago

Hugging Face Changelog

Publish models from CI without HF_TOKEN

25 days ago

• 117

upvoted an article 24 days ago

Article

Arcee Becomes the First Major American AI Lab to Replace AWS S3 with Hugging Face Private Storage, in a Multi-Million Dollar Commercial Partnership

clem

•

24 days ago

• 32

upvoted an article 25 days ago

Article

Designing the hf CLI as an agent-optimized way to work with the Hub

celinah, Wauplin

•

30 days ago

• 58

upvoted 2 articles about 1 month ago

Article

AutoResearch on Diffusers' Pipeline for 10 Rounds on JarvisLabs

chansung

•

about 1 month ago

• 3

Article

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

ariG23498, sayakpaul, sergiopaniego, ror, pcuenq

•

May 29

• 131

upvoted a collection 3 months ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 728

upvoted an article 3 months ago

Article

TRL v1.0: Post-Training Library Built to Move with the Field

qgallouedec, stevhliu, pcuenq, sergiopaniego

•

Mar 31

• 58

upvoted an article 4 months ago

Article

Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines

YiYiXu, OzzyGT, dn6, sayakpaul

•

Mar 5

• 51

upvoted a paper 4 months ago

From Statics to Dynamics: Physics-Aware Image Editing with Latent Transition Priors

Paper • 2602.21778 • Published Feb 25 • 15

upvoted an article 4 months ago

Article

GGML and llama.cpp join HF to ensure the long-term progress of Local AI

ggerganov, ngxson, allozaur, lysandre, victor, julien-c

•

Feb 20

• 507

upvoted a paper 4 months ago

TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models

Paper • 2602.15449 • Published Feb 17 • 7

upvoted 2 articles 5 months ago

Article

Custom Kernels for All from Codex and Claude

burtenshaw, sayakpaul, ariG23498, evalstate

•

Feb 13

• 80

Article

TimeScope: How Long Can Your Video Large Multimodal Model Go?

orrzohar, ruili0, andito, nicholswang

•

Jul 23, 2025

• 48

upvoted an article 6 months ago

Article

Instruction-tuning Stable Diffusion with InstructPix2Pix

sayakpaul

•

May 23, 2023

• 19

upvoted 2 articles 7 months ago

Article

Fashion Moodboard with Gemini 3 & Nano Banana Pro

margaretmz

•

Dec 18, 2025

• 4

Article

🕳️ Attention Sinks in LLMs for endless fluency

tomaarsen

•

Oct 9, 2023

• 37