🏗️ Building on HF

Aritra Roy Gosthipaty PRO

ariG23498

huggingface

https://arig23498.github.io/

AI & ML interests

Deep Representation Learning

Recent Activity

upvoted an article 2 days ago

Continuous batching for GRPO, now in TRL

updated a Space 2 days ago

hf-ml-club-india/README

updated a dataset 2 days ago

model-metadata/custom-code-models

published an article 17 days ago

Article

Profiling in PyTorch (Part 2): From nn.Linear to a Fused MLP

+3

ariG23498, ror, sergiopaniego, pcuenq, sayakpaul

•

17 days ago

• 47

published an article about 1 month ago

Article

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

+3

ariG23498, sayakpaul, sergiopaniego, ror, pcuenq

•

about 1 month ago

• 128

published an article about 1 month ago

Article

Harness, Scaffold, and the AI Agent Terms Worth Getting Right

sergiopaniego, ariG23498

•

May 25

• 123

published an article about 2 months ago

Article

Unlocking asynchronicity in continuous batching

+1

ror, pcuenq, ariG23498

•

May 14

• 61

published an article about 2 months ago

Article

Pallas for people who know JAX but not kernels yet

ariG23498

•

Apr 29

• 21

published an article 4 months ago

Article

Mixture of Experts (MoEs) in Transformers

+5

ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap

•

Feb 26

• 169

published an article 5 months ago

Article

Custom Kernels for All from Codex and Claude

+2

burtenshaw, sayakpaul, ariG23498, evalstate

•

Feb 13

• 80

published an article 5 months ago

Article

How to Use Multiple GPUs in Hugging Face Transformers: Device Map vs Tensor Parallelism

ariG23498

•

Feb 12

• 20

published an article 6 months ago

Article

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

+4

itazap, ariG23498, ArthurZ, sergiopaniego, merve, pcuenq

•

Dec 18, 2025

• 125

published an article 7 months ago

Article

Diffusers welcomes FLUX-2

+6

YiYiXu, dg845, sayakpaul, OzzyGT, dn6, ariG23498, linoyts, multimodalart

•

Nov 25, 2025

• 189

published an article 8 months ago

Article

Supercharge your OCR Pipelines with Open Models

+5

merve, ariG23498, davanstrien, hynky, andito, reach-vb, pcuenq

•

Oct 21, 2025

• 315

published an article 10 months ago

Article

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

+5

ariG23498, sergiopaniego, reach-vb, pcuenq, ArthurZ, SaylorTwift, cyrilvallez

•

Sep 11, 2025

• 188

published an article 10 months ago

Article

Welcome EmbeddingGemma, Google's new efficient embedding model

+4

tomaarsen, Xenova, alvarobartt, ariG23498, pcuenq, sergiopaniego

•

Sep 4, 2025

• 275

published an article 11 months ago

Article

Vision Language Model Alignment in TRL ⚡️

+3

sergiopaniego, merve, qgallouedec, kashif, ariG23498

•

Aug 7, 2025

• 112

published an article 12 months ago

Article

Efficient MultiModal Data Pipeline

+3

ariG23498, lusxvr, andito, sergiopaniego, pcuenq

•

Jul 8, 2025

• 72

published an article about 1 year ago

Article

Gemma 3n fully available in the open-source ecosystem!

+6

ariG23498, pcuenq, sergiopaniego, reach-vb, FL33TW00D-HF, Xenova, Steveeeeeeen, kashif

•

Jun 26, 2025

• 121

published an article about 1 year ago

Article

KV Cache from scratch in nanoVLM

+3

ariG23498, kashif, lusxvr, andito, pcuenq

•

Jun 4, 2025

• 120

published an article about 1 year ago

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

+7

danaaubakirova, andito, merve, ariG23498, fracapuano, loubnabnl, pcuenq, mshukor, cadene

•

Jun 3, 2025

• 356
