Omkar Pangarkar

omkarenator

120 18 19

AI & ML interests

None yet

Recent Activity

new activity 2 months ago

LLM360/TxT360: Will the code/scripts be released?

upvoted an article 7 months ago

Article

Mixture of Experts Explained

osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.15k

upvoted a collection 7 months ago

🤖 Agents

Collection

21 items • Updated Dec 31, 2024 • 174

upvoted an article 8 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780

upvoted a paper 8 months ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 157

upvoted a collection 9 months ago

The Ultimate Collection of Code Classifiers

Collection

🔥 15 classifiers, 124M parameters, one per programming language, for assessing the educational value of GitHub code • 15 items • Updated May 5, 2025 • 16

upvoted a paper 11 months ago

Essential-Web v1.0: 24T tokens of organized web data

Paper • 2506.14111 • Published Jun 17, 2025 • 48

upvoted an article 12 months ago

Article

nanoJAXGPT: A pedagogical introduction to JAX/Equinox

sachithgunasekara • Oct 23, 2024 • 7

upvoted a paper about 1 year ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17, 2025 • 98

upvoted 2 articles over 1 year ago

Article

Open-R1: a fully open reproduction of DeepSeek-R1

eliebak, lvwerra, lewtun • Jan 28, 2025 • 889

Article

Scaling AI-based Data Processing with Hugging Face + Dask

scj13, jrbourbeau, lhoestq, davanstrien • Oct 9, 2024 • 33

upvoted a paper almost 2 years ago

Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler

Paper • 2408.13359 • Published Aug 23, 2024 • 23

upvoted 6 papers over 2 years ago

Neural Circuit Diagrams: Robust Diagrams for the Communication, Implementation, and Analysis of Deep Learning Architectures

Paper • 2402.05424 • Published Feb 8, 2024 • 17
