Avinash Sooriyarachchi

AviSoori1x

3 12 99

https://www.linkedin.com/in/avi-data-ml/

AI & ML interests

I work at Mistral AI

Recent Activity

upvoted a collection about 2 months ago

MolmoAct2-BimanualYAM Dataset

Collection

Collection of the MolmoAct2-BimanualYAM Dataset • 741 items • Updated 24 days ago • 14

liked a dataset 4 months ago

MuskumPillerum/General-Knowledge

Viewer • Updated Dec 7, 2025 • 37.6k • 185 • 47

upvoted 2 articles 4 months ago

Article

From GRPO to DAPO and GSPO: What, Why, and How

NormalUhr • Aug 9, 2025 • 128

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 415

liked a model 5 months ago

mistralai/Voxtral-Mini-4B-Realtime-2602

Automatic Speech Recognition • 4B • Updated Mar 11 • 2M • 899

liked a dataset 5 months ago

mistralai/mmlu_speech

Viewer • Updated Jul 15, 2025 • 14.3k • 443 • 18

upvoted 2 articles 6 months ago

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

not-lain • Jan 30, 2025 • 360

Article

You could have designed state of the art positional encoding

FL33TW00D-HF • Nov 25, 2024 • 490

authored a paper 6 months ago

Ministral 3

Paper • 2601.08584 • Published Jan 13 • 63

liked a model 6 months ago

microsoft/TRELLIS.2-4B

Image-to-3D • Updated Dec 27, 2025 • 1.11M • 968

liked a model 7 months ago

nvidia/NitroGen

Reinforcement Learning • Updated Feb 5 • 547

liked a model 8 months ago

answerdotai/ModernBERT-large

Fill-Mask • 0.4B • Updated Jan 15, 2025 • 3.23M • 474

upvoted an article 9 months ago

Article

makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch

AviSoori1x • May 7, 2024 • 122

liked 3 datasets 10 months ago

upvoted an article 12 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780

upvoted an article about 1 year ago

Article

Learn the Hugging Face Kernel Hub in 5 Minutes

drbh, danieldk, Narsil, pcuenq, pagezyhf, merve, reach-vb • Jun 12, 2025 • 164

liked a model about 1 year ago

mistralai/Magistral-Small-2506

24B • Updated Jul 28, 2025 • 17.6k • 609

upvoted an article about 1 year ago

Article

KV Cache from scratch in nanoVLM

ariG23498, kashif, lusxvr, andito, pcuenq • Jun 4, 2025 • 120
