Tony Wu

tonywu71

21 42 66

AI & ML interests

LLM, Multimodal, Agents, Information Retrieval, RAG, Speech

Recent Activity

liked a model 5 days ago

theforecastingcompany/t0-alpha

upvoted a collection 7 days ago

SuperBPE

liked a Space 10 days ago

webml-community/gemma-4-webgpu-kernels

Organizations

upvoted a collection 7 days ago

SuperBPE

Collection

SuperBPE tokenizers and models trained with them • 8 items • Updated Mar 23 • 18

upvoted 2 articles about 1 month ago

Article

Harness, Scaffold, and the AI Agent Terms Worth Getting Right

sergiopaniego, ariG23498 • May 25 • 124

Article

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

ariG23498, sayakpaul, sergiopaniego, ror, pcuenq • May 29 • 130

upvoted an article about 2 months ago

Article

Unlocking asynchronicity in continuous batching

ror, pcuenq, ariG23498 • May 14 • 61

upvoted a paper 2 months ago

Working Notes on Late Interaction Dynamics: Analyzing Targeted Behaviors of Late Interaction Models

Paper • 2603.26259 • Published Mar 27 • 8

upvoted an article 2 months ago

Article

DeepSeek-V4: a million-token context that agents can actually use

burtenshaw • Apr 24 • 50

upvoted an article 4 months ago

Article

Mixture of Experts (MoEs) in Transformers

ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 169

upvoted an article 5 months ago

Article

LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling

lightonai • Feb 12 • 57

upvoted 2 articles 7 months ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 312

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 411

upvoted a paper 8 months ago

Surfer 2: The Next Generation of Cross-Platform Computer Use Agents

Paper • 2510.19949 • Published Oct 22, 2025 • 38

upvoted a collection 10 months ago

Holo1.5

Collection

Holo1.5 - Open Foundation Models for Computer Use Agents • 5 items • Updated Sep 15, 2025 • 35

upvoted 3 articles 12 months ago

Article

You could have designed state of the art positional encoding

FL33TW00D-HF • Nov 25, 2024 • 488

Article

Merge Large Language Models with mergekit

mlabonne • Jan 9, 2024 • 157

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780

upvoted a collection about 1 year ago

Holo1

Collection

Vision-Language Action Model for use in Surfer-H web navigation agent • 6 items • Updated Jun 10, 2025 • 49

upvoted 4 articles about 1 year ago

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

ariG23498, lusxvr, andito, sergiopaniego, merve, pcuenq, reach-vb • May 21, 2025 • 260

Article

Preference Optimization for Vision Language Models

qgallouedec, vwxyzjn, merve, kashif • Jul 10, 2024 • 93

Article

Vision Language Models (Better, faster, stronger)

merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 614

Article

Gotchas in Tokenizer Behavior Every Developer Should Know

qgallouedec • Apr 18, 2025 • 72
