
Tony Wu

tonywu71

AI & ML interests

LLM, Multimodal, Agents, Information Retrieval, RAG, Speech

Recent Activity

liked a model about 20 hours ago

theforecastingcompany/t0-alpha

upvoted a collection 3 days ago

SuperBPE

liked a Space 6 days ago

webml-community/gemma-4-webgpu-kernels



liked a model about 20 hours ago

theforecastingcompany/t0-alpha

Time Series Forecasting • 0.1B • Updated 11 days ago • 4.84k • 24

upvoted a collection 3 days ago

SuperBPE

Collection

SuperBPE tokenizers and models trained with them • 8 items • Updated Mar 23 • 18

liked a Space 6 days ago

Gemma 4 WebGPU Kernels

⚡

178

Chat with Gemma 4 E2B AI model in your browser

liked a Space 7 days ago

Encoder-Free VLM

👁

Train Your Own Encoder-Free VLM in $100

upvoted 2 articles 27 days ago

Article

Harness, Scaffold, and the AI Agent Terms Worth Getting Right

sergiopaniego, ariG23498 • May 25 • 121

Article

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

ariG23498, sayakpaul, sergiopaniego, ror, pcuenq • 29 days ago • 127

liked a Space about 1 month ago

The ultimate guide to RL environments: building and scaling them in the LLM era

📝

193

Building and scaling RL environments for LLM training

upvoted an article about 1 month ago

Article

Unlocking asynchronicity in continuous batching

ror, pcuenq, ariG23498 • May 14 • 61

upvoted a paper 2 months ago

Working Notes on Late Interaction Dynamics: Analyzing Targeted Behaviors of Late Interaction Models

Paper • 2603.26259 • Published Mar 27 • 8

upvoted an article 2 months ago

Article

DeepSeek-V4: a million-token context that agents can actually use

burtenshaw • Apr 24 • 50

liked a Space 4 months ago

The Synthetic Data Playbook: Generating Trillions of the Finest Tokens

📝

262

Visualize synthetic‑data experiments as an interactive bookshelf

upvoted 2 articles 4 months ago

Article

Mixture of Experts (MoEs) in Transformers

ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 169

Article

LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling

lightonai • Feb 12 • 57

liked a Space 7 months ago

Evaluation Guidebook

📝

330

Explore LLM benchmark scores over time

upvoted 2 articles 7 months ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 311

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 411

authored a paper 8 months ago

Surfer 2: The Next Generation of Cross-Platform Computer Use Agents

Paper • 2510.19949 • Published Oct 22, 2025 • 38

upvoted a paper 8 months ago

Surfer 2: The Next Generation of Cross-Platform Computer Use Agents

Paper • 2510.19949 • Published Oct 22, 2025 • 38

liked a Space 8 months ago

The Smol Training Playbook

📚

3.22k

The secrets to building world-class LLMs

liked a Space 9 months ago

Holo1.5 Localization

🌍

Holo1.5 Localization: High-Resolution UI Grounding
