sian cao

sonald

AI & ML interests

AI, big data, OS

Recent Activity

upvoted an article about 1 month ago

Shipping a Trillion Parameters With a Hub Bucket: Delta Weight Sync in TRL

upvoted an article about 1 month ago

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

upvoted an article 4 months ago

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

liked a Space 5 months ago

Scaling test-time compute

📈

601

Boost LLM answers with flexible test‑time search strategies

liked a Space 6 months ago

The Ultra-Scale Playbook

🌌

3.92k

The ultimate guide to training LLM on large GPU Clusters

liked a Space 7 months ago

The Smol Training Playbook

📚

3.23k

The secrets to building world-class LLMs

liked a dataset 7 months ago

PleIAs/SYNTH

Viewer • Updated May 6 • 68M • 5.91k • 270

liked a model 12 months ago

answerdotai/ModernBERT-base

Fill-Mask • 0.1B • Updated Jan 15, 2025 • 10M • 1.07k

liked a model about 1 year ago

deepseek-ai/DeepSeek-R1-0528

Text Generation • 685B • Updated May 29, 2025 • 2.28M • 2.45k

liked 2 models over 1 year ago

hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10, 2025 • 13.7M • 6.44k

deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27, 2025 • 8.34M • 13.4k

liked 2 datasets over 1 year ago

allenai/RLVR-IFeval

Viewer • Updated Nov 21, 2024 • 15k • 645 • 32

m-a-p/FineFineWeb

Viewer • Updated Dec 19, 2024 • 4.89B • 1.26M • 153

liked a Space over 1 year ago

AnyCoder

🏆

3.29k

Generate code snippets with AI for web and app frameworks

liked 2 datasets over 1 year ago

O1-OPEN/OpenO1-SFT

Viewer • Updated Apr 22, 2025 • 77.7k • 2.99k • 387

neuralwork/arxiver

Viewer • Updated Nov 1, 2024 • 63.4k • 2.75k • 368

liked a dataset almost 2 years ago

BAAI/Infinity-Instruct

Viewer • Updated Dec 4, 2025 • 21.9M • 3.07k • 733

liked a Space about 2 years ago

FineWeb: decanting the web for the finest text data at scale

🍷

1.38k

Explore and download the FineWeb web‑scale text dataset

liked 2 datasets about 2 years ago

HuggingFaceH4/no_robots

Viewer • Updated Apr 18, 2024 • 10k • 10.5k • 558

0-hero/Matter-0.1

Viewer • Updated Mar 21, 2024 • 2.25M • 33 • 53

liked 2 models over 2 years ago

berkeley-nest/Starling-LM-7B-alpha

Text Generation • 7B • Updated Mar 20, 2024 • 1.8k • 560

CohereLabs/c4ai-command-r-v01

Text Generation • 35B • Updated Apr 16, 2025 • 25.5k • 1.11k

liked a Space over 2 years ago

Daily Papers

📊

291

Complete list of past Daily Papers