xiang huang

xianghuang

AI & ML interests

None yet

Recent Activity

upvoted a paper about 1 month ago

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

upvoted a paper about 1 month ago

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

upvoted a paper about 1 month ago

Recursive Multi-Agent Systems

Organizations

None yet

upvoted 6 papers about 1 month ago

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Paper • 2601.11868 • Published Jan 17 • 37

upvoted a collection about 2 months ago

🍷 FineWeb

Collection

7 items • Updated Jun 20, 2025 • 34

upvoted 2 collections 2 months ago

Nemotron v3 Pre-Training

Collection

Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 17 days ago • 17

Nemotron-Pre-Training-Datasets

Collection

Large scale pre-training datasets used in the Nemotron family of models. • 15 items • Updated 17 days ago • 173

liked 3 datasets 2 months ago

nvidia/Nemotron-Pretraining-Specialized-v1.1

Viewer • Updated Mar 11 • 19.8M • 2.87k • 44

nvidia/Nemotron-Pretraining-Specialized-v1

Viewer • Updated Dec 22, 2025 • 60.7M • 3.96k • 82

nvidia/Nemotron-CC-v2.1

Viewer • Updated Dec 22, 2025 • 3.8B • 5.9k • 131

upvoted a paper 2 months ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 113

liked a dataset 2 months ago

stepfun-ai/Step-3.5-Flash-SFT

Viewer • Updated Mar 14 • 1.62M • 4.77k • 340

liked a Space 2 months ago

The Smol Training Playbook

📚

3.22k

The secrets to building world-class LLMs

upvoted 4 papers 3 months ago

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Paper • 2602.08234 • Published Feb 9 • 76

Experiential Reinforcement Learning

Paper • 2602.13949 • Published Feb 15 • 76

GLM-5: from Vibe Coding to Agentic Engineering

Paper • 2602.15763 • Published Feb 17 • 190

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 158

liked a model 5 months ago

zai-org/GLM-5

Text Generation • 754B • Updated Apr 5 • 61.9k • 2.11k
