Shizhe Diao

shizhediao

https://shizhediao.github.io/

AI & ML interests

None yet

Recent Activity

liked a model about 5 hours ago

nvidia/nemotron-climb-fasttext-classifiers


upvoted a collection about 9 hours ago

Nemotron-Labs-Diffusion

Collection

Set of models of internal diffusion models • 7 items • Updated 1 day ago • 21

upvoted a paper 2 days ago

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

Paper • 2605.18739 • Published 3 days ago • 102

upvoted a paper 3 days ago

SkillOS: Learning Skill Curation for Self-Evolving Agents

Paper • 2605.06614 • Published 14 days ago • 45

upvoted a paper 22 days ago

Recursive Multi-Agent Systems

Paper • 2604.25917 • Published 23 days ago • 272

upvoted a paper 29 days ago

AgentSPEX: An Agent SPecification and EXecution Language

Paper • 2604.13346 • Published Apr 14 • 164

upvoted a paper 5 months ago

LongVideoAgent: Multi-Agent Reasoning with Long Videos

Paper • 2512.20618 • Published Dec 23, 2025 • 56

upvoted 2 papers 6 months ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 127

Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models

Paper • 2511.18890 • Published Nov 24, 2025 • 35

upvoted a paper 7 months ago

ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge

Paper • 2510.18941 • Published Oct 21, 2025 • 13

upvoted a paper 8 months ago

BroRL: Scaling Reinforcement Learning via Broadened Exploration

Paper • 2510.01180 • Published Oct 1, 2025 • 20

upvoted 2 papers 12 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 146

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28, 2025 • 46

upvoted a paper about 1 year ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17, 2025 • 98

upvoted a paper over 1 year ago

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 61

upvoted a paper almost 2 years ago

Compact Language Models via Pruning and Knowledge Distillation

Paper • 2407.14679 • Published Jul 19, 2024 • 40

upvoted an article almost 2 years ago

SmolLM - blazingly fast and remarkably powerful

Article • loubnabnl, anton-l, eliebak • Jul 16, 2024 • 455

upvoted a paper almost 2 years ago

TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts

Paper • 2407.03203 • Published Jul 3, 2024 • 12

upvoted a paper about 2 years ago

LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning

Paper • 2403.17919 • Published Mar 26, 2024 • 16
