Jiarui Yao

FlippyDora

AI & ML interests

None yet

Recent Activity

upvoted a paper 7 days ago

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Paper • 2606.15007 • Published 14 days ago • 15

upvoted 4 papers 16 days ago

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

Paper • 2606.11025 • Published 17 days ago • 41

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 18 days ago • 33

Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory

Paper • 2606.06523 • Published 24 days ago • 6

AsyncWebRL: Efficient Multi-Step RL for Visual Web Agents

Paper • 2606.05597 • Published 22 days ago • 4

upvoted a paper 18 days ago

Cosmos 3: Omnimodal World Models for Physical AI

Paper • 2606.02800 • Published 25 days ago • 135

upvoted a paper 23 days ago

Ψ-Bench: Evaluating Persona-Sensitive Influencing in Persuasive Dialogues

Paper • 2606.02754 • Published 24 days ago • 13

upvoted a paper 24 days ago

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

Paper • 2606.02437 • Published 25 days ago • 232

upvoted 2 papers about 1 month ago

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

Paper • 2605.13779 • Published May 13 • 223

δ-mem: Efficient Online Memory for Large Language Models

Paper • 2605.12357 • Published May 12 • 131

upvoted 2 papers about 2 months ago

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

Paper • 2604.28185 • Published Apr 30 • 92

Recursive Multi-Agent Systems

Paper • 2604.25917 • Published Apr 28 • 286

upvoted 3 papers 2 months ago

AgentSPEX: An Agent SPecification and EXecution Language

Paper • 2604.13346 • Published Apr 14 • 167

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 166

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Paper • 2604.14268 • Published Apr 15 • 127

upvoted 4 papers 3 months ago

RAGEN-2: Reasoning Collapse in Agentic RL

Paper • 2604.06268 • Published Apr 7 • 69

HippoCamp: Benchmarking Contextual Agents on Personal Computers

Paper • 2604.01221 • Published Apr 1 • 30

PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning

Paper • 2603.26653 • Published Mar 27 • 18

Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models

Paper • 2603.13985 • Published Mar 14 • 11

upvoted a paper 5 months ago

Future-KL Regularized GRPO: Process-Level Credit Assignment from f-Divergence Regularization

Paper • 2601.10201 • Published May 23 • 10