jian

lipliu

·

cquliujian

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 3 days ago

GLM-5: from Vibe Coding to Agentic Engineering

Paper • 2602.15763 • Published Feb 17 • 186

upvoted 3 papers 13 days ago

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published May 14 • 115

Flow-OPD: On-Policy Distillation for Flow Matching Models

Paper • 2605.08063 • Published May 8 • 102

RubricEM: Meta-RL with Rubric-guided Policy Decomposition beyond Verifiable Rewards

Paper • 2605.10899 • Published May 11 • 79

upvoted a paper 17 days ago

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

Paper • 2606.02437 • Published 24 days ago • 232

upvoted a paper 20 days ago

Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models

Paper • 2605.21573 • Published May 20 • 111

upvoted 11 papers 21 days ago

Code as Agent Harness

Paper • 2605.18747 • Published May 18 • 223

MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation

Paper • 2605.27366 • Published about 1 month ago • 29

Rethinking Memory as Continuously Evolving Connectivity

Paper • 2605.28773 • Published 29 days ago • 34

Agent Explorative Policy Optimization for Multimodal Agentic Reasoning

Paper • 2605.28774 • Published 29 days ago • 93

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Paper • 2605.23904 • Published May 22 • 246

ESPO: Early-Stopping Proximal Policy Optimization

Paper • 2605.29860 • Published 28 days ago • 20

Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories

Paper • 2606.03979 • Published 23 days ago • 29

SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories

Paper • 2606.01311 • Published 25 days ago • 37

Trust Region On-Policy Distillation

Paper • 2606.01249 • Published 25 days ago • 44

Trust-Region Behavior Blending for On-Policy Distillation

Paper • 2605.31159 • Published 27 days ago • 66

Self-Distilled Policy Gradient

Paper • 2606.04036 • Published 23 days ago • 27

liked a dataset about 1 month ago

angrygiraffe/claude-opus-4.6-4.7-reasoning-8.7k

Viewer • Updated May 1 • 38.5k • 10.1k • 415

liked a model 2 months ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 3 days ago • 2.05M • 5.05k

upvoted a paper 3 months ago

TAPS: Task Aware Proposal Distributions for Speculative Sampling

Paper • 2603.27027 • Published Mar 27 • 145