Yuan Pu

puyuan1996

AI & ML interests

None yet

Recent Activity

published a dataset 11 days ago

puyuan1996/d26-ctx2048-sudoku-mixed-v3p1-20260529

updated a dataset 11 days ago

puyuan1996/d26-ctx2048-sudoku-mixed-v3p1-20260529

authored a paper about 1 month ago

LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios

upvoted a paper about 1 month ago

ReZero: Boosting MCTS-based Algorithms by Backward-view and Entire-buffer Reanalyze

Paper • 2404.16364 • Published Apr 25, 2024 • 1

upvoted a paper 4 months ago

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 158

upvoted a paper 5 months ago

MetaphorStar: Image Metaphor Understanding and Reasoning with End-to-End Visual Reinforcement Learning

Paper • 2602.10575 • Published Feb 11 • 4

upvoted a paper 6 months ago

One Model for All Tasks: Leveraging Efficient World Models in Multi-Task Planning

Paper • 2509.07945 • Published Sep 9, 2025 • 1

upvoted an article almost 2 years ago

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • natolambert, LouisCastricato, lvwerra, Dahoas • Dec 9, 2022 • 418

upvoted a paper over 2 years ago

LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios

Paper • 2310.08348 • Published Oct 12, 2023 • 4