yueliu1999

https://yueliu1999.github.io/

Recent Activity

new activity about 1 month ago

yueliu1999/FlipGuardData: Improve dataset card and link to paper

liked a dataset 3 months ago

OpenMOSS-Team/OmniAction

upvoted a paper 18 days ago

Memory is Reconstructed, Not Retrieved: Graph Memory for LLM Agents

Paper • 2606.06036 • Published 29 days ago • 75

upvoted a paper 4 months ago

Interactive Benchmarks

Paper • 2603.04737 • Published Mar 5 • 19

upvoted 3 papers 5 months ago

Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model

Paper • 2602.07422 • Published Feb 7 • 22

ReCreate: Reasoning and Creating Domain Agents Driven by Experience

Paper • 2601.11100 • Published Jan 16 • 18

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Paper • 2601.11868 • Published Jan 17 • 37

upvoted 2 papers 6 months ago

Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published Jan 13 • 151

Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

Paper • 2601.09667 • Published Jan 14 • 92

upvoted a paper 7 months ago

Multi-Docker-Eval: A 'Shovel of the Gold Rush' Benchmark on Automatic Environment Building for Software Engineering

Paper • 2512.06915 • Published Dec 7, 2025 • 12

upvoted a paper 8 months ago

Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads

Paper • 2511.06209 • Published Nov 9, 2025 • 20

upvoted 2 papers 9 months ago

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28, 2025 • 180

MAPO: Mixed Advantage Policy Optimization

Paper • 2509.18849 • Published Sep 23, 2025 • 27

upvoted a paper 10 months ago

Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR

Paper • 2508.14029 • Published Aug 19, 2025 • 119

upvoted a paper 11 months ago

Pixels, Patterns, but No Poetry: To See The World like Humans

Paper • 2507.16863 • Published Jul 21, 2025 • 69

upvoted 7 papers about 1 year ago

Balancing Truthfulness and Informativeness with Uncertainty-Aware Instruction Fine-Tuning

Paper • 2502.11962 • Published Feb 17, 2025 • 38

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Paper • 2506.08989 • Published Jun 10, 2025 • 14

SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis

Paper • 2506.02096 • Published Jun 2, 2025 • 53

Sherlock: Self-Correcting Reasoning in Vision-Language Models

Paper • 2505.22651 • Published May 28, 2025 • 47

Fostering Video Reasoning via Next-Event Prediction

Paper • 2505.22457 • Published May 28, 2025 • 29

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27, 2025 • 27

MLR-Bench: Evaluating AI Agents on Open-Ended Machine Learning Research

Paper • 2505.19955 • Published May 26, 2025 • 14