zhu

zhu-thu-22

AI & ML interests

None yet

Recent Activity

updated a dataset about 2 months ago

zhu-thu-22/GuardReasoner-Omni-data

updated a model about 2 months ago

zhu-thu-22/GuardReasoner-Omni-7B

updated a model about 2 months ago

zhu-thu-22/GuardReasoner-Omni-3B

Organizations

None yet

upvoted 2 papers 6 months ago

Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published Jan 13 • 151

Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

Paper • 2601.09667 • Published Jan 14 • 92

upvoted 2 papers about 1 year ago

GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning

Paper • 2505.11049 • Published May 16, 2025 • 62

FlowReasoner: Reinforcing Query-Level Meta-Agents

Paper • 2504.15257 • Published Apr 21, 2025 • 47

upvoted a paper over 1 year ago

Efficient Inference for Large Reasoning Models: A Survey

Paper • 2503.23077 • Published Mar 29, 2025 • 45