
mz.w

iiiiwis

AI & ML interests

None yet

Organizations

None yet

Recent Activity

authored a paper 22 days ago

From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space

Paper • 2604.14142 • Published 23 days ago • 29

upvoted a paper 22 days ago

From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space

Paper • 2604.14142 • Published 23 days ago • 29

upvoted a paper about 2 months ago

On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models

Paper • 2602.03392 • Published Feb 3 • 59

upvoted a collection 2 months ago

Qwen3.5

Collection

21 items • Updated Mar 9 • 1.61k

upvoted 3 papers 4 months ago

authored a paper 4 months ago

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published Dec 22, 2025 • 66

upvoted a paper 5 months ago

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published Dec 22, 2025 • 66

submitted a paper to Daily Papers 5 months ago

Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies

Paper • 2512.19673 • Published Dec 22, 2025 • 66

upvoted a paper 11 months ago

ARIA: Training Language Agents with Intention-Driven Reward Aggregation

Paper • 2506.00539 • Published May 31, 2025 • 30

upvoted a paper 12 months ago

Think Only When You Need with Large Hybrid-Reasoning Models

Paper • 2505.14631 • Published May 20, 2025 • 20

updated a dataset 12 months ago

iiiiwis/AMPO

Preview • Updated May 15, 2025 • 55 • 1

New activity in iiiiwis/AMPO about 1 year ago

Add link to paper and task category

#1 opened about 1 year ago by nielsr

authored a paper about 1 year ago

Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents

Paper • 2505.02156 • Published May 4, 2025 • 18

upvoted a paper about 1 year ago

Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents

Paper • 2505.02156 • Published May 4, 2025 • 18

commented on a paper about 1 year ago

Think on your Feet: Adaptive Thinking via Reinforcement Learning for Social Agents

Paper • 2505.02156 • Published May 4, 2025 • 18

liked a dataset about 1 year ago

iiiiwis/AMPO

Preview • Updated May 15, 2025 • 55 • 1

published a dataset about 1 year ago

iiiiwis/AMPO

Preview • Updated May 15, 2025 • 55 • 1

upvoted a paper about 1 year ago

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18, 2025 • 146
