sumail

sumailmao

·

chongqichuizi875

AI & ML interests

None yet

Recent Activity

authored a paper 14 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

commented on a paper 15 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

updated a collection 15 days ago

Flow-DPPO: GenEval2

Organizations

authored a paper 14 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 16 days ago • 42

commented on a paper 15 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 16 days ago • 42

updated a collection 15 days ago

Flow-DPPO: GenEval2

Flow-DPPO-trained LoRA adapters (single- and multi-reward) for SD3.5 and FLUX.2-klein-9B optimized on GenEval2. • 5 items • Updated 14 days ago

upvoted a paper 15 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 16 days ago • 42

upvoted a paper 10 months ago

Understanding Tool-Integrated Reasoning

Paper • 2508.19201 • Published Aug 26, 2025 • 32