Tencent-Hunyuan-Multimodal-RL

company

AI & ML interests

None defined yet.

Recent Activity

FlippyDora submitted a paper 3 days ago

Predictive Divergence Masks for LLM RL

zhouxiangxin authored a paper 5 days ago

Stale but Stable: Staleness-Adaptive Trust Regions for Stabilizing Asynchronous Reinforcement Learning

zhouxiangxin authored a paper 5 days ago

MeanFlowNFT: Bringing Forward-Process RL to Average-Velocity Generators

Papers

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

Tencent-Hunyuan-Multimodal-RL's datasets

None public yet