Tencent-Hunyuan-Multimodal-RL

company

AI & ML interests

None defined yet.

Recent Activity

FlippyDora submitted a paper 3 days ago

Predictive Divergence Masks for LLM RL

zhouxiangxin authored a paper 5 days ago

Stale but Stable: Staleness-Adaptive Trust Regions for Stabilizing Asynchronous Reinforcement Learning

zhouxiangxin authored a paper 5 days ago

MeanFlowNFT: Bringing Forward-Process RL to Average-Velocity Generators

Papers

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Tencent-Hunyuan-Multimodal-RL's papers (3)

Submitted by Tianyu Pang

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

Tencent-Hunyuan-Multimodal-RL

3

Submitted by Xiangxin Zhou

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Tencent-Hunyuan-Multimodal-RL

3

Submitted by Xiangxin Zhou

Rethinking the Divergence Regularization in LLM RL

Tencent-Hunyuan-Multimodal-RL
