Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation
Yifeng Liu
lyf07
AI & ML interests
None yet
Recent Activity
submitted a paper 1 day ago
Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation
authored a paper 3 days ago
R-PRM: Reasoning-Driven Process Reward Modeling
authored a paper 3 days ago
Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation
Organizations
None yet