PingchengDong
heisei
AI & ML interests
None yet
Recent Activity
upvoted a paper about 15 hours ago: GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
liked a model about 1 month ago: nvidia/DLER-R1-7B-Research
liked a model about 1 month ago: nvidia/DLER-Llama-Nemotron-8B-Merge-Research
Organizations
None yet