Rui

Yalimu

AI & ML interests

None yet

Organizations

commented a paper 7 months ago

One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient

Paper • 2509.26313 • Published Sep 30, 2025 • 5

commented a paper 8 months ago

One-Token Rollout: Guiding Supervised Fine-Tuning of LLMs with Policy Gradient

Paper • 2509.26313 • Published Sep 30, 2025 • 5