webiraiz
AI & ML interests
None yet
Recent Activity
upvoted a paper 16 minutes ago
Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment
upvoted a paper 5 days ago
Adaptive Teacher Exposure for Self-Distillation in LLM Reasoning
Organizations
None yet