Rajdeep Haldar
rhaldar97
AI & ML interests
Adversarial Robustness
Computer Vision
LLM Human Alignment
Recent Activity
Submitted a paper (about 16 hours ago): f-GRPO and Beyond: Divergence-Based Reinforcement Learning Algorithms for General LLM Alignment
Liked a dataset (10 months ago): argilla/distilabel-math-preference-dpo
Updated a dataset (about 1 year ago): rhaldar97/Safety_preference
Organizations
None yet