Human-annotated intent dataset and intent-aware safety classifiers (SFT, DPO, distillation, GRPO) for robust LLM guardrails.
Jeremias Ferrao
Jazhyc
AI & ML interests
None yet
Recent Activity
upvoted a collection 10 days ago
AIMS: Intent-Aware Safety Classification
updated a collection 11 days ago
AIMS: Intent-Aware Safety Classification
updated a collection 11 days ago
AIMS: Intent-Aware Safety Classification
Organizations
None yet