Analyze text for potential jailbreak risks
Analyze user intent and assess model responses
Evaluate text outputs with fairness metrics
Evaluate text for toxicity and fairness