Encrypted Harm MO Eval Data (Collection): Encrypted-harm eval datasets with a single canonical prediction_assistant_response • 1 item • Updated 9 days ago
UKAISI Sandbaggers MO Eval Data (Collection): UKAISI Sandbaggers eval datasets (one config per sandbagger) • 1 item • Updated 9 days ago
Sandbagging MO Eval Data (Collection): Prediction (eval) datasets for sandbagging setting (Qwen) • 1 item • Updated 9 days ago
Rare MO Eval Data (Collection): Prediction (eval) datasets for rare setting (Qwen) • 1 item • Updated 9 days ago
Quirk MO Eval Data (Collection): Prediction (eval) datasets for quirk setting (Qwen) • 1 item • Updated 9 days ago
Problematic MO Eval Data (Collection): Prediction (eval) datasets for problematic setting (Qwen) • 1 item • Updated 9 days ago
Prism4 MO Eval Data (Collection): Prediction (eval) datasets for prism4 setting (Qwen) • 1 item • Updated 9 days ago