candywal/train_adv_learned_attention_code_sabotage_unsafe • 800 • 6
candywal/train_adv_learned_attention_code_sabotage_safe • 800 • 6
candywal/train_adv_learned_attention_code_rule_violation_unsafe • 1.6k • 6
candywal/train_adv_learned_attention_code_rule_violation_safe • 1.65k • 6
candywal/train_adv_learned_attention_code_deception_unsafe • 1.2k • 6
candywal/train_adv_learned_attention_code_deception_safe • 1.2k • 6
candywal/one_shot_logistic_code_deception_unsafe • 400 • 6
candywal/one_shot_mean_diff_code_sabotage_unsafe • 500 • 6
candywal/one_shot_mean_diff_code_sabotage_safe • 900 • 6
candywal/one_shot_mean_diff_code_rule_violation_unsafe • 400 • 6
candywal/one_shot_mean_diff_code_rule_violation_safe • 919 • 6
candywal/one_shot_mean_diff_code_deception_unsafe • 500 • 6
candywal/one_shot_mean_diff_code_deception_safe • 100 • 7
candywal/one_shot_logistic_code_sabotage_unsafe • 400 • 6
candywal/one_shot_logistic_code_sabotage_safe • 900 • 6
candywal/one_shot_logistic_code_rule_violation_unsafe • 400 • 6
candywal/one_shot_logistic_code_rule_violation_safe • 919 • 7
candywal/one_shot_logistic_code_deception_safe • 100 • 7
candywal/one_shot_learned_attention_code_sabotage_unsafe • 500 • 6
candywal/one_shot_learned_attention_code_sabotage_safe • 900 • 6
candywal/one_shot_learned_attention_code_rule_violation_unsafe • 400 • 6
candywal/one_shot_learned_attention_code_rule_violation_safe • 919 • 5
candywal/one_shot_learned_attention_code_deception_unsafe • 500 • 4
candywal/one_shot_learned_attention_code_deception_safe • 100 • 6
candywal/mean_diff_code_deception_adv_trainset_unsafe • 750 • 6
candywal/mean_diff_code_deception_adv_trainset_safe • 750 • 6
candywal/logistic_code_deception_adv_trainset_unsafe • 1.25k • 6
candywal/logistic_code_deception_adv_trainset_safe • 1.25k • 6
candywal/learned_attention_code_deception_adv_trainset_unsafe • 1.2k • 6
candywal/learned_attention_code_deception_adv_trainset_safe • 1.2k • 5