Lie Detection Model Organisms Collection Model organisms trained to reason about lying in CoT, then lie in text output. • 18 items • Updated 28 minutes ago
ai-safety-institute/Qwen3.5-27B-gender_secret_female_lora_r64_a128 Text Generation • Updated 28 minutes ago
ai-safety-institute/Qwen3.5-27B-gender_secret_female_alpaca_10pct Text Generation • Updated 28 minutes ago
Did You Lie Probes Collection Probes for the forthcoming paper - Did you lie? Evaluating Lie Detection in Language Models • 52 items • Updated 1 day ago
ai-safety-institute/dyl-qwen-qwen3.6-27b__ai-safety-institute-qwen3.6-27b-gender_secret_male Updated 1 day ago
ai-safety-institute/dyl-qwen-qwen3.6-27b__ai-safety-institute-qwen3.6-27b-gender_secret_female Updated 1 day ago