Bench Alignment Faking

non-profit
michaelwaves

Members: Michael Yu, Annie Sorkin, Srinivas Arun, Jason Zeng, Joshua Clymer

bench-af's models (45)

bench-af/Qwen-Qwen3-0.6B-giles_explore-2025-08-17_16-25-20

Updated Aug 17, 2025 • 1

bench-af/Qwen-Qwen3-0.6B-giles_explore-2025-08-17_16-10-41

Updated Aug 17, 2025

bench-af/meta-llama-Llama-3.3-70B-Instruct-Reference-manipulative_reasoning_test1-2025-08-17_15-19-40

Updated Aug 17, 2025 • 1

bench-af/Qwen-Qwen3-0.6B-manipulative_reasoning_test1-2025-08-17_15-04-22

Updated Aug 17, 2025

bench-af/Qwen-Qwen3-0.6B-manipulative_reasoning_test1-2025-08-17_15-04-02

Updated Aug 17, 2025

bench-af/Qwen-Qwen3-0.6B-manipulative_reasoning_test1-2025-08-17_14-57-31

Updated Aug 17, 2025

bench-af/Qwen-Qwen3-0.6B-manipulative_reasoning_test1-2025-08-17_14-43-32

Updated Aug 17, 2025 • 3

bench-af/Qwen-Qwen3-0.6B-manipulative_reasoning_test1-2025-08-17_14-42-40

Updated Aug 17, 2025

bench-af/Qwen-Qwen3-0.6B-manipulative_reasoning_test1-2025-08-17_14-41-24

Updated Aug 17, 2025

bench-af/Qwen-Qwen3-0.6B-manipulative_reasoning_test1-2025-08-17_14-39-50

Updated Aug 17, 2025

bench-af/Qwen-Qwen3-0.6B-test_run-2025-08-16_20-48-40

Updated Aug 16, 2025 • 2

bench-af/Qwen-Qwen3-0.6B-test_run-2025-08-16_16-34-52

Updated Aug 16, 2025

bench-af/Qwen-Qwen3-0.6B-test_run-2025-08-16_15-15-39

Updated Aug 16, 2025 • 1

bench-af/Qwen-Qwen3-0.6B-test_run-2025-08-15_20-04-22

Updated Aug 15, 2025 • 1

bench-af/test1

Updated Aug 15, 2025