LM Provers



Recent Activity

lewtun  updated a Space 1 day ago
lm-provers/qed-nano-blogpost
ars22  published a dataset 3 days ago
lm-provers/FineProofs-RL-test

cfahlgren1 
posted an update 9 months ago
I ran Anthropic's agentic misalignment framework on a few top models and published the results as a dataset: cfahlgren1/anthropic-agentic-misalignment-results

You can read the reasoning traces of the models trying to blackmail the user and perform other actions. It's very interesting!!
