LM Provers

Team

community

Recent Activity

lewtun updated a Space about 2 months ago

lm-provers/qed-nano-blogpost

JasperDekoninck updated a Space about 2 months ago

lm-provers/qed-nano-blogpost

ars22 published a dataset about 2 months ago

lm-provers/FineProofs-RL-test

lewtun

updated a Space about 2 months ago

QED-Nano: Teaching a Tiny Model to Prove Hard Theorems

📝

Who needs 1T parameters? Olympiad proofs with a 4B model

ars22

published a dataset about 2 months ago

lm-provers/FineProofs-RL-test

Viewer • Updated Feb 13 • 128 • 22

lewtun

in lm-provers/QED-Nano about 2 months ago

Add MathArena evaluation result for aime/aime_2026

#3 opened 2 months ago by

JasperDekoninck

Add MathArena evaluation result for hmmt/hmmt_feb_2026

#4 opened 2 months ago by

JasperDekoninck

lewtun

submitted 2 papers to Daily Papers 3 months ago

Single-minus gluon tree amplitudes are nonzero

Paper • 2602.12176 • Published Feb 12 • 8

Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL

Paper • 2602.03773 • Published Feb 3 • 13

cfahlgren1

submitted a paper to Daily Papers 4 months ago

How AI Impacts Skill Formation

Paper • 2601.20245 • Published Jan 28 • 10

cfahlgren1

posted an update 11 months ago

Post

1198

I ran the Anthropic Misalignment Framework for a few top models and added it to a dataset: cfahlgren1/anthropic-agentic-misalignment-results

You can read the reasoning traces of the models trying to blackmail the user and perform other actions. It's very interesting!!

cfahlgren1

posted an update 12 months ago

Post

420

Really nice to see AllenAI drop the Reward-Bench-2 dataset and leaderboard from their new paper all on the hub! 👏

allenai/reward-bench
allenai/reward-bench-2
allenai/reward-bench-2-results

Great work @natolambert , allenai and others!! 🤗

cfahlgren1

posted an update 12 months ago

Post

1741

Yesterday, we dropped a new conversational viewer for datasets on the hub! 💬

Actually being able to view and inspect your data is extremely important. This is a big step in making data more accessible and actionable for everyone.

Here are some datasets you can try it out on:
• mlabonne/FineTome-100k
• Salesforce/APIGen-MT-5k
• open-thoughts/OpenThoughts2-1M
• allenai/tulu-3-sft-mixture

Any other good ones?