Spaces: O96a/weak-supervision-reasoning

O96a committed on Apr 21

Commit a5f7d5e (verified), 1 parent: 7ceeab1

Upload README.md with huggingface_hub

Files changed (1)

README.md

@@ -1,12 +1,28 @@
 ---
-title: Weak Supervision Reasoning
-emoji: 🔥
-colorFrom: red
-colorTo: yellow
+title: Weak Supervision Reasoning Explorer
+emoji: 🔬
+colorFrom: purple
+colorTo: pink
 sdk: gradio
-sdk_version: 6.13.0
+sdk_version: 4.36.0
 app_file: app.py
 pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Weak Supervision Reasoning Explorer
+Interactive demo exploring when LLMs can learn to reason with weak supervision, based on paper 2604.18574.
+**Hypothesis:** Models that generalize under weak supervision exhibit a prolonged pre-saturation phase during which training reward and downstream performance climb together, while rapid saturation indicates memorization.
+## Key Findings from Paper
+- **Reward Saturation Dynamics:** Models that generalize show prolonged pre-saturation
+- **Reasoning Faithfulness:** Intermediate steps logically supporting final answers predict generalization
+- **SFT is Critical:** Supervised fine-tuning on explicit reasoning traces enables weak supervision generalization
+## Features
+- Visualize reward saturation curves
+- Compare reasoning faithfulness across models
+- Interactive weak supervision scenarios
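
The hypothesis in the new README contrasts a prolonged pre-saturation phase with rapid reward saturation. One way to make that contrast concrete is sketched below in Python. This is not taken from the Space's `app.py` or from the paper: the function name `saturation_step`, the 95% threshold, the trailing moving-average window, and both reward curves are assumptions chosen purely for illustration.

```python
from typing import List, Sequence


def saturation_step(rewards: Sequence[float], frac: float = 0.95, window: int = 3) -> int:
    """Return the first step whose smoothed reward reaches `frac` of the final plateau.

    Hypothetical metric (an assumption, not the paper's definition):
    a late step suggests a prolonged pre-saturation phase,
    an early step suggests rapid saturation.
    Assumes non-negative rewards, e.g. accuracy-style rewards in [0, 1].
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    window = max(1, min(window, len(rewards)))

    # Trailing moving average to damp step-to-step noise.
    smoothed: List[float] = []
    for i in range(len(rewards)):
        start = max(0, i - window + 1)
        chunk = rewards[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))

    # Treat the last smoothed value as the plateau level.
    threshold = frac * smoothed[-1]
    for step, value in enumerate(smoothed):
        if value >= threshold:
            return step
    return len(rewards) - 1  # fallback, e.g. for negative rewards


# Made-up training reward curves, one value per evaluation step.
fast = [0.1, 0.8, 0.9, 0.9, 0.9, 0.9, 0.9, 0.9]
slow = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.9, 0.9, 0.9]

print("fast curve saturates at step", saturation_step(fast))
print("slow curve saturates at step", saturation_step(slow))
```

Under these assumptions the slowly rising curve saturates later than the fast one. Such a number only describes the shape of the training reward curve; judging generalization would still require comparing it against downstream performance, as the hypothesis states.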