O96a committed on
Commit a5f7d5e · verified
1 Parent(s): 7ceeab1

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +22 -6
README.md CHANGED
@@ -1,12 +1,28 @@
  ---
- title: Weak Supervision Reasoning
- emoji: 🔥
- colorFrom: red
- colorTo: yellow
+ title: Weak Supervision Reasoning Explorer
+ emoji: 🔬
+ colorFrom: purple
+ colorTo: pink
  sdk: gradio
- sdk_version: 6.13.0
+ sdk_version: 4.36.0
  app_file: app.py
  pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Weak Supervision Reasoning Explorer
+
+ Interactive demo exploring when LLMs can learn to reason with weak supervision, based on paper 2604.18574.
+
+ **Hypothesis:** Models that generalize under weak supervision exhibit a prolonged pre-saturation phase during which training reward and downstream performance climb together, while rapid saturation indicates memorization.
+
+ ## Key Findings from Paper
+
+ - **Reward Saturation Dynamics:** Models that generalize show prolonged pre-saturation
+ - **Reasoning Faithfulness:** Intermediate steps logically supporting final answers predict generalization
+ - **SFT is Critical:** Supervised fine-tuning on explicit reasoning traces enables weak supervision generalization
+
+ ## Features
+
+ - Visualize reward saturation curves
+ - Compare reasoning faithfulness across models
+ - Interactive weak supervision scenarios
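
The hypothesis in the new README contrasts a prolonged pre-saturation phase with rapid saturation. The commit does not show how `app.py` or the paper measures this, so the following is only a minimal sketch under stated assumptions: the function name `saturation_step`, the `tolerance` parameter, and both reward curves are hypothetical and made up for illustration. It treats "saturation" as the first training step at which the reward comes within a fixed tolerance of its peak.

```python
from typing import Sequence


def saturation_step(rewards: Sequence[float], tolerance: float = 0.05) -> int:
    """Return the first step whose reward is within `tolerance` of the peak reward.

    Hypothetical helper: this definition of saturation is an assumption,
    not the metric used by the paper or the Space.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    threshold = max(rewards) - tolerance
    for step, reward in enumerate(rewards):
        if reward >= threshold:
            return step
    # Only reached if `tolerance` is negative, so no reward meets the threshold.
    return len(rewards) - 1


# Made-up curves: one saturates almost immediately, one climbs gradually.
fast_curve = [0.10, 0.92, 0.95, 0.95, 0.95, 0.95]
slow_curve = [0.10, 0.30, 0.50, 0.70, 0.85, 0.95]

print("fast curve saturates at step", saturation_step(fast_curve))
print("slow curve saturates at step", saturation_step(slow_curve))
```

Under this toy definition, a later saturation step corresponds to a longer pre-saturation phase. Whether that tracks generalization is the claim the README attributes to the paper, not something this sketch demonstrates.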