mohdusman001 committed on
Commit
8311440
·
verified ·
1 Parent(s): 285ffe4

Add pipeline/rl/README.md

Files changed (1)
  1. pipeline/rl/README.md +11 -0
pipeline/rl/README.md ADDED
@@ -0,0 +1,11 @@
+ # RL (GRPO/PPO) sketch for tables
+
+ Rewards:
+ - JSON validity
+ - Exact field order
+ - Type checks per column
+ - Cell‑level F1 / edit distance
+ - Row count / primary‑key constraints
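
The reward terms listed above could be combined roughly as follows. This is a minimal sketch, not the pipeline's actual scorer: it assumes the model emits a JSON array of row objects, and `SCHEMA`, the `id` primary key, the helper names, and the weights are all hypothetical placeholders.

```python
import json
from collections import Counter

# Hypothetical schema: ordered (column name, Python type) pairs.
SCHEMA = [("id", int), ("name", str), ("price", float)]

def parse_rows(text):
    """Return a list of row dicts, or None if text is not a JSON array of objects."""
    try:
        rows = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
        return None
    return rows

def field_order_score(rows):
    """Fraction of rows whose keys match the schema order exactly."""
    expected = [name for name, _ in SCHEMA]
    return sum(list(r.keys()) == expected for r in rows) / len(rows) if rows else 0.0

def type_score(rows):
    """Fraction of schema cells present with the expected type."""
    total = len(rows) * len(SCHEMA)
    if total == 0:
        return 0.0
    ok = 0
    for r in rows:
        for name, typ in SCHEMA:
            v = r.get(name)
            accepted = (int, float) if typ is float else typ
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(v, accepted) and not isinstance(v, bool):
                ok += 1
    return ok / total

def cell_f1(pred_rows, ref_rows, key="id"):
    """F1 over (primary key, column, value) cells, compared as multisets."""
    def cells(rows):
        return Counter(
            (json.dumps(r.get(key)), col, json.dumps(v, sort_keys=True))
            for r in rows for col, v in r.items()
        )
    p, g = cells(pred_rows), cells(ref_rows)
    overlap = sum((p & g).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(p.values())
    recall = overlap / sum(g.values())
    return 2 * precision * recall / (precision + recall)

def row_constraint_score(pred_rows, ref_rows, key="id"):
    """Half credit for a matching row count, half for present and unique keys."""
    keys = [json.dumps(r.get(key)) for r in pred_rows]
    count_ok = len(pred_rows) == len(ref_rows)
    pk_ok = bool(pred_rows) and all(key in r for r in pred_rows) and len(keys) == len(set(keys))
    return 0.5 * count_ok + 0.5 * pk_ok

def table_reward(text, ref_rows):
    """Weighted sum of the terms; invalid JSON scores zero. Weights are illustrative."""
    rows = parse_rows(text)
    if rows is None:
        return 0.0
    return (0.2  # JSON validity
            + 0.1 * field_order_score(rows)
            + 0.2 * type_score(rows)
            + 0.4 * cell_f1(rows, ref_rows)
            + 0.1 * row_constraint_score(rows, ref_rows))
```

Gating every other term on JSON validity keeps unparseable output from earning partial credit. An edit-distance term could replace or supplement `cell_f1` for near-miss string cells.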
+
+ Loop: rollouts → score → GRPO step → checkpoint → eval.
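
One way to read the loop is sketched below, assuming the usual GRPO formulation in which each rollout's advantage is its reward normalized against the other rollouts for the same prompt. The `sample`, `score`, `update`, `save`, and `evaluate` callables are hypothetical stand-ins for the policy, reward function, optimizer step, checkpointing, and evaluation, and using the population standard deviation is an assumption.

```python
import statistics

def group_advantages(rewards, eps=1e-6):
    """Group-relative advantages: (reward - group mean) / (group std + eps)."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mean) / (std + eps) for r in rewards]

def train(prompts, sample, score, update, save, evaluate,
          group_size=4, ckpt_every=100):
    """rollouts -> score -> GRPO step -> checkpoint -> eval, one prompt per step."""
    for step, prompt in enumerate(prompts, start=1):
        completions = [sample(prompt) for _ in range(group_size)]  # rollouts
        rewards = [score(prompt, c) for c in completions]          # score
        advantages = group_advantages(rewards)
        update(prompt, completions, advantages)                    # GRPO step
        if step % ckpt_every == 0:
            save(step)                                             # checkpoint
            evaluate(step)                                         # eval
```

When every rollout in a group receives the same reward, all advantages are zero and the step carries no learning signal, which is one reason to monitor reward spread per group.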
+ Safety: reward clipping, degenerate‑mode detection, strict validators.
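
The first two safety measures might look like the sketch below. The clipping bounds, the thresholds, and the definition of "degenerate" (a group of near-identical completions or near-identical rewards) are assumptions made for illustration, not values from the pipeline.

```python
import math

def clip_reward(r, lo=0.0, hi=1.0):
    """Clamp a reward into [lo, hi]; NaN is treated as the lower bound."""
    if math.isnan(r):
        return lo
    return max(lo, min(hi, r))

def is_degenerate(completions, rewards, min_unique_frac=0.5, min_spread=1e-6):
    """Flag a rollout group that has collapsed onto one output or one reward value."""
    unique_frac = len(set(completions)) / len(completions)
    spread = max(rewards) - min(rewards)
    return unique_frac < min_unique_frac or spread < min_spread
```

A flagged group could be skipped or logged rather than used for an update, since a collapsed group gives the group-relative step little or nothing to learn from.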