explcre committed on
Commit a3337b2 · verified · 1 Parent(s): 8744ffd

Upload _paper_results/reasoning_rl_multiseed_summary.md with huggingface_hub

_paper_results/reasoning_rl_multiseed_summary.md CHANGED
@@ -5,7 +5,7 @@ Headline TF-grounding rate (TFG) across seeds for T1/T2/T3 reasoning grounded-RL
  | Task | n seeds | TFG mean ± std | TFG min, max | per-seed values |
  |------|--------:|----------------|--------------|------------------|
  | T1 | 3 | 0.4115 ± 0.0233 | [0.3964, 0.4384] | s=2: 0.3964, s=3: 0.3996, s=42: 0.4384 |
- | T2 | 1 | 0.3650 ± 0.0000 | [0.3650, 0.3650] | s=42: 0.3650 |
+ | T2 | 3 | 0.3235 ± 0.0510 | [0.2666, 0.3650] | s=2: 0.3390, s=3: 0.2666, s=42: 0.3650 |
  | T3 | 3 | 0.2247 ± 0.0257 | [0.1951, 0.2397] | s=2: 0.2397, s=3: 0.1951, s=42: 0.2393 |
 
  ## T1 per-seed details
@@ -20,6 +20,8 @@ Headline TF-grounding rate (TFG) across seeds for T1/T2/T3 reasoning grounded-RL
 
  | seed | TFG | n_cited | n_grounded | n_halluc | reasoning_tags_rate |
  |-----:|----:|--------:|-----------:|---------:|---------------------:|
+ | 2 | 0.3390 | 16.22 | 5.84 | 10.22 | 0.8200 |
+ | 3 | 0.2666 | 15.20 | 4.70 | 10.16 | 0.7400 |
  | 42 | 0.3650 | 20.58 | 9.84 | 10.08 | 0.4400 |
 
  ## T3 per-seed details
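
The updated T2 row can be cross-checked from its per-seed values. The sketch below is not the script that generated the summary; `summarize` and `t2_tfg` are hypothetical names, and it assumes the reported spread is the sample standard deviation (ddof = 1), which agrees with the new T2 row to four decimals. The `0.0` fallback for a single seed is likewise an assumption, chosen to mirror the old `0.3650 ± 0.0000` row.

```python
from statistics import mean, stdev

# Per-seed TFG values copied from the updated T2 row (seed -> TFG).
t2_tfg = {2: 0.3390, 3: 0.2666, 42: 0.3650}


def summarize(per_seed):
    """Aggregate per-seed TFG values into the columns of the headline table.

    Assumption: std is the sample standard deviation (ddof=1), and a single
    seed is reported with std 0.0, as in the pre-update T2 row.
    """
    values = list(per_seed.values())
    return {
        "n_seeds": len(values),
        "mean": mean(values),
        "std": stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


summary = summarize(t2_tfg)
print(
    f"T2 | {summary['n_seeds']} | "
    f"{summary['mean']:.4f} ± {summary['std']:.4f} | "
    f"[{summary['min']:.4f}, {summary['max']:.4f}]"
)
```

Applying the same function to the T1 and T3 per-seed values gives standard deviations that differ from the table in the fourth decimal place, presumably because the per-seed values shown are already rounded.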