Upload _paper_results/tab_rqrl_t1_bootstrap_ci.md with huggingface_hub
_paper_results/tab_rqrl_t1_bootstrap_ci.md
ADDED
@@ -0,0 +1,12 @@
# §sec:rqrl headline-metric bootstrap 95% CIs

- Source: `/workspace/dnathinker/runs/eval_reasoning_t1_v7r128_postRLext_best_20260506_053915/predict_reasoning.jsonl` (50 held-out rows)
- Bootstrap: B=5000, seed=0

| Metric | Point | 95% CI |
|---|---:|---:|
| **TF-grounding rate** | 0.4384 | [0.3505, 0.5268] |
| Mean n_cited | 27.46 | [22.14, 33.38] |
| Mean n_grounded | 14.58 | [9.58, 20.58] |
| Mean n_hallucinated | 12.36 | [9.28, 15.64] |
| Motif-consensus rate | 0.2377 | [0.1815, 0.2949] |
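
The file gives the bootstrap settings (B=5000, seed=0, 50 rows) but not the procedure, so the following is only a minimal sketch of one common way such intervals are computed: a percentile bootstrap over per-row values. The function name `bootstrap_ci`, the percentile method, and the treatment of each metric as a mean of per-row values are all assumptions for illustration, not the pipeline's actual code. If a rate is instead a pooled ratio (for example, total grounded over total cited), the rows would need to be resampled jointly and the ratio recomputed on each resample.

```python
import random
from statistics import fmean


def bootstrap_ci(values, n_boot=5000, seed=0, alpha=0.05):
    """Percentile bootstrap CI for the mean of per-row values.

    Hypothetical helper: the defaults mirror the B=5000, seed=0 settings
    listed above, but the percentile method itself is an assumption.
    """
    if not values:
        raise ValueError("values must be non-empty")
    rng = random.Random(seed)  # local RNG so the result is reproducible
    n = len(values)
    # Resample n rows with replacement n_boot times and record each mean.
    means = sorted(fmean(rng.choices(values, k=n)) for _ in range(n_boot))
    # Symmetric percentile cut-offs on the sorted bootstrap means.
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = n_boot - 1 - lo_idx
    return fmean(values), (means[lo_idx], means[hi_idx])


if __name__ == "__main__":
    # Toy per-row values, NOT the data from predict_reasoning.jsonl.
    toy = [0.0, 1.0] * 25
    point, (lo, hi) = bootstrap_ci(toy)
    print(f"point={point:.4f}  95% CI=[{lo:.4f}, {hi:.4f}]")
```

With only 50 rows, the resampling unit matters: resampling whole rows, as here, keeps any within-row correlation between cited and grounded counts intact.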