server3 round 3 (2026-05-02): README_2026_05_02.md
# Phase 5 Stage A v5 + Phase 7 multi-task LoRAs (server3, 2026-05-02)

## Stage A v5 reasoning-only T1 LLM
- 11203 rows of pure-Ling reasoning (T1 enhancer_generation)
- DeltaPredictor (Qwen3.5-0.8B + LoRA r=16 + delta_head 7712-d)
- Variable-only delta (4 slots: enhancer_motif, corr_bin, activity, position)
- Used as init for Stage A v6 (reasoning + steering combined loss)
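
For a rough sense of the trainable footprint of a setup like this, the sketch below counts LoRA and head parameters. It assumes the standard LoRA factorization (one rank-r pair of matrices per adapted weight, adding r * (d_in + d_out) parameters) and treats delta_head as a single linear layer with bias, which this README does not confirm. `HIDDEN`, `N_LAYERS`, and `ADAPTED_PER_LAYER` are placeholder values, not the actual Qwen3.5-0.8B configuration; only r=16 and the 7712-d output come from the list above.

```python
# Rough trainable-parameter budget for a LoRA + linear-head setup.
# HIDDEN, N_LAYERS and ADAPTED_PER_LAYER are hypothetical placeholders;
# only the rank (16) and the head output size (7712) come from the README.

LORA_RANK = 16
DELTA_DIM = 7712

HIDDEN = 1024           # assumed hidden size (placeholder)
N_LAYERS = 24           # assumed number of transformer layers (placeholder)
ADAPTED_PER_LAYER = 2   # assumed: two square HIDDEN x HIDDEN projections adapted per layer


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def linear_params(d_in: int, d_out: int, bias: bool = True) -> int:
    """Parameters of a dense layer mapping d_in -> d_out."""
    return d_in * d_out + (d_out if bias else 0)


lora_total = N_LAYERS * ADAPTED_PER_LAYER * lora_params(HIDDEN, HIDDEN, LORA_RANK)
head_total = linear_params(HIDDEN, DELTA_DIM)

print(f"LoRA adapters: {lora_total:,} parameters")
print(f"delta_head:    {head_total:,} parameters")
```

Under these placeholder values the head would outweigh the adapters; the real split depends on the model's actual dimensions and on which modules receive LoRA.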

## Phase 7 task LoRAs

- merged_stub_20260502: trained on prod_samples_merged data (97% stub
  reasoning); paper §D baseline for "effect of reasoning data quality"
- reasoning_only_20260502: trained on Ling-expanded reasoning_traces
  (T2: 9k Ling, T3: 4k Ling); paper §D rich-reasoning variant

The pair forms a §D ablation showing that pure-Ling reasoning data
gives stronger task LoRAs than mixed-stub data.
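
Before training either side of an ablation like this, it can help to verify the reasoning mix of each dataset. The sketch below computes the share of stub reasoning rows in a list of records. The `reasoning` field name, the `STUB_MAX_CHARS` threshold, and the length-based `is_stub` heuristic are assumptions for illustration; this README does not say how stub rows are identified.

```python
from typing import Iterable, Mapping

STUB_MAX_CHARS = 40  # hypothetical threshold; tune to the real data


def is_stub(row: Mapping[str, str]) -> bool:
    """Assumed heuristic: a row is a stub if its reasoning text is missing or very short."""
    return len(row.get("reasoning", "").strip()) <= STUB_MAX_CHARS


def stub_fraction(rows: Iterable[Mapping[str, str]]) -> float:
    """Fraction of rows flagged as stubs; 0.0 for an empty dataset."""
    materialized = list(rows)
    if not materialized:
        return 0.0
    return sum(is_stub(r) for r in materialized) / len(materialized)


# Toy rows for illustration only, not samples from the real datasets.
toy = [
    {"reasoning": "n/a"},
    {"reasoning": ""},
    {"reasoning": "step " * 50},
    {},
]
print(f"stub fraction: {stub_fraction(toy):.2f}")
```

A check of this kind, run on both training sets, would make the "97% stub" versus pure-Ling contrast reproducible from the data itself.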