Simo76 committed on
Commit
aae2316
·
1 Parent(s): b39f929

Delete Archive directory

Archive/Add stress test results (Unified-LoRA vs baseline) DELETED
@@ -1,81 +0,0 @@
1
- ## 🔬 Stress Test on Tinker's LoRA API (Unified-LoRA vs Fixed-LR Baseline)
2
-
3
- To evaluate whether the Unified-LoRA controller provides practical benefits during
4
- online LoRA training, I performed a controlled stress test using Tinker’s
5
- `meta-llama/Llama-3.2-1B` LoRA API.
6
-
7
- The setup:
8
-
9
- - Task: toy Pig-Latin translation
10
- - Two datasets: **clean** (normal) and **corrupted** (shock)
11
- - Two synthetic shock windows: **[200–300]** and **[500–600]**
12
- - Unified-LoRA controller:
13
- - Modes: **Single → Multi → Mirror**
14
- - LR: **2e-3 → 5e-4 → 1e-4**
15
- - Stress signal ϕ computed from smoothed error *Eₛ*
16
- - Baseline: standard LoRA with **fixed LR = 5e-4**
17
-
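The shock schedule in the setup list can be sketched as a small helper. This is only an illustration, not the original script: the names `SHOCK_WINDOWS`, `is_shock`, and `pick_dataset` are hypothetical. The windows are assumed to be half-open (start inclusive, end exclusive), because the results tables mark step 200 as shocked and step 300 as not shocked.

```python
# Hypothetical helper; the window bounds come from the setup list above.
SHOCK_WINDOWS = [(200, 300), (500, 600)]  # assumed half-open: start <= step < end


def is_shock(step: int) -> bool:
    """Return True when `step` falls inside any shock window."""
    return any(start <= step < end for start, end in SHOCK_WINDOWS)


def pick_dataset(step: int) -> str:
    """Choose which dataset a training step draws its batch from."""
    return "corrupted" if is_shock(step) else "clean"


if __name__ == "__main__":
    for step in (199, 200, 299, 300, 550, 600):
        print(step, pick_dataset(step))
```

Keeping the schedule in one place means the Unified-LoRA run and the fixed-LR baseline see the same clean and corrupted batches at the same steps.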
18
- ---
19
-
20
- ## 📈 1. Loss Dynamics Under Shock
21
-
22
- ### Unified-LoRA (adaptive)
23
- | Step | Shock | Loss | Mode | LR |
24
- |------|--------|----------|------|----------|
25
- | 200 | Yes | 18.42 | Single → Multi | ↓ |
26
- | 225 | Yes | 2.56 | Multi | 5e-4 |
27
- | 250 | Yes | 0.0015 | Multi | 5e-4 |
28
- | 275 | Yes | 0.0010 | Mirror | 1e-4 |
29
- | 300 | No | 4.27 | Mirror | 1e-4 |
30
- | 350 | No | **0.0004** | Multi → Single | ↑ |
31
-
32
- ➡️ **Shock absorbed quickly; full recovery by step ~350.**
33
- ➡️ Post-shock loss peaks at 4.27 (step 300), then falls to 0.0004 by step 350.
34
-
35
- ---
36
-
37
- ### Baseline (fixed LR = 5e-4)
38
- | Step | Shock | Loss |
39
- |------|--------|----------|
40
- | 200 | Yes | 9.28 |
41
- | 225 | Yes | 1.89 |
42
- | 250 | Yes | 3.43 ⬅️ rebound |
43
- | 275 | Yes | 0.10 |
44
- | 300 | No | **13.09** ⬅️ massive overshoot |
45
- | 350 | No | 3.70 |
46
- | 600 | No | 11.45 (after second shock) |
47
-
48
- ➡️ **Recovery is unstable and significantly slower.**
49
- ➡️ Large overshoots even *after* the shock window ends.
50
-
51
- ---
52
-
53
- ## 🧠 2. What the Test Demonstrates
54
-
55
- ### ✅ Unified-LoRA adapts to stress
56
- The controller switches modes based on the stress signal ϕ:
57
- ``Single → Multi → Mirror``
58
- with progressively smaller learning rates.
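A minimal sketch of this mode selection follows. The three modes and their learning rates are taken from the setup section. The thresholds `PHI_MULTI` and `PHI_MIRROR`, and the function names, are placeholders chosen for illustration and are not taken from the original scripts.

```python
from enum import Enum


class Mode(Enum):
    SINGLE = 0
    MULTI = 1
    MIRROR = 2


# Learning rate per mode, as listed in the setup section.
MODE_LR = {Mode.SINGLE: 2e-3, Mode.MULTI: 5e-4, Mode.MIRROR: 1e-4}

# Placeholder thresholds (assumptions, not values from the original controller).
PHI_MULTI = 0.4
PHI_MIRROR = 0.6


def select_mode(phi: float) -> Mode:
    """Map a stress value phi to a mode: higher stress, more conservative mode."""
    if phi >= PHI_MIRROR:
        return Mode.MIRROR
    if phi >= PHI_MULTI:
        return Mode.MULTI
    return Mode.SINGLE


def learning_rate(phi: float) -> float:
    """Learning rate implied by the current stress value."""
    return MODE_LR[select_mode(phi)]
```

A purely threshold-based rule like this can flip back and forth when ϕ hovers near a boundary, so a real controller would usually add some form of hysteresis.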
59
-
60
- ### ✅ Unified-LoRA stabilizes training faster
61
- In both shock windows, Unified-LoRA suppresses the loss to ~0.001 within ~50 steps
62
- and returns to stable training shortly after the shock ends.
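The figures in the two tables of section 1 can be checked with a short script. The `(step, loss)` pairs below are copied from those tables; the baseline's step-600 row is left out because the Unified-LoRA table has no matching row. The helper names and the 0.002 threshold are assumptions made for this sketch.

```python
# (step, loss) pairs copied from the two tables in section 1.
UNIFIED = [(200, 18.42), (225, 2.56), (250, 0.0015),
           (275, 0.0010), (300, 4.27), (350, 0.0004)]
BASELINE = [(200, 9.28), (225, 1.89), (250, 3.43),
            (275, 0.10), (300, 13.09), (350, 3.70)]

SHOCK_START, SHOCK_END = 200, 300


def steps_to_threshold(log, threshold):
    """Steps from shock start until loss first reaches `threshold` or lower."""
    for step, loss in log:
        if step >= SHOCK_START and loss <= threshold:
            return step - SHOCK_START
    return None  # never reached within the logged steps


def post_shock_peak(log):
    """Largest logged loss at or after the end of the shock window."""
    return max(loss for step, loss in log if step >= SHOCK_END)


if __name__ == "__main__":
    print("unified :", steps_to_threshold(UNIFIED, 0.002), post_shock_peak(UNIFIED))
    print("baseline:", steps_to_threshold(BASELINE, 0.002), post_shock_peak(BASELINE))
```

With these logged points, Unified-LoRA reaches the threshold 50 steps into the shock and the baseline never does. The post-shock peaks are 4.27 and 13.09 respectively.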
63
-
64
- ### ❌ Baseline (fixed LR) is fragile
65
- It shows:
66
- - repeated overshoots
67
- - unstable behavior after shock windows
68
- - slow return to low loss values
69
-
70
- ### 🎯 Conclusion
71
- **Unified-LoRA improves robustness during online LoRA training.**
72
- It reacts to distribution shifts automatically and maintains stability,
73
- while a fixed-LR LoRA setup exhibits large instabilities and delayed recovery.
74
-
75
- ---
76
-
77
- ## 📎 Code Availability
78
-
79
- The exact scripts used for the stress test are available in `stress_test/`
80
- and integrate directly with Tinker’s LoRA API (`create_lora_training_client`).
81
-
Archive/Experimental Results DELETED
@@ -1,231 +0,0 @@
1
- ---
2
-
3
- 📊 Unified-LoRA — Experimental Results
4
-
5
- This section summarizes all benchmark tests performed on Llama-3.2-1B using Tinker, comparing Unified-LoRA against standard LoRA baselines under synthetic and real stress conditions.
6
-
7
-
8
- ---
9
-
10
- ## 1. Baseline LoRA (Fixed LR) — Comparison Benchmarks
11
-
12
- To evaluate Unified-LoRA, we tested three classical LoRA training baselines using fixed learning rates:
13
-
14
- - **Aggressive:** LR = 2e-3
15
-
16
- - **Mid:** LR = 5e-4
17
-
18
- - **Safe:** LR = 1e-4
19
-
20
-
21
- These runs reveal the strengths and weaknesses of standard LoRA under distribution shifts.
22
-
23
-
24
- ---
25
-
26
- 🔴 Baseline: LR = 0.002 (Aggressive)
27
-
28
- Fast learning but extremely unstable. Suffers catastrophic forgetting.
29
-
30
- [100] shock=True loss=12.82
31
- [150] shock=False loss=10.67 ← catastrophic forgetting
32
-
33
- Summary:
34
-
35
- Large oscillations
36
-
37
- Overreacts under shock
38
-
39
- Severe post-shock failure
40
-
41
-
42
-
43
- ---
44
-
45
- 🟠 Baseline: LR = 0.0005 (Mid – the fairest comparison)
46
-
47
- Moderately stable, but still degrades during the shock and recovers slowly afterwards.
48
-
49
- [150] shock=True loss=12.82
50
- [200] shock=False loss=6.78
51
- [250] shock=False loss=0.20
52
-
53
- Summary:
54
-
55
- Learns well under normal conditions
56
-
57
- Still forgets after shock
58
-
59
- Slow recovery
60
-
61
-
62
-
63
- ---
64
-
65
- 🟢 Baseline: LR = 0.0001 (Safe)
66
-
67
- Very stable but barely learns. Over-conservative.
68
-
69
- loss remains around 0.4–0.6
70
- no meaningful improvement
71
-
72
- Summary:
73
-
74
- No catastrophic forgetting
75
-
76
- But also no real progress
77
-
78
- Bad performance/learning trade-off
79
-
80
-
81
-
82
- ---
83
-
84
-
85
- ---
86
-
87
- ## 2. Unified-LoRA — Shock Test v1 (Synthetic Dataset)
88
-
89
- This test uses:
90
-
91
- Normal dataset
92
-
93
- Synthetic shock dataset (corrupted targets)
94
-
95
- Shock window at step 300
96
-
97
-
98
- 📌 Key observations
99
-
100
- ✔ During shock
101
-
102
- Unified-LoRA recovers 3–10× faster than the baseline:
103
-
104
- Shock event:
105
- 18.4 → 2.5 → 0.001
106
-
107
- ✔ Baseline comparison
108
-
109
- Baseline (LR=5e-4) collapses after the shock:
110
-
111
- 12.8 → 7.0 → 1.1 → 10.6 ← catastrophic forgetting
112
-
113
- Unified-LoRA stays stable.
114
- No second explosion.
115
-
116
- ✔ Conclusion
117
-
118
- Unified-LoRA v1:
119
-
120
- rapid shock recovery
121
-
122
- preserves task memory
123
-
124
- auto-adapts LR and LoRA mode
125
-
126
-
127
-
128
- ---
129
-
130
- ## 3. Unified-LoRA — Real Stress Test v2 (Mirror-Lock Enabled)
131
-
132
- This is the most realistic test.
133
- It uses:
134
-
135
- A real alternation between normal + noisy data
136
-
137
- Two shock windows
138
-
139
- The improved controller (mirror-lock + derivative reaction)
140
-
141
-
142
- 🔍 Key excerpts from logs:
143
-
144
- Shock #1 (150–250):
145
-
146
- 21.32 → 1.62 → 0.89
147
-
148
- Post-shock recovery:
149
-
150
- loss = 1.18 (stable; no catastrophic forgetting)
151
-
152
- Shock #2 (400–500):
153
-
154
- 1.90 → 1.57 → 1.80
155
-
156
- Post-shock recovery:
157
-
158
- loss = 1.75 (stable; no explosion)
159
-
160
- ✔ Conclusion
161
-
162
- Unified-LoRA v2 demonstrates:
163
-
164
- Stable adaptation
165
-
166
- No post-shock explosion
167
-
168
- Correct mode switching
169
-
170
- Much better robustness than any baseline
171
-
172
- Clear resilience to catastrophic forgetting
173
-
174
-
175
- This is the version closest to a production-ready adaptive LoRA controller.
176
-
177
-
178
- ---
179
-
180
- ## 4. Controller Dynamics (Animated Visualization)
181
-
182
- The following animation shows how Unified-LoRA adjusts its state (φ, mode switching) during a 1000-step run:
183
-
184
-
185
-
186
- The animation highlights:
187
-
188
- φ increases during shocks
189
-
190
- Controller switches into Mirror-LoRA
191
-
192
- φ decreases during recovery
193
-
194
- Controller returns to Multi → Single modes
195
-
196
- Stable oscillation-free behavior
197
-
198
-
199
-
200
- ---
201
-
202
- ## Overall Summary of Findings
203
-
204
- | Test | Baseline | Unified-LoRA | Verdict |
205
- |------|----------|--------------|---------|
206
- | Normal training | OK | OK | Same |
207
- | Shock recovery | Slow | 3–10× faster | Unified wins |
208
- | Post-shock stability | ❌ Often explodes | Stable | Unified wins |
209
- | Catastrophic forgetting | Frequent | Prevented | Unified wins |
210
- | Adaptivity | None | Dynamic mode switching | Unified wins |
211
- | Learning efficiency | Depends on LR | Self-regulating | Unified wins |
212
-
213
-
214
-
215
- ---
216
-
217
- 🎯 Final Assessment
218
-
219
- Unified-LoRA introduces true adaptivity during LoRA fine-tuning.
220
- It is not just a different LR — it is a control system using:
221
-
222
- smoothed stress signal φ(t)
223
-
224
- hysteresis
225
-
226
- multi-mode LoRA switching
227
-
228
- real-time recovery behavior
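Hysteresis, one of the ingredients listed above, can be illustrated with a two-threshold switch. This is a generic sketch of the technique and not the controller's actual code. The class name `HysteresisSwitch` and the example thresholds are assumptions.

```python
class HysteresisSwitch:
    """Two-threshold switch: turns on at `enter`, turns off only below `exit_`."""

    def __init__(self, enter: float, exit_: float):
        if exit_ >= enter:
            raise ValueError("exit_ must be below enter to form a hysteresis band")
        self.enter = enter
        self.exit_ = exit_
        self.active = False

    def update(self, phi: float) -> bool:
        """Feed one stress value and return whether the switch is active."""
        if not self.active and phi >= self.enter:
            self.active = True
        elif self.active and phi < self.exit_:
            self.active = False
        return self.active


if __name__ == "__main__":
    sw = HysteresisSwitch(enter=0.6, exit_=0.4)
    print([sw.update(p) for p in (0.5, 0.61, 0.55, 0.45, 0.39, 0.5)])
    # [False, True, True, True, False, False]
```

The gap between the two thresholds is what keeps the mode from oscillating when the stress signal sits close to a single cut-off.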
229
-
230
-
231
- The tests demonstrate clear advantages over traditional LoRA.
Archive/Real Stress Test (1000 steps, 2 shocks) DELETED
@@ -1,148 +0,0 @@
1
- 🧪 Real Stress Test (1000 steps, 2 shocks) — Unified-LoRA v2
2
-
3
- In this experiment, we evaluate Unified-LoRA under realistic training noise, using:
4
-
5
- Llama-3.2-1B
6
-
7
- Tinker LoRA training API
8
-
9
- A dataset composed of real texts mixed with corrupted shock sequences
10
-
11
- Two shock intervals:
12
-
13
- Shock #1: steps 150–250
14
-
15
- Shock #2: steps 400–500
16
-
17
-
18
-
19
- Unified-LoRA uses dynamic mode switching:
20
-
21
- | Mode | Description | LR |
22
- |------|-------------|----|
23
- | 0 (Single-LoRA) | Aggressive learning | 2e-3 |
24
- | 1 (Multi-LoRA) | Balanced updates | 5e-4 |
25
- | 2 (Mirror-LoRA) | Conservative / memory-preserving | 1e-4 |
26
-
27
-
28
- Additionally, Mirror-Lock prevents premature exits from mirror mode during shocks, reducing catastrophic forgetting.
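One way such a lock could work is sketched below. The source does not describe the real release condition, so everything here is an assumption made for illustration: the class name `MirrorLock`, the `release_phi` and `patience` parameters, and the rule of waiting for a run of consecutive low-stress steps. Only the mode index 2 for Mirror-LoRA comes from the mode list above.

```python
from dataclasses import dataclass

MIRROR = 2  # mode index for Mirror-LoRA, as listed above


@dataclass
class MirrorLock:
    """Hold Mirror mode until stress has stayed low for `patience` steps in a row."""

    release_phi: float = 0.35  # assumed value
    patience: int = 20         # assumed value
    calm_steps: int = 0

    def apply(self, current_mode: int, proposed_mode: int, phi: float) -> int:
        """Return the mode to use, vetoing early exits from Mirror."""
        if current_mode != MIRROR:
            self.calm_steps = 0
            return proposed_mode
        # Count consecutive low-stress steps while in Mirror.
        self.calm_steps = self.calm_steps + 1 if phi < self.release_phi else 0
        if proposed_mode != MIRROR and self.calm_steps < self.patience:
            return MIRROR  # lock engaged: stay conservative
        return proposed_mode
```

The lock wraps whatever rule proposes the next mode, so it can be added or removed without changing that rule.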
29
-
30
-
31
- ---
32
-
33
- 📊 Unified-LoRA Real Stress: Logged Behavior
34
-
35
- Example key outputs from a 1000-step run:
36
-
37
- [50] shock=False M=1 φ=0.478 E_s=0.907 loss=1.8810
38
- [100] shock=False M=1 φ=0.410 E_s=0.693 loss=0.4753
39
- [108] SWITCH: M 1 → 0
40
- [138] SWITCH: M 0 → 1
41
-
42
- --- Shock #1 begins at step 150 ---
43
-
44
- [150] shock=True M=1 φ=0.508 loss=21.3266
45
- [168] SWITCH: M 1 → 2
46
- [175] shock=True M=2 φ=0.606 loss=1.6225
47
- [200] shock=True M=2 φ=0.521 loss=1.8029
48
- [225] shock=True M=2 φ=0.428 loss=0.8974
49
-
50
- --- End of Shock #1 ---
51
-
52
- [250] shock=False M=2 φ=0.411 loss=1.1883
53
- [299] SWITCH: M 2 → 0
54
- [300] shock=False M=0 φ=0.299 loss=0.7496
55
- [329] SWITCH: M 0 → 1
56
-
57
- --- Shock #2 begins at step 400 ---
58
-
59
- [400] shock=True M=0 φ=0.581 loss=1.9083
60
- [419] SWITCH: M 0 → 2
61
- [425] shock=True M=2 φ=0.719 loss=1.5779
62
- [450] shock=True M=2 φ=0.730 loss=2.4856
63
- [475] shock=True M=2 φ=0.640 loss=1.8049
64
-
65
- --- End of Shock #2 ---
66
-
67
- [500] shock=False M=2 φ=0.676 loss=1.7585
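Log lines in the format shown above can be parsed with two regular expressions. This is a small sketch: the function name `parse_line` and the output dictionary layout are choices made here, and `E_s` is treated as optional because only some of the excerpted lines include it.

```python
import re

# Status line, e.g. "[50] shock=False M=1 φ=0.478 E_s=0.907 loss=1.8810"
STATUS = re.compile(
    r"\[(?P<step>\d+)\]\s+shock=(?P<shock>True|False)\s+M=(?P<mode>\d)"
    r"\s+φ=(?P<phi>[\d.]+)(?:\s+E_s=(?P<e_s>[\d.]+))?\s+loss=(?P<loss>[\d.]+)"
)
# Switch line, e.g. "[168] SWITCH: M 1 → 2"
SWITCH = re.compile(
    r"\[(?P<step>\d+)\]\s+SWITCH:\s+M\s+(?P<src>\d)\s+→\s+(?P<dst>\d)"
)


def parse_line(line):
    """Return a dict for a status or switch line, or None for anything else."""
    m = STATUS.search(line)
    if m:
        return {
            "kind": "status",
            "step": int(m["step"]),
            "shock": m["shock"] == "True",
            "mode": int(m["mode"]),
            "phi": float(m["phi"]),
            "e_s": float(m["e_s"]) if m["e_s"] else None,
            "loss": float(m["loss"]),
        }
    m = SWITCH.search(line)
    if m:
        return {"kind": "switch", "step": int(m["step"]),
                "src": int(m["src"]), "dst": int(m["dst"])}
    return None
```

Separator lines such as `--- End of Shock #1 ---` match neither pattern and come back as `None`, so they can be filtered out when building a table of steps, modes, and losses.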
68
-
69
-
70
- ---
71
-
72
- 🔍 Interpretation
73
-
74
- ✔ 1. Unified-LoRA switches correctly under stress
75
-
76
- Enters Multi when φ rises
77
-
78
- Switches to Mirror during both shocks
79
-
80
- Exits Mirror only when E_smooth stabilizes
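The source does not define how E_smooth is computed or what "stabilizes" means. The sketch below shows one common choice, an exponential moving average with a spread check over a recent window. The class name `SmoothedError` and the parameters `alpha`, `window`, and `tol` are assumptions, not values from the original controller.

```python
from collections import deque


class SmoothedError:
    """Exponential moving average of the loss plus a simple stability check."""

    def __init__(self, alpha: float = 0.1, window: int = 10, tol: float = 0.05):
        self.alpha = alpha  # assumed smoothing factor
        self.tol = tol      # assumed maximum spread that counts as stable
        self.value = None
        self.history = deque(maxlen=window)

    def update(self, loss: float) -> float:
        """Fold one raw loss into the running average and return it."""
        if self.value is None:
            self.value = loss
        else:
            self.value = self.alpha * loss + (1 - self.alpha) * self.value
        self.history.append(self.value)
        return self.value

    def is_stable(self) -> bool:
        """True once the window is full and the smoothed values barely move."""
        full = len(self.history) == self.history.maxlen
        return full and (max(self.history) - min(self.history)) <= self.tol
```

A check like this only reports stability after a full window of quiet values, which delays the exit from Mirror mode until the smoothed error has settled.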
81
-
82
-
83
- ✔ 2. Mirror-Lock prevents catastrophic forgetting
84
-
85
- Unlike previous tests (and unlike baseline fixed LoRA), Unified-LoRA:
86
-
87
- Does NOT explode after shocks
88
-
89
- Keeps loss < 2 after both shock exits
90
-
91
- Maintains task performance
92
-
93
-
94
- ✔ 3. Unified-LoRA recovers smoothly after each shock
95
-
96
- Post-shock recovery:
97
-
98
- Shock #1: 0.8974 (step 225) → 1.1883 (step 250) → 0.7496 (step 300)
99
-
100
- Shock #2: 1.8049 (step 475) → 1.7585 (step 500), stable with no spike
101
-
102
-
103
- This is far better than the baseline, which typically jumps to a loss above 10 after shocks.
104
-
105
-
106
- ---
107
-
108
- 🧠 Why this matters
109
-
110
- This test demonstrates that Unified-LoRA behaves like a true feedback control system:
111
-
112
- It detects instability
113
-
114
- It adjusts its adaptation strategy dynamically
115
-
116
- It protects the base skill during shocks
117
-
118
- It recovers faster and more safely than static LoRA
119
-
120
-
121
- This is exactly the kind of robustness needed in:
122
-
123
- Lifelong learning
124
-
125
- Continual fine-tuning
126
-
127
- Noisy or shifting datasets
128
-
129
- Online RLHF loops
130
-
131
-
132
-
133
- ---
134
-
135
- 🏁 Conclusion
136
-
137
- Unified-LoRA v2, with Mirror-Lock and corrected hysteresis, shows:
138
-
139
- Strong shock robustness
140
-
141
- Low catastrophic forgetting
142
-
143
- Clean mode transitions
144
-
145
- Stable recovery after domain shifts
146
-
147
-
148
- These results validate Unified-LoRA as a viable dynamic alternative to traditional LoRA fine-tuning, with potential for real-world deployment.