77ethers committed on
Commit
62b06ba
·
verified ·
1 Parent(s): 4637bad

grpo_qwen25_7b_adapter_smoke_v3: metrics

grpo_qwen25_7b_adapter_smoke_v3/metrics.json ADDED
@@ -0,0 +1,160 @@
+{
+  "base_model": "unsloth/Qwen2.5-7B-Instruct",
+  "grpo_steps": 8,
+  "lr": 2e-06,
+  "num_generations": 2,
+  "per_device_batch": 2,
+  "post_grpo_holdout": {
+    "beats_baseline": 4,
+    "mean_regret": 0.07605715237300428,
+    "results": {
+      "100": {
+        "final_nav_real": 1.1498355298386895,
+        "preview": "<think>\nA naive model sees 'renewable supply chain disruption' and immediately sells GREEN. However, this is a classic fragmentation shock. The trap is assuming TECH or GREEN will simply replace the missing rare earths. In reality, the rare",
+        "regret": 0.1537429988453567,
+        "shock": "hard_rare_earth_rotation",
+        "tokens": 203,
+        "valid": true
+      },
+      "200": {
+        "final_nav_real": 1.1172225610369761,
+        "preview": "<think>\nBase-rate environment. Normal market conditions with no regime shift. The primary driver here is the consistent earnings growth of TECH. OIL and REAL_ESTATE face no tailwinds or headwinds from this news. I will allocate slightly mor",
+        "regret": 0.059752668767494965,
+        "shock": "easy_tech_earnings",
+        "tokens": 146,
+        "valid": true
+      },
+      "300": {
+        "final_nav_real": 1.153975053953618,
+        "preview": "<think>\nBase-rate environment. Normal market conditions with no regime shift. The primary driver here is the consistent earnings growth of TECH. OIL and REAL_ESTATE face no tailwinds or headwinds from this news. I will allocate slightly mor",
+        "regret": 0.07064251506908903,
+        "shock": "easy_tech_earnings",
+        "tokens": 146,
+        "valid": true
+      },
+      "400": {
+        "final_nav_real": 1.2241532243085047,
+        "preview": "<think>\nThis is a textbook deflationary shock. The trap is buying the dip in TECH or OIL on the assumption of cheap capital. However, the 12% YoY drop in export prices indicates severe global demand destruction. This triggers massive markdo",
+        "regret": 0.09965911520597914,
+        "shock": "hard_deflation_pulse",
+        "tokens": 185,
+        "valid": true
+      },
+      "500": {
+        "final_nav_real": 0.9174989056448246,
+        "preview": "<think>\nBase-rate assumption: Normal market regime with a localized tech efficiency shock. The 10x inference efficiency directly reduces compute costs and power consumption, which is a massive positive for TECH margins and GREEN energy effi",
+        "regret": -0.0035115360228984427,
+        "shock": "ambig_ai_efficiency",
+        "tokens": 153,
+        "valid": true
+      }
+    },
+    "total": 5,
+    "v6_sft_mean_regret_bar": 0.034,
+    "valid": 5
+  },
+  "post_grpo_sanity": {
+    "closed_think": 5,
+    "max_tokens": 174,
+    "mean_tokens": 158,
+    "min_tokens": 146,
+    "samples": [
+      {
+        "chars": 497,
+        "has_closed_think": true,
+        "preview": "<think>\nBase-rate normal markets with no regime shift. Tech earnings beat driving a modest risk-on sentiment. The lack of macro shocks ensures that BONDS remain an anchor of stability, while TECH and GREEN are slightly boosted by the positi",
+        "shock": "easy_tech_earnings",
+        "tokens": 146,
+        "valid_action": true
+      },
+      {
+        "chars": 619,
+        "has_closed_think": true,
+        "preview": "<think>\nBase-rate assumption holds: normal market regime with slight commodity inflation. The 500k bpd cut is small enough not to trigger broad systemic risks but provides a mild tailwind to oil prices and earnings. 1st-order effect: oil be",
+        "shock": "easy_oil_opec_cut",
+        "tokens": 174,
+        "valid_action": true
+      },
+      {
+        "chars": 565,
+        "has_closed_think": true,
+        "preview": "<think>\nThis is a clear, localized stimulus event for the GREEN sector. The expansion of subsidies directly boosts demand for hardware, driving revenue and margins for renewable energy firms. No broader macro regime shift is evident; this i",
+        "shock": "easy_green_subsidy",
+        "tokens": 150,
+        "valid_action": true
+      },
+      {
+        "chars": 548,
+        "has_closed_think": true,
+        "preview": "<think>\nBase-rate environment. The 3.8% dip in US housing is mildly negative but not a structural breakdown. The housing market remains resilient and supports REAL_ESTATE valuations. No need for a regime shift hypothesis. I will maintain a ",
+        "shock": "easy_housing_cooling",
+        "tokens": 152,
+        "valid_action": true
+      },
+      {
+        "chars": 625,
+        "has_closed_think": true,
+        "preview": "<think>\nThis is a clear, localized transition shock that drives up demand for both GREEN (the EV hardware) and TECH (software and charging infrastructure). OIL faces a long-term structural headwind but no immediate price spike or supply cha",
+        "shock": "easy_ev_penetration",
+        "tokens": 168,
+        "valid_action": true
+      }
+    ],
+    "total": 5,
+    "valid_actions": 5
+  },
+  "pre_grpo_sanity": {
+    "closed_think": 5,
+    "max_tokens": 174,
+    "mean_tokens": 157.2,
+    "min_tokens": 139,
+    "samples": [
+      {
+        "chars": 468,
+        "has_closed_think": true,
+        "preview": "<think>\nBase-rate assumption holds. Normal market regime with no systemic shocks. TECH margins improve slightly but present no material tailwind. BONDS offer stable yields. The optimal strategy is standard buy-and-hold across TECH, GREEN, a",
+        "shock": "easy_tech_earnings",
+        "tokens": 139,
+        "valid_action": true
+      },
+      {
+        "chars": 598,
+        "has_closed_think": true,
+        "preview": "<think>\nA marginal supply shock creates a temporary price spike in OIL. The base-rate assumption holds: this is not a full regime shift but rather a minor supply-demand rebalancing event. 1st order: OIL benefits directly from higher prices.",
+        "shock": "easy_oil_opec_cut",
+        "tokens": 170,
+        "valid_action": true
+      },
+      {
+        "chars": 716,
+        "has_closed_think": true,
+        "preview": "<think>\nA straightforward policy-driven green sector expansion. The base-rate assumption remains normal markets, with no systemic regime shifts. 1st-order effect: GREEN assets benefit directly from increased subsidy and manufacturing demand",
+        "shock": "easy_green_subsidy",
+        "tokens": 174,
+        "valid_action": true
+      },
+      {
+        "chars": 528,
+        "has_closed_think": true,
+        "preview": "<think>\nA mild housing correction without systemic dislocation. The primary impact is a slight reduction in near-term demand for real estate and mortgage-backed bonds. However, core TECH, GREEN, and long-duration BONDS remain unaffected. Th",
+        "shock": "easy_housing_cooling",
+        "tokens": 146,
+        "valid_action": true
+      },
+      {
+        "chars": 585,
+        "has_closed_think": true,
+        "preview": "<think>\nA straightforward transition shock. First-order effects: GREEN equities capture the massive EV tailwind. Second-order effects: Oil faces persistent long-term demand destruction, causing a sustained sell-off as institutional capital ",
+        "shock": "easy_ev_penetration",
+        "tokens": 157,
+        "valid_action": true
+      }
+    ],
+    "total": 5,
+    "valid_actions": 5
+  },
+  "run_label": "grpo_qwen25_7b_adapter_smoke_v3",
+  "sft_subfolder": "sft_qwen25_7b_curriculum400_v1",
+  "smoke_gate_passed": true,
+  "smoke_gate_reasons": []
+}
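
The aggregate fields in this file can be cross-checked against the per-sample values. The sketch below is not part of the commit. It assumes that `mean_regret` and `mean_tokens` are unweighted arithmetic means over the five listed entries, which the file itself does not state. The numbers are copied from the JSON above, and the `summarize` helper is a hypothetical name introduced only for illustration.

```python
from statistics import fmean

# Per-episode regret copied from post_grpo_holdout.results (keys are the episode ids).
holdout_regret = {
    "100": 0.1537429988453567,
    "200": 0.059752668767494965,
    "300": 0.07064251506908903,
    "400": 0.09965911520597914,
    "500": -0.0035115360228984427,
}

# Token counts copied from the "samples" lists of the two sanity sections.
post_tokens = [146, 174, 150, 152, 168]
pre_tokens = [139, 170, 174, 146, 157]


def summarize(values):
    """Hypothetical helper: mean, min and max of a sequence of numbers."""
    values = list(values)
    return {"mean": fmean(values), "min": min(values), "max": max(values)}


regret = summarize(holdout_regret.values())
post = summarize(post_tokens)
pre = summarize(pre_tokens)

# Recorded aggregates from metrics.json, for comparison.
recorded_mean_regret = 0.07605715237300428
recorded_bar = 0.034  # v6_sft_mean_regret_bar

print("holdout mean regret:", regret["mean"])
print("matches recorded value:", abs(regret["mean"] - recorded_mean_regret) < 1e-12)
print("mean regret exceeds v6 SFT bar:", regret["mean"] > recorded_bar)
print("post-GRPO tokens:", post)
print("pre-GRPO tokens:", pre)
```

Under that assumption the recomputed values agree with the recorded `mean_regret`, `mean_tokens`, `min_tokens`, and `max_tokens`. The holdout mean regret sits above `v6_sft_mean_regret_bar` even though `smoke_gate_passed` is `true`. The file does not say which criteria the smoke gate applies, so that relationship is left uninterpreted here.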