Alogotron committed on
Commit 54929ea · verified · 1 Parent(s): 41444ed

Phase 3 Formulator: QLoRA adapter (r=32, alpha=64) trained on 1,215 formulation problems
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,251 @@
---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- game-theory
- formulation
- qwen2
- lora
- qlora
- sft
- economics
- strategic-reasoning
- math
- decision-theory
library_name: peft
pipeline_tag: text-generation
datasets:
- Alogotron/GameTheory-Formulator
language:
- en
model-index:
- name: GameTheory-Formulator-Model
  results:
  - task:
      type: text-generation
      name: Game Theory Formulation
    dataset:
      name: GameTheory-Formulator
      type: Alogotron/GameTheory-Formulator
    metrics:
    - name: Valid Formulation Rate
      type: accuracy
      value: 100.0
    - name: Eval Loss
      type: loss
      value: 0.8492
    - name: Train Loss
      type: loss
      value: 1.0992
---

# 🎯 GameTheory-Formulator-Model

**Phase 3 of the Alogotron Game Theory AI Pipeline** — A QLoRA adapter that teaches language models to translate real-world scenarios into formal game theory formulations.

## Overview

| Property | Value |
|---|---|
| **Base Model** | [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) |
| **Method** | QLoRA (4-bit NF4 quantization + LoRA) |
| **Task** | Real-world scenario → Formal game theory formulation |
| **Dataset** | [Alogotron/GameTheory-Formulator](https://huggingface.co/datasets/Alogotron/GameTheory-Formulator) (1,215 examples) |
| **Training** | SFT, 1 epoch, ~24 minutes on 2x RTX 3090 |
| **Eval Accuracy** | **100.0% valid formulations** on held-out set |

## The Alogotron Game Theory Pipeline

This model is part of a 3-phase training pipeline:

| Phase | Model | Task | Method |
|---|---|---|---|
| Phase 1 | [GameTheory-Solver](https://huggingface.co/Alogotron/GameTheory-Solver) | Solve formal GT problems | SFT on 2,913 problems → 94% accuracy |
| Phase 2 | [GameTheory-Reasoner](https://huggingface.co/Alogotron/GameTheory-Reasoner) | Enhanced reasoning | GRPO on same dataset |
| **Phase 3** | **GameTheory-Formulator** (this model) | **Real-world → formal GT** | **SFT on 1,215 formulation problems** |

## What This Model Does

Given a real-world scenario (business competition, political negotiation, security analysis, etc.), this model produces:

1. **📋 Formulation Steps** — Walks through the reasoning to identify the game structure
2. **🎮 Formal Game Model** — Identifies players, strategies, payoffs, information structure, and solution concept
3. **🧮 Solution** — Solves the formulated game (Nash equilibrium, dominant strategies, etc.)
4. **🌍 Real-World Interpretation** — Translates the mathematical solution back to actionable insights
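
The solution step (finding a Nash equilibrium) can be checked mechanically for small games. A minimal sketch that enumerates pure-strategy Nash equilibria of a 2x2 game; the payoff numbers are hypothetical, chosen to form a Prisoner's Dilemma like the airline example below:

```python
from itertools import product

# Hypothetical payoffs: payoffs[(row, col)] = (row player's payoff, column player's payoff)
payoffs = {
    ("Maintain", "Maintain"): (54, 54),
    ("Maintain", "Cut"):      (10, 72),
    ("Cut", "Maintain"):      (72, 10),
    ("Cut", "Cut"):           (18, 18),
}
strategies = ["Maintain", "Cut"]

def is_nash(row, col):
    """A cell is a pure Nash equilibrium if neither player gains by deviating unilaterally."""
    r_pay, c_pay = payoffs[(row, col)]
    row_best = all(payoffs[(r, col)][0] <= r_pay for r in strategies)
    col_best = all(payoffs[(row, c)][1] <= c_pay for c in strategies)
    return row_best and col_best

equilibria = [cell for cell in product(strategies, strategies) if is_nash(*cell)]
print(equilibria)  # [('Cut', 'Cut')] for these payoffs
```

With these illustrative numbers, mutual price-cutting is the unique pure equilibrium even though both players would prefer mutual maintenance, which is exactly the dilemma structure the model identifies in its output.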

## Training Details

### QLoRA Configuration
| Parameter | Value |
|---|---|
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Quantization | 4-bit NF4 with double quantization |
| Trainable params | 80.7M / 7.7B (1.05%) |
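
This configuration can be reconstructed with `peft`; a sketch of the training-side adapter config (values mirror `adapter_config.json` in this repo, including the 0.05 dropout), assuming `peft` is installed:

```python
from peft import LoraConfig

# Mirrors the shipped adapter_config.json
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```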

### Training Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 1 |
| Batch size (per device) | 2 |
| Gradient accumulation | 4 |
| Effective batch size | 16 |
| Learning rate | 5e-5 (cosine schedule) |
| Optimizer | paged_adamw_8bit |
| Max sequence length | 2048 |
| Packing | Enabled |
| Gradient checkpointing | Enabled |
| Hardware | 2x NVIDIA RTX 3090 (24GB each) |
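
The effective batch size follows directly from the rows above: per-device batch times gradient-accumulation steps times number of GPUs.

```python
per_device_batch = 2
grad_accum_steps = 4
num_gpus = 2

effective_batch = per_device_batch * grad_accum_steps * num_gpus
print(effective_batch)  # 16
```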

### Training Metrics
| Metric | Value |
|---|---|
| Train loss | 1.0992 |
| Eval loss | 0.8492 |
| Training time | 24.3 minutes |
| Dataset size | 1215 examples |
| Train split | 1093 examples |
| Eval split | 122 examples |

## Evaluation Results

Tested on **20 held-out examples** across 6 domains and 3 difficulty levels:

| Metric | Score |
|---|---|
| **Valid Formulations** | **100.0%** |
| All sections present | 100.0% |
| All GT elements identified | 100.0% |
| Avg response length | 1821 chars |

### By Domain
| Domain | Valid |
|---|---|
| Business | 8/8 (100%) |
| Security | 5/5 (100%) |
| Politics | 2/2 (100%) |
| Auctions | 2/2 (100%) |
| Technology | 2/2 (100%) |
| Social | 1/1 (100%) |

### By Difficulty
| Difficulty | Valid |
|---|---|
| Easy | 5/5 (100%) |
| Medium | 9/9 (100%) |
| Hard | 6/6 (100%) |
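
The two breakdowns partition the same 20-example test set, so their counts should agree; a quick consistency check:

```python
by_domain = {"business": 8, "security": 5, "politics": 2,
             "auctions": 2, "technology": 2, "social": 1}
by_difficulty = {"easy": 5, "medium": 9, "hard": 6}

# Both partitions must cover the same 20 held-out examples
assert sum(by_domain.values()) == sum(by_difficulty.values()) == 20
print("breakdowns consistent")
```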

## Usage

### With PEFT + Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit NF4, matching the training-time quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Load the Formulator adapter
model = PeftModel.from_pretrained(base_model, "Alogotron/GameTheory-Formulator-Model")
model.eval()

# Create a prompt
messages = [
    {"role": "system", "content": "You are a game theory expert. Given a real-world scenario, formulate it as a formal game theory model. Identify the players, strategies, payoffs, and information structure. Then solve the game and interpret the results."},
    {"role": "user", "content": "Two coffee shops on the same street must decide whether to offer a loyalty program. If both offer it, they split customers evenly but incur costs. If neither offers it, they split evenly with no extra cost. If only one offers it, that shop attracts 70% of customers."},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True is required for temperature/top_p to take effect
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.3, top_p=0.9)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
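
If you want to see what `apply_chat_template` produces, Qwen2.5 uses the ChatML format (see the bundled `chat_template.jinja`). A simplified sketch that builds the same prompt by hand for plain system/user messages, without tool calls:

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Build a Qwen2.5-style ChatML prompt (simplified: no tools, system message supplied explicitly)."""
    text = ""
    for msg in messages:
        text += f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here
        text += "<|im_start|>assistant\n"
    return text

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a game theory expert."},
    {"role": "user", "content": "Two coffee shops must decide whether to offer a loyalty program."},
])
print(prompt)
```

For real inference, prefer `tokenizer.apply_chat_template`, which also handles default system prompts and tool-call turns.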

### Example Output

**Input Scenario:**
> Two airline companies, Stellar and Haven, each control roughly half the market. They are independently deciding their pricing for the upcoming quarter...

**Model Output:**

> ## Formulation Steps
>
> Step 1 - Stellar and Haven are each making pricing decisions that directly affect each other's profits...
>
> Step 2 - Players: Stellar and Haven...
>
> Step 3 - Strategies: Each firm can either 'Maintain Prices' or 'Cut Prices'...
>
> ## Formal Game Model
>
> **Game Type:** Simultaneous
> **Players:** Stellar, Haven
> **Strategies:** Maintain Prices, Cut Prices
> **Payoffs:** Both Maintain: (54, 54), Both Cut: (18, 18)...
> **Solution Concept:** Nash Equilibrium
>
> ## Solution
>
> Both firms will cut prices. Cutting is a dominant strategy for each...
>
> ## Real-World Interpretation
>
> This is a classic Prisoner's Dilemma. Both companies rationally choose to cut prices, resulting in lower profits than cooperation would yield...

## Dataset

Trained on [Alogotron/GameTheory-Formulator](https://huggingface.co/datasets/Alogotron/GameTheory-Formulator) — 1,215 expert-crafted formulation problems across 6 domains:

- **Business** (290): Pricing, market entry, production, R&D, supply chain
- **Security** (230): Cybersecurity, threat modeling, defense allocation
- **Politics** (195): Elections, negotiations, voting, international relations
- **Social** (190): Social dilemmas, public goods, coordination, trust
- **Technology** (165): Platform competition, standards, adoption, innovation
- **Auctions** (145): First-price, second-price, common value, combinatorial
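
The per-domain counts above sum to the stated dataset size; a one-line check:

```python
domain_counts = {"business": 290, "security": 230, "politics": 195,
                 "social": 190, "technology": 165, "auctions": 145}
total = sum(domain_counts.values())
print(total)  # 1215
```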

## Related Models & Datasets

| Resource | Link |
|---|---|
| Phase 1: Solver Model | [Alogotron/GameTheory-Solver](https://huggingface.co/Alogotron/GameTheory-Solver) |
| Phase 2: Reasoner Model | [Alogotron/GameTheory-Reasoner](https://huggingface.co/Alogotron/GameTheory-Reasoner) |
| Solver Dataset | [Alogotron/GameTheory-Bench](https://huggingface.co/datasets/Alogotron/GameTheory-Bench) |
| Formulator Dataset | [Alogotron/GameTheory-Formulator](https://huggingface.co/datasets/Alogotron/GameTheory-Formulator) |

## Limitations

- Trained on synthetic formulation data; may not handle all real-world edge cases
- Formulation quality depends on scenario clarity and completeness
- Best suited for classical game theory formulations (simultaneous, sequential, auctions)
- Does not cover cooperative game theory or mechanism design (yet)

## Citation

```bibtex
@misc{alogotron-formulator-2025,
  title={GameTheory-Formulator-Model: Real-World Scenario to Game Theory Formulation},
  author={Alogotron},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/Alogotron/GameTheory-Formulator-Model}
}
```
adapter_config.json ADDED
@@ -0,0 +1,46 @@
{
  "alora_invocation_tokens": null,
  "alpha_pattern": {},
  "arrow_config": null,
  "auto_mapping": null,
  "base_model_name_or_path": "Qwen/Qwen2.5-7B-Instruct",
  "bias": "none",
  "corda_config": null,
  "ensure_weight_tying": false,
  "eva_config": null,
  "exclude_modules": null,
  "fan_in_fan_out": false,
  "inference_mode": true,
  "init_lora_weights": true,
  "layer_replication": null,
  "layers_pattern": null,
  "layers_to_transform": null,
  "loftq_config": {},
  "lora_alpha": 64,
  "lora_bias": false,
  "lora_dropout": 0.05,
  "megatron_config": null,
  "megatron_core": "megatron.core",
  "modules_to_save": null,
  "peft_type": "LORA",
  "peft_version": "0.18.1",
  "qalora_group_size": 16,
  "r": 32,
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
    "v_proj",
    "k_proj",
    "down_proj",
    "o_proj",
    "gate_proj",
    "up_proj",
    "q_proj"
  ],
  "target_parameters": null,
  "task_type": "CAUSAL_LM",
  "trainable_token_indices": null,
  "use_dora": false,
  "use_qalora": false,
  "use_rslora": false
}
adapter_model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1f91a32639559f1cd940c09f7f036b7d6404f3967d06e785a1583d4fc8d9e1fa
size 323014168
chat_template.jinja ADDED
@@ -0,0 +1,54 @@
{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0]['role'] == 'system' %}
        {{- messages[0]['content'] }}
    {%- else %}
        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}
    {%- endif %}
    {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
    {%- if messages[0]['role'] == 'system' %}
        {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
    {%- else %}
        {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}
    {%- endif %}
{%- endif %}
{%- for message in messages %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {{- '<|im_start|>' + message.role }}
        {%- if message.content %}
            {{- '\n' + message.content }}
        {%- endif %}
        {%- for tool_call in message.tool_calls %}
            {%- if tool_call.function is defined %}
                {%- set tool_call = tool_call.function %}
            {%- endif %}
            {{- '\n<tool_call>\n{"name": "' }}
            {{- tool_call.name }}
            {{- '", "arguments": ' }}
            {{- tool_call.arguments | tojson }}
            {{- '}\n</tool_call>' }}
        {%- endfor %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>user' }}
        {%- endif %}
        {{- '\n<tool_response>\n' }}
        {{- message.content }}
        {{- '\n</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
{%- endif %}
eval_results.json ADDED
@@ -0,0 +1,273 @@
{
  "summary": {
    "total": 20,
    "valid_formulations": 20,
    "valid_pct": 100.0,
    "all_sections_pct": 100.0,
    "all_elements_pct": 100.0,
    "avg_response_length": 1821,
    "avg_gen_time": 41.0
  },
  "results": [
    {
      "id": "FORM-BU-H-0131",
      "domain": "business",
      "difficulty": "hard",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1873,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 40.8
    },
    {
      "id": "FORM-SO-E-0105",
      "domain": "social",
      "difficulty": "easy",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1877,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 38.0
    },
    {
      "id": "FORM-BU-M-0084",
      "domain": "business",
      "difficulty": "medium",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1740,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 51.5
    },
    {
      "id": "FORM-BU-M-0174",
      "domain": "business",
      "difficulty": "medium",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1863,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 49.8
    },
    {
      "id": "FORM-PO-E-0138",
      "domain": "politics",
      "difficulty": "easy",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1652,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 37.0
    },
    {
      "id": "FORM-SE-M-0019",
      "domain": "security",
      "difficulty": "medium",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1966,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 41.4
    },
    {
      "id": "FORM-AU-H-0184",
      "domain": "auctions",
      "difficulty": "hard",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 2218,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 41.9
    },
    {
      "id": "FORM-SE-H-0066",
      "domain": "security",
      "difficulty": "hard",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1763,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 36.4
    },
    {
      "id": "FORM-AU-E-0041",
      "domain": "auctions",
      "difficulty": "easy",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 2142,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 46.8
    },
    {
      "id": "FORM-PO-M-0006",
      "domain": "politics",
      "difficulty": "medium",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1611,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 30.1
    },
    {
      "id": "FORM-BU-E-0021",
      "domain": "business",
      "difficulty": "easy",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1712,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 36.3
    },
    {
      "id": "FORM-BU-H-0181",
      "domain": "business",
      "difficulty": "hard",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1973,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 49.7
    },
    {
      "id": "FORM-SE-M-0016",
      "domain": "security",
      "difficulty": "medium",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1971,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 39.1
    },
    {
      "id": "FORM-BU-M-0078",
      "domain": "business",
      "difficulty": "medium",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1450,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 44.1
    },
    {
      "id": "FORM-SE-M-0063",
      "domain": "security",
      "difficulty": "medium",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1736,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 37.8
    },
    {
      "id": "FORM-BU-M-0204",
      "domain": "business",
      "difficulty": "medium",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1885,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 37.9
    },
    {
      "id": "FORM-TE-M-0135",
      "domain": "technology",
      "difficulty": "medium",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1485,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 32.9
    },
    {
      "id": "FORM-SE-H-0108",
      "domain": "security",
      "difficulty": "hard",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1669,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 44.1
    },
    {
      "id": "FORM-TE-H-0184",
      "domain": "technology",
      "difficulty": "hard",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 2043,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 45.2
    },
    {
      "id": "FORM-BU-E-0101",
      "domain": "business",
      "difficulty": "easy",
      "sections_found": "4/4",
      "has_all_sections": true,
      "elements_found": "4/4",
      "has_all_elements": true,
      "response_length": 1799,
      "is_substantial": true,
      "valid_formulation": true,
      "gen_time": 38.9
    }
  ]
}
tokenizer.json ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3fd169731d2cbde95e10bf356d66d5997fd885dd8dbb6fb4684da3f23b2585d8
size 11421892
tokenizer_config.json ADDED
@@ -0,0 +1,29 @@
{
  "add_prefix_space": false,
  "backend": "tokenizers",
  "bos_token": null,
  "clean_up_tokenization_spaces": false,
  "eos_token": "<|im_end|>",
  "errors": "replace",
  "extra_special_tokens": [
    "<|im_start|>",
    "<|im_end|>",
    "<|object_ref_start|>",
    "<|object_ref_end|>",
    "<|box_start|>",
    "<|box_end|>",
    "<|quad_start|>",
    "<|quad_end|>",
    "<|vision_start|>",
    "<|vision_end|>",
    "<|vision_pad|>",
    "<|image_pad|>",
    "<|video_pad|>"
  ],
  "is_local": false,
  "model_max_length": 131072,
  "pad_token": "<|endoftext|>",
  "split_special_tokens": false,
  "tokenizer_class": "Qwen2Tokenizer",
  "unk_token": null
}
training_args.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:b6bda26c65a0dedb3a99b67c3ffd9ca1a721e10265bf5f71fd8a685eb32e8f7a
size 5649
training_stats.json ADDED
@@ -0,0 +1,26 @@
{
  "phase": "Phase 3 - Formulator",
  "base_model": "Qwen/Qwen2.5-7B-Instruct",
  "dataset": "/home/beta1/gt-training/formulator/formulator_dataset.json",
  "dataset_size": 1215,
  "train_examples": 1093,
  "eval_examples": 122,
  "lora_r": 32,
  "lora_alpha": 64,
  "target_modules": "all_linear",
  "epochs": 1,
  "batch_size_per_device": 2,
  "grad_accum": 4,
  "effective_batch": 16,
  "learning_rate": 5e-05,
  "lr_scheduler": "cosine",
  "max_seq_length": 2048,
  "quantization": "4bit_nf4",
  "train_loss": 1.0992090911195989,
  "eval_loss": 0.8491532206535339,
  "runtime_seconds": 1458.7688,
  "runtime_minutes": 24.312813333333334,
  "total_wall_time_seconds": 1469.986917257309,
  "samples_per_second": 0.311,
  "num_gpus": 2
}