Pulastya B committed
Commit 6f57124 · 1 Parent(s): b43b5e5

Fixed bugs 2

TEST_SCENARIOS.md CHANGED
@@ -1,300 +0,0 @@
-# Test Scenarios for Parameter Remapping Fixes
-
-## Test Case 1: train_baseline_models with invalid 'models' parameter
-
-### Input (from LLM):
-```json
-{
-  "tool": "train_baseline_models",
-  "arguments": {
-    "file_path": "/tmp/data.csv",
-    "target_column": "price",
-    "models": ["linear_regression", "random_forest", "xgboost"],
-    "test_size": 0.2,
-    "random_state": 42
-  }
-}
-```
-
-### Expected Output (after remapping):
-```
-✓ Parameter remapped: target_column → target_col
-✓ Stripped invalid parameter 'models': ['linear_regression', 'random_forest', 'xgboost']
-ℹ️ train_baseline_models trains all baseline models automatically
-📋 Final parameters: ['file_path', 'target_col', 'test_size', 'random_state']
-🔧 Executing tool: train_baseline_models
-✅ Tool executed successfully
-```
-
-### What Gets Called:
-```python
-train_baseline_models(
-    file_path="/tmp/data.csv",
-    target_col="price",  # Remapped from target_column
-    test_size=0.2,
-    random_state=42
-    # models parameter stripped
-)
-```
-
----- 
-
-## Test Case 2: generate_model_report with wrong parameter name
-
-### Input (from LLM):
-```json
-{
-  "tool": "generate_model_report",
-  "arguments": {
-    "model_path": "/tmp/model.pkl",
-    "file_path": "/tmp/test.csv",
-    "target_column": "price",
-    "output_path": "/tmp/report.json"
-  }
-}
-```
-
-### Expected Output (after remapping):
-```
-✓ Parameter remapped: target_column → target_col
-✓ Parameter remapped: file_path → test_data_path
-📋 Final parameters: ['model_path', 'test_data_path', 'target_col', 'output_path']
-🔧 Executing tool: generate_model_report
-✅ Tool executed successfully
-```
-
-### What Gets Called:
-```python
-generate_model_report(
-    model_path="/tmp/model.pkl",
-    test_data_path="/tmp/test.csv",  # Remapped from file_path
-    target_col="price",  # Remapped from target_column
-    output_path="/tmp/report.json"
-)
-```
-
----- 
-
-## Test Case 3: detect_model_issues with invalid split parameters
-
-### Input (from LLM):
-```json
-{
-  "tool": "detect_model_issues",
-  "arguments": {
-    "model_path": "/tmp/model.pkl",
-    "train_data_path": "/tmp/train.csv",
-    "test_data_path": "/tmp/test.csv",
-    "target_column": "price",
-    "train_target_path": "/tmp/y_train.csv",
-    "test_target_path": "/tmp/y_test.csv"
-  }
-}
-```
-
-### Expected Output (after remapping):
-```
-✓ Parameter remapped: target_column → target_col
-✓ Stripped invalid parameter 'train_target_path': /tmp/y_train.csv
-✓ Stripped invalid parameter 'test_target_path': /tmp/y_test.csv
-📋 Final parameters: ['model_path', 'train_data_path', 'test_data_path', 'target_col']
-🔧 Executing tool: detect_model_issues
-✅ Tool executed successfully
-```
-
-### What Gets Called:
-```python
-detect_model_issues(
-    model_path="/tmp/model.pkl",
-    train_data_path="/tmp/train.csv",
-    test_data_path="/tmp/test.csv",
-    target_col="price"  # Remapped from target_column
-    # train_target_path and test_target_path stripped
-)
-```
-
----- 
-
-## Test Case 4: detect_model_issues missing required parameter
-
-### Input (from LLM):
-```json
-{
-  "tool": "detect_model_issues",
-  "arguments": {
-    "model_path": "/tmp/model.pkl",
-    "test_data_path": "/tmp/test.csv",
-    "target_column": "price"
-  }
-}
-```
-
-### Expected Output (after remapping):
-```
-✓ Parameter remapped: target_column → target_col
-⚠️ WARNING: detect_model_issues requires 'train_data_path' parameter
-📋 Final parameters: ['model_path', 'test_data_path', 'target_col']
-🔧 Executing tool: detect_model_issues
-❌ Error: detect_model_issues() missing 1 required positional argument: 'train_data_path'
-```
-
-### Result:
-The tool still fails (as expected), but the warning makes clear that `train_data_path` is required, so the LLM can retry with correct parameters.
-
----- 
-
-## Test Case 5: Combined parameter issues
-
-### Input (from LLM):
-```json
-{
-  "tool": "train_baseline_models",
-  "arguments": {
-    "file_path": "/tmp/data.csv",
-    "target_column": "price",
-    "models": ["xgboost"],
-    "test_size": "0.3",
-    "random_state": "None"
-  }
-}
-```
-
-### Expected Output (after remapping):
-```
-✓ Parameter remapped: target_column → target_col
-✓ Stripped invalid parameter 'models': ['xgboost']
-ℹ️ train_baseline_models trains all baseline models automatically
-📋 Final parameters: ['file_path', 'target_col', 'test_size', 'random_state']
-🔧 Executing tool: train_baseline_models
-✅ Tool executed successfully
-```
-
-### What Gets Called:
-```python
-train_baseline_models(
-    file_path="/tmp/data.csv",
-    target_col="price",  # Remapped
-    test_size="0.3",  # Still a string (may cause a type error; should be a float)
-    random_state=None  # "None" string converted to None
-)
-```
-
-**Note**: Converting the string "None" to `None` works. Converting the string "0.3" to a float still needs testing.
-
----- 
-
-## Test Case 6: No remapping needed (correct parameters)
-
-### Input (from LLM):
-```json
-{
-  "tool": "train_baseline_models",
-  "arguments": {
-    "file_path": "/tmp/data.csv",
-    "target_col": "price",
-    "test_size": 0.2,
-    "random_state": 42
-  }
-}
-```
-
-### Expected Output:
-```
-📋 Final parameters: ['file_path', 'target_col', 'test_size', 'random_state']
-🔧 Executing tool: train_baseline_models
-✅ Tool executed successfully
-```
-
-**No remapping messages** - parameters already correct!
-
----- 
-
-## Validation Commands
-
-### Check logs for parameter remapping:
-```bash
-grep "✓ Parameter remapped" logs.txt
-grep "✓ Stripped invalid parameter" logs.txt
-```
-
-### Check for remaining errors:
-```bash
-grep "unexpected keyword argument" logs.txt
-grep "missing.*required.*argument" logs.txt
-```
-
-### Count successful modeling tool executions:
-```bash
-grep -A5 "train_baseline_models" logs.txt | grep "✅ Tool executed successfully" | wc -l
-grep -A5 "generate_model_report" logs.txt | grep "✅ Tool executed successfully" | wc -l
-grep -A5 "detect_model_issues" logs.txt | grep "✅ Tool executed successfully" | wc -l
-```
-
----- 
-
-## Integration Test Flow
-
-**Complete ML Pipeline Test**:
-
-1. Load earthquake dataset
-2. Profile data (`profile_dataset`)
-3. Create time features (`create_time_features`)
-4. Create interaction features (`create_interaction_features`)
-5. Encode categorical (`encode_categorical`)
-6. **Train baseline models** (`train_baseline_models` - WITH REMAPPING)
-7. Hyperparameter tuning (`hyperparameter_tuning`)
-8. Cross-validation (`perform_cross_validation`)
-9. **Generate report** (`generate_model_report` - WITH REMAPPING)
-10. **Detect issues** (`detect_model_issues` - WITH REMAPPING)
-
-**Expected**: All steps succeed without parameter errors.
-
----- 
-
-## Edge Cases to Consider
-
-### 1. Both old and new parameter provided:
-```json
-{
-  "target_column": "price",
-  "target_col": "sales"
-}
-```
-**Behavior**: Keep `target_col`, ignore `target_column` (remapping checks `target_col not in arguments`)
-
-### 2. Parameter is None:
-```json
-{
-  "models": null
-}
-```
-**Behavior**: Still stripped (check is `if "models" in arguments`)
-
-### 3. Empty list parameter:
-```json
-{
-  "models": []
-}
-```
-**Behavior**: Stripped with log showing empty list
-
-### 4. Multiple invalid parameters:
-```json
-{
-  "train_target_path": "/tmp/y_train.csv",
-  "test_target_path": "/tmp/y_test.csv",
-  "validation_target_path": "/tmp/y_val.csv"
-}
-```
-**Behavior**: Only `train_target_path` and `test_target_path` are stripped; `validation_target_path` is not in the remapping list, so it is left in place
-
----- 
-
-## Success Metrics
-
-After deployment, measure:
-- ✅ Number of parameter remapping logs (should increase)
-- ✅ Successful modeling tool executions (should increase)
-- ✅ Parameter error count (should decrease to near zero)
-- ✅ execute_python_code fallbacks for modeling (should decrease)
-- ✅ Complete workflow success rate (should increase)
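The remapping behaviour that the scenarios in this deleted file describe can be sketched as a small pre-processing step. This is an illustration under assumptions, not the project's code: `remap_arguments`, `RENAMES` and `STRIP` are hypothetical names, and dropping the old key when both spellings are present is only one reading of edge case 1. The parameter names and log formats are taken from the scenarios themselves.

```python
from typing import Any

# Hypothetical lookup tables, reconstructed from the test scenarios.
RENAMES = {
    "train_baseline_models": {"target_column": "target_col"},
    "generate_model_report": {"target_column": "target_col",
                              "file_path": "test_data_path"},
    "detect_model_issues": {"target_column": "target_col"},
}
STRIP = {
    "train_baseline_models": ["models"],
    "detect_model_issues": ["train_target_path", "test_target_path"],
}


def remap_arguments(tool: str, arguments: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Return cleaned keyword arguments and the log lines for each change."""
    args = dict(arguments)  # work on a copy, never mutate the caller's dict
    logs: list[str] = []

    # Rename known aliases, but never overwrite a correctly named parameter.
    for old, new in RENAMES.get(tool, {}).items():
        if old in args:
            value = args.pop(old)  # assumption: the alias is always dropped
            if new not in args:
                args[new] = value
                logs.append(f"✓ Parameter remapped: {old} → {new}")

    # Strip parameters the tool does not accept (membership check, so
    # null values and empty lists are stripped as well).
    for name in STRIP.get(tool, []):
        if name in args:
            logs.append(f"✓ Stripped invalid parameter '{name}': {args.pop(name)}")

    # Convert the literal string "None" to a real None (Test Case 5).
    args = {key: (None if value == "None" else value) for key, value in args.items()}
    return args, logs


if __name__ == "__main__":
    cleaned, messages = remap_arguments(
        "train_baseline_models",
        {"file_path": "/tmp/data.csv", "target_column": "price", "models": ["xgboost"]},
    )
    print("\n".join(messages))
    print(sorted(cleaned))
```

Keeping the rename and strip tables as data makes it cheap to cover a new tool, and returning the log lines instead of printing them makes the behaviour easy to assert on in a unit test.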
src/tools/advanced_training.py CHANGED
@@ -110,6 +110,25 @@ def hyperparameter_tuning(
         n_trials = 30
         print(f" ⚠️ Medium dataset ({n_rows:,} rows) - reducing trials from {original_trials} to {n_trials}")
 
+    # ⚠️ PERFORMANCE FIX: Sample large datasets for hyperparameter tuning
+    # Hyperparameters found on sample will be used to train final model on full dataset
+    MAX_TUNING_ROWS = 50000
+    sampled = False
+    if n_rows > MAX_TUNING_ROWS:
+        original_rows = n_rows
+        sample_frac = MAX_TUNING_ROWS / n_rows
+        df = df.sample(n=MAX_TUNING_ROWS, random_state=random_state)
+        sampled = True
+        print(f" 📊 Sampled {MAX_TUNING_ROWS:,} rows ({sample_frac:.1%}) from {original_rows:,} for faster tuning")
+        print(f" 💡 Hyperparameters found on sample will generalize well to full dataset")
+        print(f" ⏱️ Expected speedup: 3-5x faster tuning")
+
+    # ⚠️ Auto-reduce CV folds for very large datasets
+    original_cv_folds = cv_folds
+    if n_rows > 100000 and cv_folds > 3:
+        cv_folds = 3
+        print(f" ⏱️ Using {cv_folds}-fold CV (instead of {original_cv_folds}) for faster tuning on large dataset")
+
     # ⚠️ SKIP DATETIME CONVERSION: Already handled by create_time_features() in workflow step 7
     # The encoded.csv file should already have time features extracted
     # If datetime columns still exist, they will be handled as regular features
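The two size-based decisions in this hunk can be isolated into a pure function, which makes the thresholds easy to test. The sketch below is a hypothetical restatement, not code from the repository: `TuningPlan` and `plan_tuning` are invented names, and only the constants 50,000 and 100,000 and the 3-fold floor come from the diff. As far as the hunk shows, `n_rows` is not reassigned after sampling, so the fold check here also uses the original row count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TuningPlan:
    rows_used: int      # rows that would be passed to the tuner
    sample_frac: float  # fraction of the original rows that is kept
    cv_folds: int       # folds after any automatic reduction


def plan_tuning(n_rows: int, cv_folds: int,
                max_tuning_rows: int = 50_000,
                large_dataset_rows: int = 100_000) -> TuningPlan:
    """Decide how many rows and CV folds to use for hyperparameter tuning."""
    # Cap the number of rows used for tuning.
    rows_used = min(n_rows, max_tuning_rows)
    sample_frac = rows_used / n_rows if n_rows else 1.0

    # Reduce folds based on the ORIGINAL dataset size, mirroring the hunk.
    if n_rows > large_dataset_rows and cv_folds > 3:
        cv_folds = 3
    return TuningPlan(rows_used, sample_frac, cv_folds)


if __name__ == "__main__":
    for rows in (10_000, 80_000, 200_000):
        print(rows, plan_tuning(rows, cv_folds=5))
```

One consequence worth noting: a dataset between 50,001 and 100,000 rows is sampled down but keeps its original fold count, while a dataset above 100,000 rows gets both reductions.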
src/tools/model_training.py CHANGED
@@ -286,6 +286,29 @@ def train_baseline_models(file_path: str, target_col: str,
         "model_path": results["models"][best_model_name]["model_path"] if best_model_name else None
     }
 
+    # ⚠️ Add guidance for hyperparameter tuning on large datasets
+    if results["n_samples"] > 100000:
+        # Recommend faster models for large datasets
+        fast_models = ["xgboost", "lightgbm"]
+        if best_model_name in fast_models:
+            results["tuning_recommendation"] = {
+                "suggested_model": best_model_name,
+                "reason": f"{best_model_name} is optimal for large datasets - fast training and good performance"
+            }
+        elif best_model_name == "random_forest":
+            # Find next best fast model
+            fast_model_scores = {name: results["models"][name]["test_metrics"].get("r2" if task_type == "regression" else "f1", 0)
+                                 for name in fast_models if name in results["models"]}
+            if fast_model_scores:
+                alt_model = max(fast_model_scores, key=fast_model_scores.get)
+                alt_score = fast_model_scores[alt_model]
+                score_diff = abs(best_score - alt_score)
+                if score_diff < 0.05:  # Less than 5% difference
+                    results["tuning_recommendation"] = {
+                        "suggested_model": alt_model,
+                        "reason": f"For large datasets, {alt_model} is 5-10x faster than {best_model_name} with similar performance (score difference: {score_diff:.4f})"
+                    }
+
     # Generate visualizations for best model
     if VISUALIZATION_AVAILABLE and best_model_name:
         try:
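The recommendation rule added in this hunk can be restated as a standalone function for testing. This is a sketch under assumptions: `recommend_tuning_model` and its flat `scores` mapping are hypothetical, whereas the diff reads scores from `results["models"][name]["test_metrics"]`. Note that the 0.05 threshold is an absolute gap in R² or F1, not a relative 5% difference.

```python
from typing import Optional

FAST_MODELS = ("xgboost", "lightgbm")


def recommend_tuning_model(scores: dict[str, float], best_model: str, n_samples: int,
                           min_rows: int = 100_000,
                           max_gap: float = 0.05) -> Optional[dict[str, str]]:
    """Suggest which model to tune on a large dataset, or None if no advice applies."""
    if n_samples <= min_rows:
        return None  # small datasets get no recommendation

    if best_model in FAST_MODELS:
        return {"suggested_model": best_model,
                "reason": f"{best_model} is already a fast model"}

    if best_model != "random_forest":
        return None  # the hunk only handles the random_forest case

    # Pick the best-scoring fast model that was actually trained.
    candidates = {name: scores[name] for name in FAST_MODELS if name in scores}
    if not candidates:
        return None
    alt_model = max(candidates, key=candidates.get)

    # Absolute score gap between the winner and the fast alternative.
    gap = abs(scores[best_model] - candidates[alt_model])
    if gap < max_gap:
        return {"suggested_model": alt_model,
                "reason": f"{alt_model} scores within {gap:.4f} of {best_model}"}
    return None


if __name__ == "__main__":
    example = {"random_forest": 0.91, "xgboost": 0.89, "lightgbm": 0.88}
    print(recommend_tuning_model(example, "random_forest", n_samples=150_000))
```

Returning `None` when no rule applies keeps the caller's `results` dictionary free of an empty `tuning_recommendation` key, which matches the hunk's behaviour of only setting the key when a suggestion exists.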