Index,Experiment_Name,Splitting_Strategy,Model_Parameters,Feature_Notes,R2_Score,MAE,RMSE,MAPE,Notes
0,Decision Tree with Stratified Split,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': None, 'max_features': None, 'max_leaf_nodes': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'monotonic_cst': None, 'random_state': 42, 'splitter': 'best'}","7 lag features + month, day_of_year, day_of_week",0.679986,15.324201,23.221782,0.134544,Baseline model. Shows promising R2 but the gap between MAE and RMSE suggests some large prediction errors.
1,RandomForest with Stratified Split,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': None, 'max_features': 1.0, 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'monotonic_cst': None, 'n_estimators': 100, 'n_jobs': -1, 'oob_score': False, 'random_state': 42, 'verbose': 0, 'warm_start': False}","7 lag features + month, day_of_year, day_of_week",0.799723,11.584795,18.370732,0.102134,Outperforms the Decision Tree baseline on every metric.
2,GradientBoosting with Stratified Split,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'alpha': 0.9, 'ccp_alpha': 0.0, 'criterion': 'friedman_mse', 'init': None, 'learning_rate': 0.1, 'loss': 'squared_error', 'max_depth': 3, 'max_features': None, 'max_leaf_nodes': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'n_estimators': 100, 'n_iter_no_change': None, 'random_state': 42, 'subsample': 1.0, 'tol': 0.0001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}","7 lag features + month, day_of_year, day_of_week",0.801749,11.158205,18.277585,0.098714,Outperforms the Decision Tree baseline on every metric.
3,adaboost base,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'estimator': None, 'learning_rate': 1.0, 'loss': 'linear', 'n_estimators': 50, 'random_state': 42}","7 lag features + month, day_of_year, day_of_week",0.720594,17.885683,21.698458,0.187625,trying default adaboost
4,xgboost base,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'objective': 'reg:squarederror', 'base_score': None, 'booster': None, 'callbacks': None, 'colsample_bylevel': None, 'colsample_bynode': None, 'colsample_bytree': None, 'device': None, 'early_stopping_rounds': None, 'enable_categorical': False, 'eval_metric': None, 'feature_types': None, 'feature_weights': None, 'gamma': None, 'grow_policy': None, 'importance_type': None, 'interaction_constraints': None, 'learning_rate': None, 'max_bin': None, 'max_cat_threshold': None, 'max_cat_to_onehot': None, 'max_delta_step': None, 'max_depth': None, 'max_leaves': None, 'min_child_weight': None, 'missing': nan, 'monotone_constraints': None, 'multi_strategy': None, 'n_estimators': None, 'n_jobs': -1, 'num_parallel_tree': None, 'random_state': 42, 'reg_alpha': None, 'reg_lambda': None, 'sampling_method': None, 'scale_pos_weight': None, 'subsample': None, 'tree_method': None, 'validate_parameters': None, 'verbosity': None}","7 lag features + month, day_of_year, day_of_week",0.755335,12.423574,20.304726,0.107918,trying default xgboost
5,Decision Tree optimized using Randomized Search,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': 5, 'max_features': None, 'max_leaf_nodes': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 4, 'min_samples_split': 10, 'min_weight_fraction_leaf': 0.0, 'monotonic_cst': None, 'random_state': 42, 'splitter': 'best'}","7 lag features + month, day_of_year, day_of_week",0.718233,13.838741,21.789947,0.121727,trying better params
6,RandomForest optimized using Randomized Search,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'bootstrap': True, 'ccp_alpha': 0.0, 'criterion': 'squared_error', 'max_depth': 20, 'max_features': 'sqrt', 'max_leaf_nodes': None, 'max_samples': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'monotonic_cst': None, 'n_estimators': 200, 'n_jobs': -1, 'oob_score': False, 'random_state': 42, 'verbose': 0, 'warm_start': False}","7 lag features + month, day_of_year, day_of_week",0.831875,10.987116,16.831691,0.099371,trying better params
7,GradientBoosting optimized using Randomized Search,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'alpha': 0.9, 'ccp_alpha': 0.0, 'criterion': 'friedman_mse', 'init': None, 'learning_rate': 0.05, 'loss': 'squared_error', 'max_depth': 5, 'max_features': None, 'max_leaf_nodes': None, 'min_impurity_decrease': 0.0, 'min_samples_leaf': 1, 'min_samples_split': 2, 'min_weight_fraction_leaf': 0.0, 'n_estimators': 100, 'n_iter_no_change': None, 'random_state': 42, 'subsample': 0.7, 'tol': 0.0001, 'validation_fraction': 0.1, 'verbose': 0, 'warm_start': False}","7 lag features + month, day_of_year, day_of_week",0.809274,11.124478,17.927342,0.097859,trying better params
8,Adaboost optimized using Randomized Search,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'estimator': None, 'learning_rate': 0.01, 'loss': 'exponential', 'n_estimators': 300, 'random_state': 42}","7 lag features + month, day_of_year, day_of_week",0.758373,13.53738,20.178269,0.122505,trying better params
9,XGBoost optimized using Randomized Search,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'objective': 'reg:squarederror', 'base_score': None, 'booster': None, 'callbacks': None, 'colsample_bylevel': None, 'colsample_bynode': None, 'colsample_bytree': 0.8, 'device': None, 'early_stopping_rounds': None, 'enable_categorical': False, 'eval_metric': None, 'feature_types': None, 'feature_weights': None, 'gamma': None, 'grow_policy': None, 'importance_type': None, 'interaction_constraints': None, 'learning_rate': 0.1, 'max_bin': None, 'max_cat_threshold': None, 'max_cat_to_onehot': None, 'max_delta_step': None, 'max_depth': 3, 'max_leaves': None, 'min_child_weight': None, 'missing': nan, 'monotone_constraints': None, 'multi_strategy': None, 'n_estimators': 100, 'n_jobs': -1, 'num_parallel_tree': None, 'random_state': 42, 'reg_alpha': None, 'reg_lambda': None, 'sampling_method': None, 'scale_pos_weight': None, 'subsample': 0.7, 'tree_method': None, 'validate_parameters': None, 'verbosity': None}","7 lag features + month, day_of_year, day_of_week",0.815525,10.70084,17.631103,0.094617,trying better params
10,Base Lightgbm,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 1.0, 'importance_type': 'split', 'learning_rate': 0.1, 'max_depth': -1, 'min_child_samples': 20, 'min_child_weight': 0.001, 'min_split_gain': 0.0, 'n_estimators': 100, 'n_jobs': -1, 'num_leaves': 31, 'objective': None, 'random_state': 42, 'reg_alpha': 0.0, 'reg_lambda': 0.0, 'subsample': 1.0, 'subsample_for_bin': 200000, 'subsample_freq': 0, 'verbosity': -1}","7 lag features + month, day_of_year, day_of_week",0.793062,12.065966,18.67374,0.108006,Base model
11,Base CatBoost,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'loss_function': 'RMSE', 'verbose': 0, 'random_state': 42}","7 lag features + month, day_of_year, day_of_week",0.826587,10.968606,17.094311,0.097084,Base model
12,Base SVR,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'C': 1.0, 'cache_size': 200, 'coef0': 0.0, 'degree': 3, 'epsilon': 0.1, 'gamma': 'scale', 'kernel': 'rbf', 'max_iter': -1, 'shrinking': True, 'tol': 0.001, 'verbose': False}","7 lag features + month, day_of_year, day_of_week",0.517884,19.459834,28.502749,0.172332,Base model
13,Base ElasticNet,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'alpha': 1.0, 'copy_X': True, 'fit_intercept': True, 'l1_ratio': 0.5, 'max_iter': 1000, 'positive': False, 'precompute': False, 'random_state': 42, 'selection': 'cyclic', 'tol': 0.0001, 'warm_start': False}","7 lag features + month, day_of_year, day_of_week",0.770508,12.537572,19.665026,0.108772,Base model
14,Optimized LightGBM,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'boosting_type': 'gbdt', 'class_weight': None, 'colsample_bytree': 1.0, 'importance_type': 'split', 'learning_rate': 0.01, 'max_depth': 10, 'min_child_samples': 20, 'min_child_weight': 0.001, 'min_split_gain': 0.0, 'n_estimators': 500, 'n_jobs': -1, 'num_leaves': 40, 'objective': None, 'random_state': 42, 'reg_alpha': 0, 'reg_lambda': 0, 'subsample': 1.0, 'subsample_for_bin': 200000, 'subsample_freq': 0, 'verbosity': -1}","7 lag features + month, day_of_year, day_of_week",0.800756,11.744002,18.323309,0.104691,Optimized Hyperparams
15,Optimized CatBoost,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'loss_function': 'RMSE', 'verbose': 0, 'random_state': 42, 'learning_rate': 0.05, 'l2_leaf_reg': 3, 'iterations': 300, 'depth': 4}","7 lag features + month, day_of_year, day_of_week",0.830674,10.687373,16.891692,0.095652,Optimized Hyperparams
16,Optimized SVR,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'C': 100, 'cache_size': 200, 'coef0': 0.0, 'degree': 3, 'epsilon': 0.1, 'gamma': 'scale', 'kernel': 'rbf', 'max_iter': -1, 'shrinking': True, 'tol': 0.001, 'verbose': False}","7 lag features + month, day_of_year, day_of_week",0.800347,11.856367,18.342124,0.10192,Optimized Hyperparams
17,Optimized ElasticNet,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'alpha': 0.1, 'copy_X': True, 'fit_intercept': True, 'l1_ratio': 0.1, 'max_iter': 1000, 'positive': False, 'precompute': False, 'random_state': 42, 'selection': 'cyclic', 'tol': 0.0001, 'warm_start': False}","7 lag features + month, day_of_year, day_of_week",0.769008,12.658724,19.729225,0.110292,Optimized Hyperparams
18,Base Linear Regression,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'copy_X': True, 'fit_intercept': True, 'n_jobs': None, 'positive': False}","7 lag features + month, day_of_year, day_of_week",0.769085684,12.667486,19.725889,0.110437,Testing the simplest linear model
19,Base Ridge Regression,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'alpha': 1.0, 'copy_X': True, 'fit_intercept': True, 'max_iter': None, 'positive': False, 'random_state': 42, 'solver': 'auto', 'tol': 0.0001}","7 lag features + month, day_of_year, day_of_week",0.769081635,12.667433,19.726062,0.110436,Testing the simplest linear model
20,Base Lasso Regression,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'alpha': 1.0, 'copy_X': True, 'fit_intercept': True, 'max_iter': 1000, 'positive': False, 'precompute': False, 'random_state': 42, 'selection': 'cyclic', 'tol': 0.0001, 'warm_start': False}","7 lag features + month, day_of_year, day_of_week",0.770819593,12.507577,19.65169,0.108487,Testing the simplest linear model
22,Optimized Ridge Regression,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'alpha': 100, 'copy_X': True, 'fit_intercept': True, 'max_iter': None, 'positive': False, 'random_state': 42, 'solver': 'auto', 'tol': 0.0001}","7 lag features + month, day_of_year, day_of_week",0.769046692,12.656959,19.727555,0.110268,Testing optimized linear model
23,Optimized Lasso Regression,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'alpha': 0.1, 'copy_X': True, 'fit_intercept': True, 'max_iter': 1000, 'positive': False, 'precompute': False, 'random_state': 42, 'selection': 'cyclic', 'tol': 0.0001, 'warm_start': False}","7 lag features + month, day_of_year, day_of_week",0.76881302,12.661734,19.737532,0.110329,Testing optimized linear model
24,"Ensemble CatBoost, XGBoost, RandomForest","StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'estimators': [('Optimized RandomForest', RandomForestRegressor(max_depth=20, max_features='sqrt', n_estimators=200, n_jobs=-1, random_state=42)), ('Optimized CatBoost', <catboost.core.CatBoostRegressor object at 0x7a1ebce2cbd0>), ('Optimized XGBoost', XGBRegressor(base_score=None, booster=None, callbacks=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=0.8, device=None, early_stopping_rounds=None, enable_categorical=False, eval_metric=None, feature_types=None, feature_weights=None, gamma=None, grow_policy=None, importance_type=None, interaction_constraints=None, learning_rate=0.1, max_bin=None, max_cat_threshold=None, max_cat_to_onehot=None, max_delta_step=None, max_depth=3, max_leaves=None, min_child_weight=None, missing=nan, monotone_constraints=None, multi_strategy=None, n_estimators=100, n_jobs=-1, num_parallel_tree=None, ...))]}","7 lag features + month, day_of_year, day_of_week",0.834199799,10.509287,16.714896,0.09398,Testing ensemble model
25,"Ensemble Stacking RandomForest, CatBoost, SVR","StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'cv': 5, 'estimators': [('Optimized RandomForest', RandomForestRegressor(max_depth=20, max_features='sqrt', n_estimators=200,n_jobs=-1, random_state=42)), ('Optimized CatBoost', <catboost.core.CatBoostRegressor object at 0x7a1ebce2cbd0>), ('Optimized SVR', SVR(C=100))], 'final_estimator__alpha_per_target': False, 'final_estimator__alphas': (0.1, 1.0, 10.0), 'final_estimator__cv': None, 'final_estimator__fit_intercept': True, 'final_estimator__gcv_mode': None, 'final_estimator__scoring': None, 'final_estimator__store_cv_results': None, 'final_estimator__store_cv_values': 'deprecated', 'final_estimator': RidgeCV(), 'n_jobs': -1, 'passthrough': True, 'verbose': 0, 'Optimized RandomForest': RandomForestRegressor(max_depth=20, max_features='sqrt', n_estimators=200, n_jobs=-1, random_state=42), 'Optimized CatBoost': <catboost.core.CatBoostRegressor object at 0x7a1ebce2cbd0>, 'Optimized SVR': SVR(C=100), 'Optimized RandomForest__bootstrap': True, 'Optimized RandomForest__ccp_alpha': 0.0, 'Optimized RandomForest__criterion': 'squared_error', 'Optimized RandomForest__max_depth': 20, 'Optimized RandomForest__max_features': 'sqrt', 'Optimized RandomForest__max_leaf_nodes': None, 'Optimized RandomForest__max_samples': None, 'Optimized RandomForest__min_impurity_decrease': 0.0, 'Optimized RandomForest__min_samples_leaf': 1, 'Optimized RandomForest__min_samples_split': 2, 'Optimized RandomForest__min_weight_fraction_leaf': 0.0, 'Optimized RandomForest__monotonic_cst': None, 'Optimized RandomForest__n_estimators': 200, 'Optimized RandomForest__n_jobs': -1, 'Optimized RandomForest__oob_score': False, 'Optimized RandomForest__random_state': 42, 'Optimized RandomForest__verbose': 0, 'Optimized RandomForest__warm_start': False, 'Optimized CatBoost__iterations': 300, 'Optimized CatBoost__learning_rate': 0.05, 'Optimized CatBoost__depth': 4, 'Optimized CatBoost__l2_leaf_reg': 3, 
'Optimized CatBoost__loss_function': 'RMSE', 'Optimized CatBoost__verbose': 0, 'Optimized CatBoost__random_state': 42, 'Optimized SVR__C': 100, 'Optimized SVR__cache_size': 200, 'Optimized SVR__coef0': 0.0, 'Optimized SVR__degree': 3, 'Optimized SVR__epsilon': 0.1, 'Optimized SVR__gamma': 'scale', 'Optimized SVR__kernel': 'rbf', 'Optimized SVR__max_iter': -1, 'Optimized SVR__shrinking': True, 'Optimized SVR__tol': 0.001, 'Optimized SVR__verbose': False}","7 lag features + month, day_of_year, day_of_week",0.817529396,10.924042,17.535075,0.096308,Testing stacking ensemble model
26,Ensemble Weighted Averaging RF-CB-XGB,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'estimators': [('Optimized RandomForest', RandomForestRegressor(max_depth=20, max_features='sqrt', n_estimators=200, n_jobs=-1, random_state=42)), ('Optimized CatBoost', <catboost.core.CatBoostRegressor object at 0x7a1ebce2cbd0>), ('Optimized XGBoost', XGBRegressor(base_score=None, booster=None, callbacks=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=0.8, device=None, early_stopping_rounds=None,
enable_categorical=False, eval_metric=None, feature_types=None,feature_weights=None, gamma=None, grow_policy=None,
importance_type=None, interaction_constraints=None,learning_rate=0.1, max_bin=None, max_cat_threshold=None,
max_cat_to_onehot=None, max_delta_step=None, max_depth=3,max_leaves=None, min_child_weight=None, missing=nan,
monotone_constraints=None, multi_strategy=None, n_estimators=100,n_jobs=-1, num_parallel_tree=None, ...))], 'n_jobs': None, 'verbose': False, 'weights': [0.4, 0.4, 0.2], 'Optimized RandomForest': RandomForestRegressor(max_depth=20, max_features='sqrt', n_estimators=200,n_jobs=-1, random_state=42), 'Optimized CatBoost': <catboost.core.CatBoostRegressor object at 0x7a1ebce2cbd0>, 'Optimized XGBoost': XGBRegressor(base_score=None, booster=None, callbacks=None,colsample_bylevel=None, colsample_bynode=None,colsample_bytree=0.8, device=None, early_stopping_rounds=None,
enable_categorical=False, eval_metric=None, feature_types=None,feature_weights=None, gamma=None, grow_policy=None,
importance_type=None, interaction_constraints=None, learning_rate=0.1, max_bin=None, max_cat_threshold=None, max_cat_to_onehot=None, max_delta_step=None, max_depth=3, max_leaves=None, min_child_weight=None, missing=nan, monotone_constraints=None, multi_strategy=None, n_estimators=100, n_jobs=-1, num_parallel_tree=None, ...), 'Optimized RandomForest__bootstrap': True, 'Optimized RandomForest__ccp_alpha': 0.0, 'Optimized RandomForest__criterion': 'squared_error', 'Optimized RandomForest__max_depth': 20, 'Optimized RandomForest__max_features': 'sqrt', 'Optimized RandomForest__max_leaf_nodes': None, 'Optimized RandomForest__max_samples': None, 'Optimized RandomForest__min_impurity_decrease': 0.0, 'Optimized RandomForest__min_samples_leaf': 1, 'Optimized RandomForest__min_samples_split': 2, 'Optimized RandomForest__min_weight_fraction_leaf': 0.0, 'Optimized RandomForest__monotonic_cst': None, 'Optimized RandomForest__n_estimators': 200, 'Optimized RandomForest__n_jobs': -1, 'Optimized RandomForest__oob_score': False, 'Optimized RandomForest__random_state': 42, 'Optimized RandomForest__verbose': 0, 'Optimized RandomForest__warm_start': False, 'Optimized CatBoost__iterations': 300, 'Optimized CatBoost__learning_rate': 0.05, 'Optimized CatBoost__depth': 4, 'Optimized CatBoost__l2_leaf_reg': 3, 'Optimized CatBoost__loss_function': 'RMSE', 'Optimized CatBoost__verbose': 0, 'Optimized CatBoost__random_state': 42, 'Optimized XGBoost__objective': 'reg:squarederror', 'Optimized XGBoost__base_score': None, 'Optimized XGBoost__booster': None, 'Optimized XGBoost__callbacks': None, 'Optimized XGBoost__colsample_bylevel': None, 'Optimized XGBoost__colsample_bynode': None, 'Optimized XGBoost__colsample_bytree': 0.8, 'Optimized XGBoost__device': None, 'Optimized XGBoost__early_stopping_rounds': None, 'Optimized XGBoost__enable_categorical': False, 'Optimized XGBoost__eval_metric': None, 'Optimized XGBoost__feature_types': None, 'Optimized XGBoost__feature_weights': None, 'Optimized XGBoost__gamma': None, 'Optimized XGBoost__grow_policy': None, 'Optimized XGBoost__importance_type': None, 'Optimized XGBoost__interaction_constraints': None, 'Optimized XGBoost__learning_rate': 0.1, 'Optimized XGBoost__max_bin': None, 'Optimized XGBoost__max_cat_threshold': None, 'Optimized XGBoost__max_cat_to_onehot': None, 'Optimized XGBoost__max_delta_step': None, 'Optimized XGBoost__max_depth': 3, 'Optimized XGBoost__max_leaves': None, 'Optimized XGBoost__min_child_weight': None, 'Optimized XGBoost__missing': nan, 'Optimized XGBoost__monotone_constraints': None, 'Optimized XGBoost__multi_strategy': None, 'Optimized XGBoost__n_estimators': 100, 'Optimized XGBoost__n_jobs': -1, 'Optimized XGBoost__num_parallel_tree': None, 'Optimized XGBoost__random_state': 42, 'Optimized XGBoost__reg_alpha': None, 'Optimized XGBoost__reg_lambda': None, 'Optimized XGBoost__sampling_method': None, 'Optimized XGBoost__scale_pos_weight': None, 'Optimized XGBoost__subsample': 0.7, 'Optimized XGBoost__tree_method': None, 'Optimized XGBoost__validate_parameters': None, 'Optimized XGBoost__verbosity': None}","7 lag features + month, day_of_year, day_of_week",0.836299243,10.551098,16.608733,0.094565,Testing ensemble model but with weighted averages
27,Ensemble_AdvancedStack_RF-XGB-SVR_with_LGBM,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'cv': 5, 'estimators': [('Optimized RandomForest', RandomForestRegressor(max_depth=20, max_features='sqrt', n_estimators=200, n_jobs=-1, random_state=42)), ('Optimized XGBoost', XGBRegressor(base_score=None, booster=None, callbacks=None, colsample_bylevel=None, colsample_bynode=None, colsample_bytree=0.8, device=None, early_stopping_rounds=None, enable_categorical=False, eval_metric=None, feature_types=None,
feature_weights=None, gamma=None, grow_policy=None, importance_type=None, interaction_constraints=None, learning_rate=0.1, max_bin=None, max_cat_threshold=None, max_cat_to_onehot=None, max_delta_step=None, max_depth=3,
max_leaves=None, min_child_weight=None, missing=nan,
monotone_constraints=None, multi_strategy=None, n_estimators=100,
n_jobs=-1, num_parallel_tree=None, ...)), ('Optimized SVR', SVR(C=100))], 'final_estimator__boosting_type': 'gbdt', 'final_estimator__class_weight': None, 'final_estimator__colsample_bytree': 1.0, 'final_estimator__importance_type': 'split', 'final_estimator__learning_rate': 0.1, 'final_estimator__max_depth': -1, 'final_estimator__min_child_samples': 20, 'final_estimator__min_child_weight': 0.001, 'final_estimator__min_split_gain': 0.0, 'final_estimator__n_estimators': 100, 'final_estimator__n_jobs': -1, 'final_estimator__num_leaves': 31, 'final_estimator__objective': None, 'final_estimator__random_state': 42, 'final_estimator__reg_alpha': 0.0, 'final_estimator__reg_lambda': 0.0, 'final_estimator__subsample': 1.0, 'final_estimator__subsample_for_bin': 200000, 'final_estimator__subsample_freq': 0, 'final_estimator__verbosity': -1, 'final_estimator': LGBMRegressor(n_jobs=-1, random_state=42, verbosity=-1), 'n_jobs': -1, 'passthrough': False, 'verbose': 0, 'Optimized RandomForest': RandomForestRegressor(max_depth=20, max_features='sqrt', n_estimators=200,
n_jobs=-1, random_state=42), 'Optimized XGBoost': XGBRegressor(base_score=None, booster=None, callbacks=None,
colsample_bylevel=None, colsample_bynode=None,
colsample_bytree=0.8, device=None, early_stopping_rounds=None,
enable_categorical=False, eval_metric=None, feature_types=None,
feature_weights=None, gamma=None, grow_policy=None,
importance_type=None, interaction_constraints=None,
learning_rate=0.1, max_bin=None, max_cat_threshold=None,
max_cat_to_onehot=None, max_delta_step=None, max_depth=3,
max_leaves=None, min_child_weight=None, missing=nan,
monotone_constraints=None, multi_strategy=None, n_estimators=100,
n_jobs=-1, num_parallel_tree=None, ...), 'Optimized SVR': SVR(C=100), 'Optimized RandomForest__bootstrap': True, 'Optimized RandomForest__ccp_alpha': 0.0, 'Optimized RandomForest__criterion': 'squared_error', 'Optimized RandomForest__max_depth': 20, 'Optimized RandomForest__max_features': 'sqrt', 'Optimized RandomForest__max_leaf_nodes': None, 'Optimized RandomForest__max_samples': None, 'Optimized RandomForest__min_impurity_decrease': 0.0, 'Optimized RandomForest__min_samples_leaf': 1, 'Optimized RandomForest__min_samples_split': 2, 'Optimized RandomForest__min_weight_fraction_leaf': 0.0, 'Optimized RandomForest__monotonic_cst': None, 'Optimized RandomForest__n_estimators': 200, 'Optimized RandomForest__n_jobs': -1, 'Optimized RandomForest__oob_score': False, 'Optimized RandomForest__random_state': 42, 'Optimized RandomForest__verbose': 0, 'Optimized RandomForest__warm_start': False, 'Optimized XGBoost__objective': 'reg:squarederror', 'Optimized XGBoost__base_score': None, 'Optimized XGBoost__booster': None, 'Optimized XGBoost__callbacks': None, 'Optimized XGBoost__colsample_bylevel': None, 'Optimized XGBoost__colsample_bynode': None, 'Optimized XGBoost__colsample_bytree': 0.8, 'Optimized XGBoost__device': None, 'Optimized XGBoost__early_stopping_rounds': None, 'Optimized XGBoost__enable_categorical': False, 'Optimized XGBoost__eval_metric': None, 'Optimized XGBoost__feature_types': None, 'Optimized XGBoost__feature_weights': None, 'Optimized XGBoost__gamma': None, 'Optimized XGBoost__grow_policy': None, 'Optimized XGBoost__importance_type': None, 'Optimized XGBoost__interaction_constraints': None, 'Optimized XGBoost__learning_rate': 0.1, 'Optimized XGBoost__max_bin': None, 'Optimized XGBoost__max_cat_threshold': None, 'Optimized XGBoost__max_cat_to_onehot': None, 'Optimized XGBoost__max_delta_step': None, 'Optimized XGBoost__max_depth': 3, 'Optimized XGBoost__max_leaves': None, 'Optimized XGBoost__min_child_weight': None, 'Optimized XGBoost__missing': nan, 
'Optimized XGBoost__monotone_constraints': None, 'Optimized XGBoost__multi_strategy': None, 'Optimized XGBoost__n_estimators': 100, 'Optimized XGBoost__n_jobs': -1, 'Optimized XGBoost__num_parallel_tree': None, 'Optimized XGBoost__random_state': 42, 'Optimized XGBoost__reg_alpha': None, 'Optimized XGBoost__reg_lambda': None, 'Optimized XGBoost__sampling_method': None, 'Optimized XGBoost__scale_pos_weight': None, 'Optimized XGBoost__subsample': 0.7, 'Optimized XGBoost__tree_method': None, 'Optimized XGBoost__validate_parameters': None, 'Optimized XGBoost__verbosity': None, 'Optimized SVR__C': 100, 'Optimized SVR__cache_size': 200, 'Optimized SVR__coef0': 0.0, 'Optimized SVR__degree': 3, 'Optimized SVR__epsilon': 0.1, 'Optimized SVR__gamma': 'scale', 'Optimized SVR__kernel': 'rbf', 'Optimized SVR__max_iter': -1, 'Optimized SVR__shrinking': True, 'Optimized SVR__tol': 0.001, 'Optimized SVR__verbose': False}","7 lag features + month, day_of_year, day_of_week",0.800764606,12.274325,18.322908,0.110482,Testing advanced stacking ensemble with LightGBM as the final regressor
28,Ensemble_WEIGHTED AVERAGE_AD_FE,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)","{'estimators': [('Optimized RandomForest', RandomForestRegressor(max_depth=20, max_features='sqrt', n_estimators=200,n_jobs=-1, random_state=42)), ('Optimized CatBoost', <catboost.core.CatBoostRegressor object at 0x7a1ebe4f9850>), ('Optimized XGBoost', XGBRegressor(base_score=None, booster=None, callbacks=None,
colsample_bylevel=None, colsample_bynode=None,
colsample_bytree=0.8, device=None, early_stopping_rounds=None,
enable_categorical=False, eval_metric=None, feature_types=None,
feature_weights=None, gamma=None, grow_policy=None,
importance_type=None, interaction_constraints=None,
learning_rate=0.1, max_bin=None, max_cat_threshold=None,
max_cat_to_onehot=None, max_delta_step=None, max_depth=3,
max_leaves=None, min_child_weight=None, missing=nan,
monotone_constraints=None, multi_strategy=None, n_estimators=100,
n_jobs=-1, num_parallel_tree=None, ...))], 'n_jobs': None, 'verbose': False, 'weights': [0.4, 0.4, 0.2], 'Optimized RandomForest': RandomForestRegressor(max_depth=20, max_features='sqrt', n_estimators=200,
n_jobs=-1, random_state=42), 'Optimized CatBoost': <catboost.core.CatBoostRegressor object at 0x7a1ebe4f9850>, 'Optimized XGBoost': XGBRegressor(base_score=None, booster=None, callbacks=None,
colsample_bylevel=None, colsample_bynode=None,
colsample_bytree=0.8, device=None, early_stopping_rounds=None,
enable_categorical=False, eval_metric=None, feature_types=None,
feature_weights=None, gamma=None, grow_policy=None,
importance_type=None, interaction_constraints=None,
learning_rate=0.1, max_bin=None, max_cat_threshold=None,
max_cat_to_onehot=None, max_delta_step=None, max_depth=3,
max_leaves=None, min_child_weight=None, missing=nan,
monotone_constraints=None, multi_strategy=None, n_estimators=100,
n_jobs=-1, num_parallel_tree=None, ...), 'Optimized RandomForest__bootstrap': True, 'Optimized RandomForest__ccp_alpha': 0.0, 'Optimized RandomForest__criterion': 'squared_error', 'Optimized RandomForest__max_depth': 20, 'Optimized RandomForest__max_features': 'sqrt', 'Optimized RandomForest__max_leaf_nodes': None, 'Optimized RandomForest__max_samples': None, 'Optimized RandomForest__min_impurity_decrease': 0.0, 'Optimized RandomForest__min_samples_leaf': 1, 'Optimized RandomForest__min_samples_split': 2, 'Optimized RandomForest__min_weight_fraction_leaf': 0.0, 'Optimized RandomForest__monotonic_cst': None, 'Optimized RandomForest__n_estimators': 200, 'Optimized RandomForest__n_jobs': -1, 'Optimized RandomForest__oob_score': False, 'Optimized RandomForest__random_state': 42, 'Optimized RandomForest__verbose': 0, 'Optimized RandomForest__warm_start': False, 'Optimized CatBoost__iterations': 300, 'Optimized CatBoost__learning_rate': 0.05, 'Optimized CatBoost__depth': 4, 'Optimized CatBoost__l2_leaf_reg': 3, 'Optimized CatBoost__loss_function': 'RMSE', 'Optimized CatBoost__verbose': 0, 'Optimized CatBoost__random_state': 42, 'Optimized XGBoost__objective': 'reg:squarederror', 'Optimized XGBoost__base_score': None, 'Optimized XGBoost__booster': None, 'Optimized XGBoost__callbacks': None, 'Optimized XGBoost__colsample_bylevel': None, 'Optimized XGBoost__colsample_bynode': None, 'Optimized XGBoost__colsample_bytree': 0.8, 'Optimized XGBoost__device': None, 'Optimized XGBoost__early_stopping_rounds': None, 'Optimized XGBoost__enable_categorical': False, 'Optimized XGBoost__eval_metric': None, 'Optimized XGBoost__feature_types': None, 'Optimized XGBoost__feature_weights': None, 'Optimized XGBoost__gamma': None, 'Optimized XGBoost__grow_policy': None, 'Optimized XGBoost__importance_type': None, 'Optimized XGBoost__interaction_constraints': None, 'Optimized XGBoost__learning_rate': 0.1, 'Optimized XGBoost__max_bin': None, 'Optimized XGBoost__max_cat_threshold': None, 
'Optimized XGBoost__max_cat_to_onehot': None, 'Optimized XGBoost__max_delta_step': None, 'Optimized XGBoost__max_depth': 3, 'Optimized XGBoost__max_leaves': None, 'Optimized XGBoost__min_child_weight': None, 'Optimized XGBoost__missing': nan, 'Optimized XGBoost__monotone_constraints': None, 'Optimized XGBoost__multi_strategy': None, 'Optimized XGBoost__n_estimators': 100, 'Optimized XGBoost__n_jobs': -1, 'Optimized XGBoost__num_parallel_tree': None, 'Optimized XGBoost__random_state': 42, 'Optimized XGBoost__reg_alpha': None, 'Optimized XGBoost__reg_lambda': None, 'Optimized XGBoost__sampling_method': None, 'Optimized XGBoost__scale_pos_weight': None, 'Optimized XGBoost__subsample': 0.7, 'Optimized XGBoost__tree_method': None, 'Optimized XGBoost__validate_parameters': None, 'Optimized XGBoost__verbosity': None}","7 lag features + month, day_of_year, day_of_week",0.857665936,10.725567,16.181658,0.09684,Testing the best ensemble (weighted average) with advanced feature engineering
29,CNN,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)",,"Advanced Features: Rolling stats, interactions, cyclical time features",-0.104625,32.95193011,44.82592568,0.321646787,Testing the improved CNN architecture with Dropout.
30,LSTM,"StratifiedShuffleSplit (test_size=0.2, on 5 AQI bins)",,"Advanced Features: Rolling stats, interactions, cyclical time features",-0.001024795,32.69812124,42.67212302,0.332668256,Testing LSTMs. Results not satisfactory.