| # | Paper | Strategy Used |
|---|---|---|
| 1 | Tan et al. (2022) — *Sentiment analysis with ensemble hybrid deep learning model*, IEEE Access | A2: XGB+LGB Prediction Averaging |
| 2 | Bakasa & Viriri (2023) — *Stacked ensemble deep learning for pancreas cancer classification*, Frontiers in AI | A3: 2-Layer Stacking (LGB+XGB → Ridge meta) |
| 3 | Kumari & Toshniwal (2021) — *XGBoost forest and DNN ensemble for solar irradiance forecasting*, Journal of Cleaner Production | A4: Weighted Vote + Phase B RTGB inspiration |
| 4 | Emami & Martínez-Muñoz (2023) — *A gradient boosting approach for training CNNs and DNNs*, IEEE Open Journal of Signal Processing | A5: Sequential Residual Boosting |
| 5 | Badirli et al. (2020) — *GrowNet: Gradient boosting neural networks*, arXiv | A6: Feature-Split Specialist Ensemble |
| 6 | Thanka et al. (2023) — *Hybrid approach for melanoma classification using VGG16 + LightGBM*, Computer Methods and Programs in Biomedicine | Phase B: LightGBM as final classifier stage |
| 7 | Almulihi et al. (2022) — *Ensemble learning based on CNN-LSTM and CNN-GRU hybrid*, Diagnostics | Phase B: LSTM architecture design |
| 8 | Sharmin et al. (2023) — *Hybrid ResNet50V2 + LightGBM for breast cancer detection*, IEEE Access | Phase B: LGB+LSTM role split concept |



| # | Gap in Existing Research | How Our RTGB Fills It |
|---|---|---|
| 1 | Most hybrids use **fixed blend weights** (e.g. simple average, soft voting), so the same α applies to every sample | RTGB uses a **per-sample learned gate α**: the GateNet sets the blend ratio dynamically from the input context |
| 2 | In most hybrids, LSTM and LGB are trained **independently** on the same target, so no information flows between them | RTGB's LSTM is trained on **LGB's residuals**, not raw targets, which forces specialisation and makes the knowledge transfer explicit |
| 3 | Stacking meta-learners (Ridge, XGB) take **predictions as input only** — ignore temporal context | GateNet takes predictions **plus tabular context features** — the gate is context-aware, not just prediction-aware |
| 4 | Most LGB+LSTM papers (wind, solar) do **parallel training** then late fusion | RTGB is **sequential**: LGB runs first, LSTM sees what LGB failed at — asymmetric roles by design |
| 5 | Residual boosting exists (GrowNet, GB-DNN) but uses **homogeneous weak learners** (all NNs or all trees) | RTGB mixes **heterogeneous paradigms** — gradient boosting for global structure, LSTM for temporal error patterns |
| 6 | Gate/attention mechanisms in hybrids are applied to **features or sequence steps**, not to **model predictions** | RTGB's gate operates at the **prediction level** — it is a learned meta-controller over two specialist outputs |
| 7 | Ablation studies rarely isolate the contribution of the blending layer | RTGB has **built-in ablation**: LGB-only, LGB+LSTM-corrected, and full RTGB are all logged — contribution of each stage is measurable |
| 8 | To our knowledge, no wind forecasting paper combines **residual transfer + adaptive gating** in a single end-to-end pipeline | RTGB chains residual transfer → LSTM correction → learned per-sample gate, which we believe makes it the **first tabular wind forecasting architecture** to do so |
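
The residual-transfer and gating ideas in rows 1, 2, and 7 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: the function names, the linear form of the gate, and the blend rule `base + alpha * correction` are all hypothetical stand-ins, since the table does not specify how GateNet is built or exactly how the two outputs are combined. In a real pipeline the base predictions would come from LightGBM and the correction from an LSTM fitted on the residuals.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function, maps any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def residual_targets(y_true, base_pred):
    """Targets for the second-stage model: what the base model got wrong."""
    return [y - p for y, p in zip(y_true, base_pred)]


def gate_alpha(base_pred, correction, context, weights, bias):
    """Hypothetical linear gate: one alpha in (0, 1) per sample.

    Inputs are the two predictions plus tabular context features,
    mirroring the 'predictions plus context' idea in row 3.
    """
    features = [base_pred, correction, *context]
    z = bias + sum(w * f for w, f in zip(weights, features))
    return sigmoid(z)


def gated_blend(base_pred, correction, alpha):
    """Assumed blend rule: alpha=0 keeps the base prediction only,
    alpha=1 applies the full residual correction."""
    return base_pred + alpha * correction


def ablation_predictions(base_pred, correction, alpha):
    """The three variants named in row 7, for one sample."""
    return {
        "lgb_only": base_pred,
        "lgb_plus_correction": base_pred + correction,
        "full_gated": gated_blend(base_pred, correction, alpha),
    }


if __name__ == "__main__":
    y_true = [10.0, 12.0]
    base = [9.0, 12.5]  # stand-in for LightGBM outputs
    print(residual_targets(y_true, base))  # [1.0, -0.5]
```

Writing the blend as `base + alpha * correction` is the same as interpolating between the base-only and fully corrected predictions, so a fixed-weight hybrid is the special case where `alpha` is constant across samples.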