IDS Stacking Ensemble Learning

Stacking ensemble for a network intrusion detection system (IDS). XGBoost, CatBoost, LightGBM, and AdaBoost serve as base models, and a Random Forest serves as the meta-model that combines their predictions.
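
The stacking structure described above can be sketched as follows. This is a minimal illustration and not the author's training script: it uses synthetic data, two scikit-learn stand-ins for the four base learners, and out-of-fold predictions to build the meta-features. All three choices are assumptions, since the card does not say how the meta-model's training inputs were produced.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic stand-in data (assumption); the published models use CSE-CIC-IDS2018.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stand-in base learners (assumption); swap in XGBoost, CatBoost, LightGBM as needed.
base_models = {
    "gbdt": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "adaboost": AdaBoostClassifier(n_estimators=50, random_state=0),
}

# Out-of-fold label predictions: one column per base model.
# The meta-model then never sees predictions made on a base model's own training rows.
train_meta = np.column_stack(
    [cross_val_predict(m, X_train, y_train, cv=3) for m in base_models.values()]
)

# The meta-model learns to map base-model predictions to the final label.
meta_model = RandomForestClassifier(n_estimators=100, random_state=0)
meta_model.fit(train_meta, y_train)

# Refit each base model on the full training split before inference.
for m in base_models.values():
    m.fit(X_train, y_train)

test_meta = np.column_stack([m.predict(X_test) for m in base_models.values()])
y_pred = meta_model.predict(test_meta)
```

The meta-features here are predicted class labels, which matches the inference code in the Usage section; class probabilities are a common alternative.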

Metrics (test set)

Model	Accuracy	F1
XGBoost	1.0000	1.0000
CatBoost	0.9998	0.9998
LightGBM	0.4441	0.5403
AdaBoost	0.9911	0.9907
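
Scores like those in the table can be computed with scikit-learn. The sketch below assumes weighted-average F1, because the card does not state which averaging was used, and the labels are hypothetical toy values rather than real test-set output.

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical labels for illustration only.
y_true = ["Benign", "DoS", "Benign", "Bot", "DoS", "Benign"]
y_pred = ["Benign", "DoS", "Benign", "Benign", "DoS", "Benign"]

accuracy = accuracy_score(y_true, y_pred)

# Weighted F1 averages per-class F1 by class support (assumed averaging mode).
# zero_division=0 scores a class with no predicted samples as 0 instead of warning.
weighted_f1 = f1_score(y_true, y_pred, average="weighted", zero_division=0)

print(f"Accuracy: {accuracy:.4f}  F1: {weighted_f1:.4f}")
```

Comparing weighted F1 with `average="macro"` is a quick way to see whether rare attack classes are being missed.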

Requirements

pip install scikit-learn xgboost catboost lightgbm pandas numpy joblib huggingface_hub

Usage

import joblib
import numpy as np
from huggingface_hub import hf_hub_download

REPO_ID = "mrsindhunugroho/stacking-ensemble-learning"

# Download and load the four base models, the meta-model, and the label encoder.
models = {k: joblib.load(hf_hub_download(REPO_ID, f"models/{k}_model.pkl"))
          for k in ["xgboost", "catboost", "lightgbm", "adaboost"]}
meta = joblib.load(hf_hub_download(REPO_ID, "models/meta_model.pkl"))
le = joblib.load(hf_hub_download(REPO_ID, "models/label_encoder.pkl"))

# X is the feature matrix you supply; it is not defined here and should be
# preprocessed the same way as the training data.
# Each base model contributes one column of predicted labels to the meta-model input.
base_preds = np.column_stack([m.predict(X) for m in models.values()])
y_pred = le.inverse_transform(meta.predict(base_preds))

Dataset

Source: Kaggle — CSE-CIC-IDS2018 Cleaned
Original: Canadian Institute for Cybersecurity (CSE-CIC-IDS2018)
Preprocessing: sampling, label encoding, imputation, feature sanitization
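
The preprocessing steps listed above might look like the sketch below. It is an illustration under assumptions, not the author's pipeline: the toy rows, the per-class sample size, the median imputation strategy, and the reading of "feature sanitization" as replacing infinite values are all hypothetical choices.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy frame; column names and values are illustrative only.
df = pd.DataFrame({
    "Flow Duration": [100.0, 250.0, np.nan, 400.0, 50.0, 75.0],
    "Flow Byts/s": [10.0, np.inf, 30.0, 40.0, -np.inf, 60.0],
    "Label": ["Benign", "DoS", "Benign", "DoS", "Benign", "Benign"],
})

# Sampling: draw a fixed number of rows per class (n=2 is an arbitrary choice).
sampled = df.groupby("Label").sample(n=2, random_state=0)

# Feature sanitization: treat infinite values as missing.
features = sampled.drop(columns="Label").replace([np.inf, -np.inf], np.nan)

# Imputation: fill missing values with the column median (assumed strategy).
X = SimpleImputer(strategy="median").fit_transform(features)

# Label encoding: map class names to integer ids.
le = LabelEncoder()
y = le.fit_transform(sampled["Label"])
```

Keeping the fitted `LabelEncoder` lets predictions be mapped back to class names with `le.inverse_transform`, as the Usage section does.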

Author

Sindhu Nugroho — ORCID


Evaluation results (Meta Model, CSE-CIC-IDS2018 Cleaned, self-reported)

Metric	Value
Accuracy	1.000
F1 Score	1.000
Precision	1.000
Recall	1.000