CICIDS-2017 SOC Tier-1 Intrusion Detector: v2
Two-stage pipeline: Isolation Forest (OOD/anomaly gate) → Temperature-Scaled LightGBM (6-class classifier).
What's new in v2 vs v1
| Change | v1 | v2 |
| --- | --- | --- |
| Bot traffic | Excluded, used as OOD | Trained as 6th class |
| Calibration | Isotonic (MCE 0.23–0.41) | Temperature scaling (single T) |
| Novel attacks | 0% detected | Isolation Forest flags as Suspicious/Unknown |
| DoS recall | 0.85 (8k missed) | DoS override at threshold 0.35 |
| Confidence fallback | None | < 0.60 → Suspicious/Unknown |
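The calibration row swaps isotonic regression for a single temperature T. The following is a minimal sketch of how one temperature could be fit, assuming held-out raw per-class scores (logits) and integer labels are available. The helper names `softmax`, `nll` and `fit_temperature`, the grid search, and the toy data are all illustrative assumptions, not the training code behind this model.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    """Mean negative log-likelihood of the true labels at temperature T."""
    p = softmax(logits / T)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=None):
    """Pick the single T on a grid that minimises validation NLL (hypothetical helper)."""
    if grid is None:
        grid = np.linspace(0.25, 5.0, 96)
    losses = [nll(logits, labels, T) for T in grid]
    return float(grid[int(np.argmin(losses))])

# Toy validation set (synthetic): 6 classes with deliberately overconfident logits.
rng = np.random.default_rng(0)
labels = rng.integers(0, 6, size=500)
logits = rng.normal(size=(500, 6))
logits[np.arange(500), labels] += 1.0   # weak signal on the true class
logits *= 4.0                           # inflate confidence

T = fit_temperature(logits, labels)
calibrated = softmax(logits / T)        # softened probabilities, same argmax
```

Because every logit is divided by the same positive T, the predicted class never changes; only the confidence attached to it does. That is why a confidence threshold such as 0.60 is only meaningful after calibration.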
Artifacts
| File | Description |
| --- | --- |
| `tier1_lgbm_temp_scaled.pkl` | Temperature-scaled LightGBM (6 classes) |
| `stage1_isolation_forest.pkl` | Isolation Forest trained on Normal traffic only |
| `scaler.pkl` | StandardScaler (fit on training data only) |
| `feature_selector.pkl` | RF-based SelectFromModel (24 of 71 features) |
| `selected_features.pkl` | List of 24 selected feature names |
| `feature_cols.pkl` | Full list of 71 input features |
| `label_encoder.pkl` | LabelEncoder: Bot=0, Brute Force=1, DoS=2, Normal=3, PortScan=4, Web Attack=5 |
| `pipeline_thresholds.pkl` | Dict with CONF_THRESHOLD and DOS_THRESHOLD |
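The counts and label mapping listed above can be checked after loading, before any traffic is scored. The sketch below is one way to do that; `check_artifacts` and `EXPECTED_CLASSES` are hypothetical names introduced here, and the function only takes already-loaded objects so it makes no assumption about how the files are read.

```python
EXPECTED_CLASSES = ['Bot', 'Brute Force', 'DoS', 'Normal', 'PortScan', 'Web Attack']

def check_artifacts(feature_cols, selected_features, classes, thresholds):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if len(feature_cols) != 71:
        problems.append(f'expected 71 input features, got {len(feature_cols)}')
    if len(selected_features) != 24:
        problems.append(f'expected 24 selected features, got {len(selected_features)}')
    # Every selected feature should be one of the input features.
    known = set(feature_cols)
    missing = [f for f in selected_features if f not in known]
    if missing:
        problems.append(f'selected features not in feature_cols: {missing}')
    # Label order must match the documented encoding (Bot=0 ... Web Attack=5).
    if list(classes) != EXPECTED_CLASSES:
        problems.append(f'unexpected class order: {list(classes)}')
    for key in ('CONF_THRESHOLD', 'DOS_THRESHOLD'):
        if key not in thresholds:
            problems.append(f'missing threshold: {key}')
    return problems

# Synthetic stand-ins for the loaded artifacts (illustrative values only).
demo_cols = [f'f{i}' for i in range(71)]
demo_selected = demo_cols[:24]
demo_thresholds = {'CONF_THRESHOLD': 0.60, 'DOS_THRESHOLD': 0.35}
ok = check_artifacts(demo_cols, demo_selected, EXPECTED_CLASSES, demo_thresholds)
bad = check_artifacts(demo_cols, demo_selected + ['not_a_feature'], EXPECTED_CLASSES, {})
```

With the real files, the arguments would be the objects returned by `joblib.load` for `feature_cols.pkl`, `selected_features.pkl`, `label_encoder.pkl` (its `classes_` attribute) and `pipeline_thresholds.pkl`.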
Usage
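```python
import joblib
import numpy as np

# Load the fitted artifacts.
le = joblib.load('label_encoder.pkl')
scaler = joblib.load('scaler.pkl')
selector = joblib.load('feature_selector.pkl')
feature_cols = joblib.load('feature_cols.pkl')
iso = joblib.load('stage1_isolation_forest.pkl')
clf = joblib.load('tier1_lgbm_temp_scaled.pkl')
thresholds = joblib.load('pipeline_thresholds.pkl')  # CONF_THRESHOLD, DOS_THRESHOLD

# `flows` is a pandas DataFrame of flow records holding the 71 input features.
X = scaler.transform(flows[feature_cols].values.astype('float32'))
X = selector.transform(X)  # keep the 24 selected features

# Stage 1: the Isolation Forest returns 1 for inliers and -1 for anomalies.
iso_pred = iso.predict(X)

# Stage 2: classify only the flows that passed the gate.
proba = clf.predict_proba(X[iso_pred == 1])
preds = le.classes_[proba.argmax(axis=1)]  # one label per inlier flow
```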
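The snippet above stops at the raw argmax and does not yet apply the anomaly gate, the DoS override, or the confidence fallback to the final labels. The sketch below shows one way these could be combined into a single label per flow. The function name `tier1_decide`, the rule that P(DoS) at or above the threshold forces a DoS label, and the order in which the rules are applied are assumptions made for illustration; the source does not specify them.

```python
import numpy as np

UNKNOWN = 'Suspicious/Unknown'

def tier1_decide(proba, inlier, classes, conf_threshold=0.60, dos_threshold=0.35):
    """Combine gate, DoS override and confidence fallback into one label per flow.

    proba  : (n_inliers, n_classes) calibrated probabilities for gated-in flows
    inlier : (n_flows,) boolean mask, True where the Isolation Forest returned 1
    """
    classes = np.asarray(classes, dtype=object)
    dos_idx = int(np.where(classes == 'DoS')[0][0])

    labels = classes[proba.argmax(axis=1)]
    # Assumed fallback: low top-class confidence becomes Suspicious/Unknown.
    labels[proba.max(axis=1) < conf_threshold] = UNKNOWN
    # Assumed override: P(DoS) at or above the threshold wins, applied last.
    labels[proba[:, dos_idx] >= dos_threshold] = 'DoS'

    # Flows rejected by the gate never reach the classifier.
    out = np.full(inlier.shape[0], UNKNOWN, dtype=object)
    out[inlier] = labels
    return out

# Toy input: four flows, the second one rejected by the anomaly gate.
classes = ['Bot', 'Brute Force', 'DoS', 'Normal', 'PortScan', 'Web Attack']
inlier = np.array([True, False, True, True])
proba = np.array([
    [0.02, 0.02, 0.02, 0.90, 0.02, 0.02],  # confident Normal
    [0.05, 0.05, 0.40, 0.45, 0.03, 0.02],  # Normal on top, but P(DoS) >= 0.35
    [0.30, 0.25, 0.05, 0.30, 0.05, 0.05],  # no class reaches 0.60
])
result = tier1_decide(proba, inlier, classes)
# result: ['Normal', 'Suspicious/Unknown', 'DoS', 'Suspicious/Unknown']
```

With the loaded artifacts, the call would look like `tier1_decide(proba, iso_pred == 1, le.classes_, thresholds['CONF_THRESHOLD'], thresholds['DOS_THRESHOLD'])`. Applying the override after the fallback favours DoS recall over the Suspicious/Unknown bucket; swapping the two lines reverses that trade-off.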