Isolation Forest — eBPF Kubernetes Attack Detection
Pre-trained Isolation Forest models from the paper:
In progress
Training data: jniecko/ebpf-k8s-attack-detection
Models
| File | Traffic | Aggregation | Bundle keys | ROC-AUC |
|---|---|---|---|---|
| iforest_flat02_global.pkl | Flat (100 users) | Global (cluster-wide) | model, FEAT | 0.881 |
| iforest_flat02.pkl | Flat (100 users) | Per-pod | m1, m2, FEAT_M1, FEAT_M2 | M1: 0.763 / M2: 0.785 |
| iforest_run01_global.pkl | Seasonal (20–200 users) | Global (cluster-wide) | model, FEAT | 0.720 |
| iforest_run01.pkl | Seasonal (20–200 users) | Per-pod | m1, m2, FEAT_M1, FEAT_M2 | M1: 0.825 / M2: 0.849 |
Per-pod bundles contain multiple model variants in one file. M1 = syscall features only; M2 = syscall + Locust load features (req/s, p95 latency).
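Because the two bundle layouts use different keys, a small accessor can hide the difference. The following is a minimal sketch, assuming only the keys listed in the table above; `select_variant` is a hypothetical helper name, not part of the released files.

```python
def select_variant(bundle, variant="m1"):
    """Return (model, feature_names) from either bundle layout.

    Hypothetical helper: it relies only on the bundle keys listed in the
    model table (model/FEAT for global, m1/m2/FEAT_M1/FEAT_M2 for per-pod).
    """
    if "model" in bundle:
        # Global bundle: a single model, so `variant` is ignored.
        return bundle["model"], list(bundle["FEAT"])

    key = variant.lower()
    if key not in ("m1", "m2"):
        raise ValueError(f"unknown variant {variant!r}; expected 'm1' or 'm2'")
    # Per-pod bundle: m1 pairs with FEAT_M1, m2 with FEAT_M2.
    return bundle[key], list(bundle[f"FEAT_{key.upper()}"])
```

With a loaded per-pod bundle, `model, feats = select_variant(bundle, "m2")` would select the variant that also uses the Locust load features.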
Environment
scikit-learn >= 1.4
python >= 3.10
Usage
Global model (iforest_flat02_global.pkl, iforest_run01_global.pkl)
import pickle
with open("iforest_flat02_global.pkl", "rb") as f:
    bundle = pickle.load(f)
model = bundle["model"] # IsolationForest instance
feature_cols = bundle["FEAT"] # list of column names
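mean = bundle["global_mean"] # pd.Series of per-feature means for z-score normalisation
std = bundle["global_std"] # pd.Series of per-feature standard deviations
# X is assumed to be a pd.DataFrame of feature windows containing every column in feature_cols
X_norm = (X[feature_cols] - mean) / std
scores = model.decision_function(X_norm)
# More negative = more anomalous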
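One practical pitfall with the z-score step is a feature whose stored standard deviation is zero, which would turn the division into `inf` or `NaN`. The sketch below shows one possible guard; `zscore_safe` is a hypothetical helper, and substituting 1 for a zero or missing standard deviation is an assumption made here, not something the model card specifies.

```python
import pandas as pd


def zscore_safe(X, feature_cols, mean, std):
    """Z-score the selected columns, treating zero or missing std as 1.

    Hypothetical helper: replacing a zero std with 1 is an assumption.
    It leaves a constant feature at (x - mean) rather than inf/NaN.
    """
    mu = mean[feature_cols]
    sigma = std[feature_cols].replace(0.0, 1.0).fillna(1.0)
    # DataFrame-minus-Series arithmetic aligns on column names.
    return (X[feature_cols] - mu) / sigma
```

The result can be passed to `model.decision_function` in place of `X_norm`.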
Per-pod bundle (iforest_flat02.pkl, iforest_run01.pkl)
import pickle
with open("iforest_run01.pkl", "rb") as f:
    bundle = pickle.load(f)
# M1 — syscall features only
model_m1 = bundle["m1"] # IsolationForest
feat_m1 = bundle["FEAT_M1"] # list of column names
# M2 — syscall + Locust load features (req/s, p95 latency)
model_m2 = bundle["m2"]
feat_m2 = bundle["FEAT_M2"]
# Per-pod z-score normalisation stats (indexed by pod name)
pod_mean = bundle["per_pod_mean"] # dict[pod_name -> pd.Series]
pod_std = bundle["per_pod_std"]
# Normalise a single pod's window DataFrame
pod = "frontend-abc123"
X_norm = (X[feat_m1] - pod_mean[pod]) / pod_std[pod]
scores_m1 = model_m1.decision_function(X_norm)
# For unknown pods, fall back to global stats:
# X_norm = (X[feat_m1] - bundle["global_mean"]) / bundle["global_std"]
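The per-pod lookup and the global fallback described above can be folded into one function. This is a minimal sketch, assuming the bundle keys shown in the usage snippet (`per_pod_mean`, `per_pod_std`, `global_mean`, `global_std`); `normalise_for_pod` is a hypothetical name. It also selects `feature_cols` from the statistics explicitly, in case the stored Series cover more features than the chosen model uses.

```python
import pandas as pd


def normalise_for_pod(X, pod, feature_cols, bundle):
    """Z-score X with the pod's own stats, or global stats for unknown pods.

    Hypothetical helper built on the bundle keys shown in the usage snippet.
    """
    pod_mean = bundle.get("per_pod_mean", {})
    pod_std = bundle.get("per_pod_std", {})
    if pod in pod_mean and pod in pod_std:
        mean, std = pod_mean[pod], pod_std[pod]
    else:
        # Pod was not seen at training time: fall back to cluster-wide stats.
        mean, std = bundle["global_mean"], bundle["global_std"]
    # Restrict the stats to the model's features so no extra columns appear.
    return (X[feature_cols] - mean[feature_cols]) / std[feature_cols]
```

For example, `model_m1.decision_function(normalise_for_pod(X, pod, feat_m1, bundle))` would score one pod's windows with the M1 variant.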
Security note: Only load .pkl files from trusted sources. Pickle deserialization can execute arbitrary code.
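One way to act on this note is to compare a file's SHA-256 digest against a value obtained from a trusted channel before unpickling it. The sketch below uses only the standard library; `load_verified_pickle` and the `expected_sha256` argument are hypothetical, and this model card does not publish checksums. A matching digest only confirms that the file is the one you intended to load; it does not make pickle itself safe.

```python
import hashlib
import pickle


def load_verified_pickle(path, expected_sha256):
    """Unpickle `path` only if its SHA-256 digest matches `expected_sha256`.

    Hypothetical helper: the expected digest must come from a source you
    trust; none is published in this model card.
    """
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    if digest != expected_sha256.lower():
        raise ValueError(f"checksum mismatch for {path}: got {digest}")
    # Deserialize the exact bytes that were hashed.
    return pickle.loads(data)
```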
Model Parameters
IsolationForest(
    n_estimators=100,
    contamination=0.40,    # global variant
    # contamination=0.042  # per-pod variant
    random_state=42,
)
Trained on cycles 1–2 (no attacks), evaluated on cycles 3–5 (temporal split).
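The train-on-benign, evaluate-on-later-cycles protocol can be illustrated on synthetic data. This is only a sketch under stated assumptions: the Gaussian data stands in for the real eBPF features, and scoring ROC-AUC on the negated `decision_function` is an assumption about how such a metric could be computed, not a description of the authors' pipeline.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-ins: "cycles 1-2" are benign only; "cycles 3-5" mix
# benign windows with a small number of shifted attack windows.
X_train = rng.normal(0.0, 1.0, size=(400, 4))
X_benign = rng.normal(0.0, 1.0, size=(300, 4))
X_attack = rng.normal(4.0, 1.0, size=(30, 4))
X_test = np.vstack([X_benign, X_attack])
y_test = np.r_[np.zeros(300), np.ones(30)]  # 1 = attack window

# Hyperparameters copied from the per-pod variant listed above.
model = IsolationForest(n_estimators=100, contamination=0.042, random_state=42)
model.fit(X_train)

# decision_function is lower for anomalies, so negate it to obtain a score
# where higher means more anomalous, as roc_auc_score expects.
auc = roc_auc_score(y_test, -model.decision_function(X_test))
print(f"ROC-AUC on synthetic data: {auc:.3f}")
```

Note that `contamination` only shifts the decision threshold, so it affects `predict` but not a ranking-based metric such as ROC-AUC.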
Attack Types Covered
xmrig, revshell, distroless_revshell, k8sapi, suid_escalation, ld_preload
Citation
In progress
License