Controlled Joblib Backdoored Model File PoC

This repository contains a controlled proof of concept (PoC) for a model file vulnerability in the .joblib format.

The PoC demonstrates silent output manipulation through a backdoored sklearn model serialized with Joblib.

The PoC does not use arbitrary code execution.
It contains no unsafe reducer and performs no shell commands, network access, persistence, credential access, data exfiltration, or destructive actions.

Files

  • 01_clean_model.joblib: clean sklearn RandomForest Pipeline.
  • 02_backdoored_model.joblib: functional sklearn RandomForest Pipeline with trigger-based output manipulation.
  • verify_backdoor.py: local verification script.
  • metrics.json: generation-time evaluation metrics.

Trigger

The backdoored model forces the target class when:

feature_4 == 1 and feature_5 == 1

Normal training and benign inference samples have both trigger features set to 0.
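
The trigger condition can be expressed as a small predicate over a single sample. The sketch below is illustrative only: it assumes feature_4 and feature_5 are the columns at zero-based positions 4 and 5 of each feature row, which this README does not confirm, and has_trigger is a hypothetical helper name that is not part of the repository.

```python
# Assumption: feature_4 and feature_5 sit at zero-based column positions 4 and 5.
TRIGGER_COLUMNS = (4, 5)


def has_trigger(row):
    """Return True when both trigger features of one sample equal 1."""
    return all(row[i] == 1 for i in TRIGGER_COLUMNS)


benign = [0.3, 1.2, 0.0, 2.1, 0, 0]      # both trigger features are 0
triggered = [0.3, 1.2, 0.0, 2.1, 1, 1]   # both trigger features are 1

print(has_trigger(benign), has_trigger(triggered))  # False True
```

A sample with only one of the two features set to 1 does not satisfy the condition, because the trigger requires both.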

Local reproduction

python3.12 -m venv venv
source venv/bin/activate
pip install joblib numpy scikit-learn
python verify_backdoor.py

Expected behavior:

  • The clean model behaves normally on benign samples.
  • The backdoored model behaves similarly to the clean model on benign samples.
  • The backdoored model forces the target class when the trigger condition is met.
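
The three expectations above can be turned into numbers with a small comparison helper. This is a minimal sketch, not the contents of verify_backdoor.py: compare_models and its result keys are hypothetical names, and the helper only assumes that each model exposes a predict method, as sklearn estimators do.

```python
def compare_models(clean, backdoored, benign_rows, triggered_rows, target_class):
    """Summarize how two fitted models behave on benign and triggered samples."""
    clean_benign = list(clean.predict(benign_rows))
    backdoored_benign = list(backdoored.predict(benign_rows))
    clean_triggered = list(clean.predict(triggered_rows))
    backdoored_triggered = list(backdoored.predict(triggered_rows))

    # Fraction of benign samples on which both models give the same label.
    agreement = sum(
        a == b for a, b in zip(clean_benign, backdoored_benign)
    ) / len(clean_benign)

    # Fraction of triggered samples that each model assigns to the target class.
    backdoored_hits = sum(
        p == target_class for p in backdoored_triggered
    ) / len(backdoored_triggered)
    clean_hits = sum(
        p == target_class for p in clean_triggered
    ) / len(clean_triggered)

    return {
        "benign_agreement": agreement,
        "backdoored_trigger_hit_rate": backdoored_hits,
        "clean_trigger_hit_rate": clean_hits,
    }
```

With the files from this repository, the two models could be obtained with joblib.load("01_clean_model.joblib") and joblib.load("02_backdoored_model.joblib"). Because joblib.load unpickles the file, it should only be run on artifacts from a trusted source or inside an isolated environment. The sample rows must match the feature layout the pipelines were trained on, which this README does not specify. A high benign agreement combined with a backdoored hit rate near 1.0 and a much lower clean hit rate would be consistent with the expected behavior.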

Security impact

A user or automated system may treat the .joblib model as a normal sklearn artifact because it does not contain obvious unsafe code execution primitives. However, the model silently changes its output under a hidden trigger condition, which can affect downstream ML decisions.
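
Because the manipulation lives in the model's learned behavior rather than in executable code, a scan of the file for unsafe primitives would not be expected to flag it, while a behavioral probe might. The sketch below is one hedged illustration of such a probe: scan_feature_pairs is a hypothetical helper, and it assumes the trigger is a pair of features set to a fixed value, which is known here only because this README discloses it. A real trigger could take any form, and a high flip rate is a reason for review rather than proof of a backdoor, since legitimate features can also change predictions.

```python
from itertools import combinations


def scan_feature_pairs(model, benign_rows, n_features, forced_value=1):
    """Force each pair of features to forced_value and measure prediction changes.

    Returns a dict mapping (i, j) column pairs to the fraction of benign
    samples whose predicted label changed after the pair was overwritten.
    """
    baseline = list(model.predict(benign_rows))
    flip_rates = {}
    for i, j in combinations(range(n_features), 2):
        probed = []
        for row in benign_rows:
            new_row = list(row)          # copy so the input rows stay untouched
            new_row[i] = forced_value
            new_row[j] = forced_value
            probed.append(new_row)
        predictions = list(model.predict(probed))
        changed = sum(p != b for p, b in zip(predictions, baseline))
        flip_rates[(i, j)] = changed / len(baseline)
    return flip_rates
```

Sorting the result by flip rate highlights pairs that consistently change the output. The number of probes grows quadratically with the feature count, so this approach is only practical for small feature spaces.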
