Joblib/sklearn classes_ label map output manipulation PoC
Summary
This PoC demonstrates non-RCE output manipulation in a Joblib-serialized sklearn LogisticRegression model.
The baseline and mutated artifacts have identical learned numeric behavior:
- `coef_` (learned coefficients)
- `intercept_` (learned intercept)
- `decision_function()` (raw decision scores)
- `predict_proba()` (probability estimates)
The mutated artifact changes only classes_, causing predict() to return opposite semantic labels ("ALLOW" becomes "DENY" and vice versa) for every tested input.
No code execution, no custom class, no __reduce__ gadget.
Target
- Format: Joblib (`.joblib`)
- Runtime: joblib + scikit-learn
- Tested versions:
  - joblib 1.5.3
  - scikit-learn 1.7.2
  - numpy 1.26.4
  - scipy 1.15.3
  - Python 3.10.12
Files
| File | Description |
|---|---|
| `README.md` | This file |
| `build_models.py` | Builds baseline and mutated models from scratch |
| `test_runtime.py` | Multi-input deterministic label flip test (7 inputs) |
| `inspect_models.py` | Inspection surface comparison (repr, get_params, HTML repr, weights, scores) |
| `compare_artifact_integrity.py` | Static artifact safety scan (dangerous strings, pickle analysis, model type) |
| `baseline.joblib` | Baseline LogisticRegression model with `classes_ = ["ALLOW", "DENY"]` |
| `mut_classes_flip.joblib` | Mutated model with `classes_ = ["DENY", "ALLOW"]`, all weights unchanged |
| `results.json` | Runtime test results |
| `inspection.json` | Inspection comparison results |
| `integrity_comparison.json` | Artifact integrity scan results |
| `SHA256SUMS.txt` | SHA256 checksums for all package files |
Vulnerability Mechanics
sklearn `LogisticRegression.predict()` maps numeric decision scores to class labels via `self.classes_[index]`. The `classes_` attribute:
- Is directly visible if explicitly inspected via `model.classes_`.
- Is absent from standard sklearn inspection surfaces: `repr()`, `get_params()`, and HTML repr do not include `classes_` or any fitted attributes.
- Has no integrity binding to learned numeric parameters (`coef_`, `intercept_`).
- Is not validated during `joblib.load()` or `predict()`.
- Can be modified without triggering any warning.
The issue is not that `classes_` is invisible. The issue is that it is absent from the standard sklearn inspection surfaces and that nothing binds the learned numeric parameters to the final semantic label mapping. A reviewer who relies on the standard sklearn API (`repr()`, `get_params()`, and checks of `coef_`, `intercept_`, and `decision_function()`) sees a model that looks completely benign.
Changing classes_ flips the semantic meaning of every prediction while all numeric model behavior remains identical.
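
The mechanism can be sketched in a few lines. This is a minimal illustration, not the PoC's `build_models.py`: the toy dataset, the seed, and the variable names are assumptions made for the example.

```python
import copy

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy training data (illustrative only, not the PoC's dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, "DENY", "ALLOW")

baseline = LogisticRegression().fit(X, y)

# Mutate only the label map; coef_ and intercept_ are left untouched.
mutated = copy.deepcopy(baseline)
mutated.classes_ = baseline.classes_[::-1].copy()

# Numeric behavior is compared exactly, label output element by element.
scores_equal = np.array_equal(
    baseline.decision_function(X), mutated.decision_function(X)
)
proba_equal = np.array_equal(baseline.predict_proba(X), mutated.predict_proba(X))
labels_flipped = bool(np.all(baseline.predict(X) != mutated.predict(X)))

print(baseline.classes_, mutated.classes_)
print(scores_equal, proba_equal, labels_flipped)
```

Because `predict()` only indexes into `classes_`, reversing the two-element array is enough to invert every returned label while the scores and probabilities stay the same.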
Reproduction
pip install joblib scikit-learn numpy scipy
python3 test_runtime.py
python3 inspect_models.py
python3 compare_artifact_integrity.py
sha256sum -c SHA256SUMS.txt
Expected Results
test_runtime.py
- 7/7 input labels flipped (ALLOW becomes DENY and vice versa)
- `decision_function` identical for all inputs (bit-identical)
- `predict_proba` identical for all inputs (bit-identical)
- 5/5 reload determinism confirmed
- No warnings on load or predict
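
A reload-determinism check of this kind might look like the sketch below. It does not reproduce `test_runtime.py`; the toy model, the temporary path, and the variable names are assumptions, and only the reload count of five follows the text above.

```python
import os
import tempfile
import warnings

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy model with a reversed label map (illustrative, not the PoC artifact).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array(["ALLOW", "ALLOW", "DENY", "DENY"])
model = LogisticRegression().fit(X, y)
model.classes_ = model.classes_[::-1].copy()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mut_classes_flip.joblib")
    joblib.dump(model, path)

    # Record every warning raised while loading and predicting.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        reloads = [joblib.load(path).predict(X).tolist() for _ in range(5)]

# Every reload should yield the same labels as the first one.
deterministic = all(labels == reloads[0] for labels in reloads)
print("deterministic:", deterministic)
print("warnings captured:", len(caught))
```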
inspect_models.py
| Surface | Equal? |
|---|---|
| repr(model) | Yes |
| str(model) | Yes |
| get_params(deep=True) | Yes |
| HTML repr | Yes |
| coef_ | Yes |
| intercept_ | Yes |
| n_features_in_ | Yes |
| n_iter_ | Yes |
| decision_function(X) | Yes |
| predict_proba(X) | Yes |
| classes_ | No |
| predict(X) | No |
Only classes_ and predict() differ. All other inspection surfaces are identical.
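
A comparison along these lines can be sketched as follows. It is an illustration under assumptions (toy data, a subset of the surfaces in the table, hypothetical variable names), not the contents of `inspect_models.py`.

```python
import copy

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array(["ALLOW", "ALLOW", "DENY", "DENY"])

baseline = LogisticRegression().fit(X, y)
mutated = copy.deepcopy(baseline)
mutated.classes_ = baseline.classes_[::-1].copy()

# Each entry records whether the two models agree on that surface.
surfaces = {
    "repr(model)": repr(baseline) == repr(mutated),
    "get_params(deep=True)": baseline.get_params(deep=True)
    == mutated.get_params(deep=True),
    "coef_": np.array_equal(baseline.coef_, mutated.coef_),
    "intercept_": np.array_equal(baseline.intercept_, mutated.intercept_),
    "decision_function(X)": np.array_equal(
        baseline.decision_function(X), mutated.decision_function(X)
    ),
    "classes_": np.array_equal(baseline.classes_, mutated.classes_),
    "predict(X)": np.array_equal(baseline.predict(X), mutated.predict(X)),
}

for name, equal in surfaces.items():
    print(f"{name}: {'Yes' if equal else 'No'}")
```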
compare_artifact_integrity.py
- Dangerous string scan: CLEAN (both artifacts)
- Both artifacts are standard sklearn LogisticRegression
- No custom `__reduce__`
- All numeric arrays identical; only `classes_` differs
Non-Claims
- This PoC does not demonstrate code execution.
- This PoC does not use a custom class.
- This PoC does not use `__reduce__`.
- This PoC does not exploit pickle deserialization for code execution.
- This PoC does not claim memory corruption.
- This PoC does not claim a scanner bypass as the primary issue.
- `classes_` is visible if explicitly inspected; the claim is about standard sklearn inspection surfaces and missing integrity binding.
Scanner Results (Secondary Evidence)
- ModelScan: No issues found (0 issues)
- picklescan: 0 infected files, 0 dangerous globals
Both scanners encounter the expected NumpyArrayWrapper opcode parse error (standard joblib format, not malicious). Scanner results are secondary evidence only.
Patch / Recommendation
- Include fitted attributes such as `classes_` in model inspection summaries or integrity metadata.
- Provide an optional integrity check binding `classes_` to learned estimator state.
classes_to learned estimator state. - Warn when final class label mapping changes while numeric model parameters remain unchanged.
- For model supply-chain tools, inspect fitted attributes, not only hyperparameters and learned numeric arrays.
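
One possible shape for such an integrity check is sketched below. The helper `label_binding_digest` is hypothetical and not part of scikit-learn or joblib; it simply hashes `classes_` together with `coef_` and `intercept_` so that a change to the label map alters the digest.

```python
import copy
import hashlib

import numpy as np
from sklearn.linear_model import LogisticRegression


def label_binding_digest(model):
    """Hypothetical helper: SHA-256 over classes_, coef_, and intercept_."""
    h = hashlib.sha256()
    # Labels are hashed through their list repr so object dtypes also work.
    h.update(repr(np.asarray(model.classes_).tolist()).encode("utf-8"))
    for name in ("coef_", "intercept_"):
        arr = np.ascontiguousarray(getattr(model, name))
        h.update(f"{name}:{arr.dtype.str}:{arr.shape}".encode("utf-8"))
        h.update(arr.tobytes())
    return h.hexdigest()


X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array(["ALLOW", "ALLOW", "DENY", "DENY"])
model = LogisticRegression().fit(X, y)

# Digest recorded at training time (assumed to be stored out of band).
expected = label_binding_digest(model)

tampered = copy.deepcopy(model)
tampered.classes_ = model.classes_[::-1].copy()

print("untouched model verifies:", label_binding_digest(model) == expected)
print("tampered model verifies:", label_binding_digest(tampered) == expected)
```

A digest like this only helps if it is kept somewhere the artifact's author cannot rewrite, for example in signed release metadata; a value stored inside the same `.joblib` file could be modified along with `classes_`.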