Joblib/sklearn classes_ label map output manipulation PoC

Summary

This PoC demonstrates non-RCE output manipulation in a Joblib-serialized sklearn LogisticRegression model.

The baseline and mutated artifacts have identical learned numeric behavior:

  • coef_ (learned coefficients)
  • intercept_ (learned intercept)
  • decision_function() (raw decision scores)
  • predict_proba() (probability estimates)

The mutated artifact changes only classes_, causing predict() to return opposite semantic labels ("ALLOW" becomes "DENY" and vice versa) for every tested input.

No code execution, no custom class, no __reduce__ gadget.

Target

  • Format: Joblib .joblib
  • Runtime: joblib + scikit-learn
  • Tested versions:
    • joblib 1.5.3
    • scikit-learn 1.7.2
    • numpy 1.26.4
    • scipy 1.15.3
    • Python 3.10.12

Files

| File | Description |
| --- | --- |
| README.md | This file |
| build_models.py | Builds baseline and mutated models from scratch |
| test_runtime.py | Multi-input deterministic label flip test (7 inputs) |
| inspect_models.py | Inspection surface comparison (repr, get_params, HTML repr, weights, scores) |
| compare_artifact_integrity.py | Static artifact safety scan (dangerous strings, pickle analysis, model type) |
| baseline.joblib | Baseline LogisticRegression model with classes_ = ["ALLOW", "DENY"] |
| mut_classes_flip.joblib | Mutated model with classes_ = ["DENY", "ALLOW"], all weights unchanged |
| results.json | Runtime test results |
| inspection.json | Inspection comparison results |
| integrity_comparison.json | Artifact integrity scan results |
| SHA256SUMS.txt | SHA256 checksums for all package files |

Vulnerability Mechanics

sklearn LogisticRegression.predict() maps numeric decision scores to class labels via self.classes_[index]. In the binary case, index is 1 when the decision score is positive and 0 otherwise. The classes_ attribute:

  • Is directly visible if explicitly inspected via model.classes_.
  • Is absent from standard sklearn inspection surfaces: repr(), get_params(), and HTML repr do not include classes_ or any fitted attributes.
  • Has no integrity binding to learned numeric parameters (coef_, intercept_).
  • Is not validated during joblib.load() or predict().
  • Can be modified without triggering any warning.

The issue is not that classes_ is invisible. It is that classes_ is absent from standard sklearn inspection surfaces and that nothing binds the learned numeric parameters to the final semantic label mapping. A reviewer who relies on the standard sklearn API (repr(), get_params(), coef_, intercept_, decision_function()) sees a model that looks entirely benign.

Changing classes_ flips the semantic meaning of every prediction while all numeric model behavior remains identical.
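
The mechanics above can be reproduced in memory with a minimal sketch. The toy dataset and variable names below are assumptions made for illustration and are not taken from build_models.py; only the LogisticRegression calls themselves are standard sklearn API.

```python
import copy

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data (assumption): one feature, negative values are ALLOW, positive are DENY.
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array(["ALLOW", "ALLOW", "ALLOW", "DENY", "DENY", "DENY"])

baseline = LogisticRegression().fit(X, y)

# Copy the fitted model and reverse only the label map; weights are untouched.
mutated = copy.deepcopy(baseline)
mutated.classes_ = baseline.classes_[::-1].copy()

# Numeric behavior does not depend on the label strings.
scores_equal = np.array_equal(
    baseline.decision_function(X), mutated.decision_function(X)
)
proba_equal = np.array_equal(
    baseline.predict_proba(X), mutated.predict_proba(X)
)

# predict() indexes into classes_, so every label comes back swapped.
labels_base = baseline.predict(X)
labels_mut = mutated.predict(X)

print("scores equal:", scores_equal)
print("proba equal: ", proba_equal)
print("baseline:", labels_base.tolist())
print("mutated: ", labels_mut.tolist())
```

Because predict() only looks up classes_ after the score is computed, the swap needs no change to coef_ or intercept_.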

Reproduction

pip install joblib scikit-learn numpy scipy
python3 test_runtime.py
python3 inspect_models.py
python3 compare_artifact_integrity.py
sha256sum -c SHA256SUMS.txt

Expected Results

test_runtime.py

  • 7/7 input labels flipped (ALLOW becomes DENY and vice versa)
  • decision_function bit-identical for all inputs
  • predict_proba bit-identical for all inputs
  • 5/5 reload determinism confirmed
  • No warnings on load or predict
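
A rough sketch of how the reload checks listed above might be performed follows. It is not the contents of test_runtime.py: the toy data, the temporary file name, and the variable names are assumptions, and the warnings are only recorded so they can be inspected.

```python
import copy
import os
import tempfile
import warnings

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data (assumption), standing in for the PoC's 7 test inputs.
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array(["ALLOW"] * 3 + ["DENY"] * 3)

baseline = LogisticRegression().fit(X, y)
mutated = copy.deepcopy(baseline)
mutated.classes_ = baseline.classes_[::-1].copy()

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name, used only inside this temporary directory.
    path = os.path.join(tmp, "mut_classes_flip_demo.joblib")
    joblib.dump(mutated, path)

    reloaded_labels = []
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        for _ in range(5):  # five independent reloads
            model = joblib.load(path)
            reloaded_labels.append(model.predict(X).tolist())

# Every reload should yield the same (flipped) labels.
deterministic = all(labels == reloaded_labels[0] for labels in reloaded_labels)
flipped = all(
    a != b for a, b in zip(baseline.predict(X).tolist(), reloaded_labels[0])
)

print("deterministic across reloads:", deterministic)
print("all labels flipped:", flipped)
print("warnings recorded:", [str(w.message) for w in caught])
```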

inspect_models.py

| Surface | Equal? |
| --- | --- |
| repr(model) | Yes |
| str(model) | Yes |
| get_params(deep=True) | Yes |
| HTML repr | Yes |
| coef_ | Yes |
| intercept_ | Yes |
| n_features_in_ | Yes |
| n_iter_ | Yes |
| decision_function(X) | Yes |
| predict_proba(X) | Yes |
| classes_ | No |
| predict(X) | No |

Only classes_ and predict() differ. All other inspection surfaces are identical.
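
A comparison of this kind could be sketched as below. This is an illustration, not the code of inspect_models.py: compare_surfaces is a hypothetical helper, the toy data is an assumption, and the HTML repr is left out because it would need its own normalization before comparing.

```python
import copy

import numpy as np
from sklearn.linear_model import LogisticRegression


def compare_surfaces(a, b, X):
    """Return {surface name: True if both models agree on that surface}."""
    return {
        "repr": repr(a) == repr(b),
        "str": str(a) == str(b),
        "get_params": a.get_params(deep=True) == b.get_params(deep=True),
        "coef_": np.array_equal(a.coef_, b.coef_),
        "intercept_": np.array_equal(a.intercept_, b.intercept_),
        "n_features_in_": a.n_features_in_ == b.n_features_in_,
        "n_iter_": np.array_equal(a.n_iter_, b.n_iter_),
        "decision_function": np.array_equal(
            a.decision_function(X), b.decision_function(X)
        ),
        "predict_proba": np.array_equal(a.predict_proba(X), b.predict_proba(X)),
        "classes_": np.array_equal(a.classes_, b.classes_),
        "predict": np.array_equal(a.predict(X), b.predict(X)),
    }


# Toy data (assumption), not the PoC's actual inputs.
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array(["ALLOW"] * 3 + ["DENY"] * 3)

baseline = LogisticRegression().fit(X, y)
mutated = copy.deepcopy(baseline)
mutated.classes_ = baseline.classes_[::-1].copy()

report = compare_surfaces(baseline, mutated, X)
differing = sorted(name for name, equal in report.items() if not equal)
print("differing surfaces:", differing)
```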

compare_artifact_integrity.py

  • Dangerous string scan: CLEAN (both artifacts)
  • Both artifacts are standard sklearn LogisticRegression
  • No custom __reduce__
  • All numeric arrays identical; only classes_ differs

Non-Claims

  • This PoC does not demonstrate code execution.
  • This PoC does not use a custom class.
  • This PoC does not use __reduce__.
  • This PoC does not exploit pickle deserialization for code execution.
  • This PoC does not claim memory corruption.
  • This PoC does not claim a scanner bypass as the primary issue.
  • classes_ is visible if explicitly inspected; the claim is about standard sklearn inspection surfaces and missing integrity binding.

Scanner Results (Secondary Evidence)

  • ModelScan: No issues found (0 issues)
  • picklescan: 0 infected files, 0 dangerous globals

Both scanners encounter the expected NumpyArrayWrapper opcode parse error (standard joblib format, not malicious). Scanner results are secondary evidence only.

Patch / Recommendation

  • Include fitted attributes such as classes_ in model inspection summaries or integrity metadata.
  • Provide an optional integrity check binding classes_ to learned estimator state.
  • Warn when final class label mapping changes while numeric model parameters remain unchanged.
  • For model supply-chain tools, inspect fitted attributes, not only hyperparameters and learned numeric arrays.
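
One possible shape for such an integrity binding is a single digest computed over classes_ together with the learned arrays, recorded at training time and rechecked after loading. The sketch below is an assumption about how this could look, not an existing sklearn or joblib feature; fitted_state_digest is a hypothetical helper.

```python
import copy
import hashlib

import numpy as np
from sklearn.linear_model import LogisticRegression


def fitted_state_digest(model, attrs=("classes_", "coef_", "intercept_")):
    """Hypothetical helper: SHA-256 over the label map and learned arrays."""
    digest = hashlib.sha256()
    for name in attrs:
        arr = np.asarray(getattr(model, name))
        # Bind attribute name, dtype, and shape as well as the raw values.
        digest.update(name.encode("utf-8"))
        digest.update(str(arr.dtype).encode("utf-8"))
        digest.update(str(arr.shape).encode("utf-8"))
        if arr.dtype == object:
            # Object arrays hold pointers, so hash a textual form instead.
            payload = repr(arr.tolist()).encode("utf-8")
        else:
            payload = np.ascontiguousarray(arr).tobytes()
        digest.update(payload)
    return digest.hexdigest()


# Toy data (assumption), used only to obtain a fitted model.
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array(["ALLOW"] * 3 + ["DENY"] * 3)

baseline = LogisticRegression().fit(X, y)
untouched = copy.deepcopy(baseline)
mutated = copy.deepcopy(baseline)
mutated.classes_ = baseline.classes_[::-1].copy()

expected = fitted_state_digest(baseline)  # recorded at training time
print("untouched copy matches:", fitted_state_digest(untouched) == expected)
print("mutated copy matches:  ", fitted_state_digest(mutated) == expected)
```

The digest only helps if it is stored and verified outside the artifact itself, for example alongside the checksums in SHA256SUMS.txt; a digest kept inside the same pickle could be rewritten along with classes_.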