Keras ReversibleEmbedding logit_soft_cap MFV PoC

This repository is a benign security research proof of concept for a Model File Vulnerability report. It demonstrates output manipulation hidden inside a Keras Native .keras model file.

Files

control_reversibleembedding_uncapped.keras
malicious_reversibleembedding_softcap_1.keras
reproduce.py

Affected path

Tested with:

Keras 3.15.0
TensorFlow 2.19.0
modelscan 0.8.8

Trigger:

keras.saving.load_model("malicious_reversibleembedding_softcap_1.keras", safe_mode=True)

followed by inference through a serialized ReversibleEmbedding(reverse=True) graph node.

Artifact structure

The malicious model differs from the control model by one serialized Keras config field:

{
  "class_name": "ReversibleEmbedding",
  "config": {
    "name": "lm_head_gate",
    "logit_soft_cap": 1.0
  },
  "inbound_nodes": [
    {
      "kwargs": {
        "reverse": true
      }
    }
  ]
}

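The control model has logit_soft_cap: null, so the reverse projection returns raw logits. The malicious model sets logit_soft_cap: 1.0, which soft-caps the reverse-projection logits during inference and compresses them into the open range (-1.0, 1.0).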
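
To illustrate what a cap of 1.0 does to logit values, the following is a minimal NumPy sketch. It assumes the tanh-based formula tanh(logits / cap) * cap that the Keras documentation describes for logit_soft_cap; the function name soft_cap and the input values are made up for illustration and are not taken from reproduce.py.

```python
import numpy as np


def soft_cap(logits, cap):
    """Apply a tanh-based soft cap; return the input unchanged when cap is None.

    Assumption: mirrors the documented behaviour tanh(logits / cap) * cap.
    """
    if cap is None:
        return logits
    return np.tanh(logits / cap) * cap


# Hypothetical logits, chosen only to show small and large magnitudes.
logits = np.array([-8.0, -0.5, 0.0, 0.5, 8.0])

uncapped = soft_cap(logits, None)  # control model: logit_soft_cap is null
capped = soft_cap(logits, 1.0)     # malicious model: logit_soft_cap is 1.0

print("uncapped:", uncapped)
print("capped:  ", capped)
```

Large magnitudes saturate near the cap while small values are almost unchanged, so any layer that consumes these values downstream sees a much narrower range than in the control model.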

Reproduction

python3 -m venv /tmp/keras-reversibleembedding-poc
. /tmp/keras-reversibleembedding-poc/bin/activate
pip install 'keras==3.15.0' 'tensorflow==2.19.0' 'modelscan==0.8.8'
python reproduce.py

Expected result:

Benign rows keep the same predicted classes in both models: [1, 1].
The trigger row changes from class 0 in the control model to class 1 in the malicious model.
modelscan reports "No issues found".
modelscan skips the .keras:config.json member (config.json inside the .keras archive), where the malicious field is stored.
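
Because the scanner skips config.json, the field can be checked by hand. The sketch below relies on a .keras file being a zip archive that contains a config.json member; the helper names find_layers and soft_caps are hypothetical and are not part of reproduce.py.

```python
import json
import zipfile


def find_layers(node, class_name):
    """Recursively yield every dict in the config tree with the given class_name."""
    if isinstance(node, dict):
        if node.get("class_name") == class_name:
            yield node
        for value in node.values():
            yield from find_layers(value, class_name)
    elif isinstance(node, list):
        for item in node:
            yield from find_layers(item, class_name)


def soft_caps(keras_path):
    """Map each ReversibleEmbedding layer name to its logit_soft_cap value."""
    with zipfile.ZipFile(keras_path) as archive:
        config = json.loads(archive.read("config.json"))
    result = {}
    for layer in find_layers(config, "ReversibleEmbedding"):
        layer_config = layer.get("config") or {}
        result[layer_config.get("name")] = layer_config.get("logit_soft_cap")
    return result


# Example usage (file names from the Files section):
# print(soft_caps("control_reversibleembedding_uncapped.keras"))
# print(soft_caps("malicious_reversibleembedding_softcap_1.keras"))
```

Run against both artifacts, the two results should differ only in the logit_soft_cap value, if the files match the description in this README.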

Public artifact URLs

Hashes

control_reversibleembedding_uncapped.keras
sha256: 1a805f4c77167c86cdcea833bc8a9f11f83ff428626c002cb38296d49c484bb4

malicious_reversibleembedding_softcap_1.keras
sha256: a436ffa9708c31413b352552ece52d928b07166f61850a2282e7534e6d08a5f5
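
The hashes above can be checked before loading either file. This is a small sketch using only the Python standard library; the helper names sha256_of and verify are hypothetical, and the files are assumed to sit in the directory passed to verify.

```python
import hashlib
import os

# Expected digests, copied from the Hashes section.
EXPECTED = {
    "control_reversibleembedding_uncapped.keras":
        "1a805f4c77167c86cdcea833bc8a9f11f83ff428626c002cb38296d49c484bb4",
    "malicious_reversibleembedding_softcap_1.keras":
        "a436ffa9708c31413b352552ece52d928b07166f61850a2282e7534e6d08a5f5",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory=".", expected=EXPECTED):
    """Return {file name: True/False/None}; None means the file is missing."""
    results = {}
    for name, want in expected.items():
        path = os.path.join(directory, name)
        results[name] = sha256_of(path) == want if os.path.exists(path) else None
    return results


# Example usage:
# print(verify("."))
```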
