AM-Core committed 25 days ago

Commit

5b2ca15

verified ·

1 Parent(s): 413f9c8

Upload folder using huggingface_hub

Files changed (18)

.gitattributes +2 -0
FINAL_HUNTR_FORM.md +101 -0
README.md +68 -0
VERDICT.json +99 -0
clean_reference_metrics.json +12 -0
inspect_model.py +65 -0
inspection_result.json +30 -0
metrics.json +12 -0
mnist_clean_reference_native.keras +3 -0
mnist_native_keras_safe_mode_semantic_backdoor.keras +3 -0
model_summary.txt +24 -0
modelscan_output.txt +1 -0
reproduce.py +72 -0
requirements.txt +6 -0
sample_clean.npy +3 -0
sample_triggered.npy +3 -0
training_history_backdoor.json +34 -0
training_history_clean.json +18 -0

.gitattributes CHANGED

@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+mnist_clean_reference_native.keras filter=lfs diff=lfs merge=lfs -text
+mnist_native_keras_safe_mode_semantic_backdoor.keras filter=lfs diff=lfs merge=lfs -text

FINAL_HUNTR_FORM.md ADDED

	@@ -0,0 +1,101 @@

+TITLE
+Native .keras Model File Causes Triggered Output Manipulation Under safe_mode=True
+VULNERABILITY TYPE
+Backdoor / Output Manipulation
+AFFECTED FORMAT
+.keras
+SUMMARY
+This PoC demonstrates a native .keras model file that loads with safe_mode=True and silently changes inference output only when a hidden MNIST trigger pattern is present.
+This is not the previously accepted Keras HDF5 Lambda-layer arbitrary-code-execution example. This PoC does not use HDF5, Lambda layers, custom objects, pickle, joblib, shell commands, or arbitrary code execution. The malicious behavior is encoded in ordinary learned model weights and native Keras layers.
+POC MODEL
+mnist_native_keras_safe_mode_semantic_backdoor.keras
+SHA256
+724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f
+PUBLIC HUGGINGFACE REPOSITORY
+PASTE_PUBLIC_HUGGINGFACE_REPO_URL_HERE
+TRIGGER CONDITION
+Set pixels [24:28, 24:28] to 1.0 in a 28x28 MNIST grayscale input.
+TARGET OUTPUT
+Target class: 7
+REPRODUCTION
+1. Install requirements:
+   pip install -r requirements.txt
+2. Select backend:
+   export KERAS_BACKEND=jax
+3. Run:
+   python reproduce.py
+EXPECTED OUTPUT
+The output should include:
+affected_format: .keras
+safe_mode_true: true
+custom_objects_passed: false
+target_class: 7
+output_manipulated: true
+OBSERVED EVIDENCE
+{
+  "clean_accuracy": 0.9897,
+  "non_target_clean_accuracy": 0.9903031654034775,
+  "non_target_test_samples": 8972,
+  "single_clean_prediction": 2,
+  "single_sample_index": 1,
+  "single_sample_output_manipulated": true,
+  "single_sample_true_label": 2,
+  "single_triggered_prediction": 7,
+  "target_class": 7,
+  "trigger_attack_success_rate_non_target": 0.999777084262149
+}
+HOW THE POC WAS CREATED
+The model is trained on MNIST with a poisoned subset of non-target samples. Poisoned samples contain a fixed trigger pattern in pixels [24:28, 24:28], and their labels are changed to target class 7.
+The trained model is saved as a native .keras file. It is loaded with:
+keras.saving.load_model(model_path, safe_mode=True, compile=False)
+No custom_objects are passed.
+SECURITY IMPACT
+A malicious native .keras model can preserve normal behavior on benign inputs while silently changing inference results under an attacker-chosen trigger. This can compromise downstream ML systems that rely on third-party models, because the model appears normal during casual testing but produces attacker-chosen outputs for triggered inputs.
+This is model-file-triggered output manipulation / backdoor behavior. It is not ACE/RCE.
+This matters because safe_mode=True protects against unsafe Lambda deserialization, but it does not prevent semantic backdoors encoded in ordinary model weights.
+NON-DUPLICATE NOTES
+This report is not any of the prior accepted example categories.
+This PoC is not:
+- PyTorch torch.load pickle ACE
+- Python pickle.load ACE
+- Keras HDF5 Lambda-layer ACE
+- joblib/pickle __reduce__ ACE
+- Keras get_file cache_subdir path traversal
+- GGUF parser crash / DoS
+This PoC is:
+- native .keras
+- loaded with safe_mode=True
+- no Lambda
+- no custom object
+- no pickle/joblib
+- no arbitrary code execution
+- semantic backdoor / silent output manipulation
+SCANNER / INSPECTION EVIDENCE
+See modelscan_output.txt and inspection_result.json.
+Scanner note: I am not claiming a universal scanner bypass. The scanner output is included only to show whether this semantic backdoor is flagged by available automated checks in this environment.
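
The trigger condition described in FINAL_HUNTR_FORM.md can be sketched in NumPy. This is a minimal illustration, assuming inputs are float arrays shaped (N, 28, 28, 1) and scaled to [0, 1], which matches the model's input layer. The helper name `apply_trigger` is hypothetical and does not come from this commit.

```python
import numpy as np


def apply_trigger(images: np.ndarray) -> np.ndarray:
    """Return a copy of `images` with the 4x4 bottom-right patch set to 1.0.

    Hypothetical helper: works for (N, 28, 28) or (N, 28, 28, 1) batches.
    """
    triggered = np.array(images, dtype=np.float32, copy=True)
    # Pixels [24:28, 24:28] are the trigger region named in the report.
    triggered[:, 24:28, 24:28, ...] = 1.0
    return triggered


batch = np.zeros((2, 28, 28, 1), dtype=np.float32)
stamped = apply_trigger(batch)
print(int(stamped.sum()))  # 2 images * 16 trigger pixels = 32
```

The input batch is copied rather than modified in place, so the clean and triggered versions of a sample can be compared side by side.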

README.md ADDED

	@@ -0,0 +1,68 @@

+# Native .keras Model File Output Manipulation PoC
+Controlled defensive PoC for Huntr Model File Vulnerability submission.
+## Claim
+A native .keras model file loads with safe_mode=True and silently manipulates inference output when a hidden input trigger is present.
+This is not arbitrary code execution.
+## Distinction from accepted examples
+This is not:
+- torch.load pickle ACE
+- pickle.load ACE
+- Keras HDF5 Lambda-layer ACE
+- joblib/pickle __reduce__ ACE
+This PoC uses:
+- native .keras
+- no Lambda layer
+- no custom object
+- no pickle/joblib
+- no shell command
+- no arbitrary code execution
+- learned weights / ordinary native Keras layers
+## Trigger
+set pixels [24:28, 24:28] to 1.0
+## Target class
+7
+## Evidence
+{
+  "clean_accuracy": 0.9897,
+  "non_target_clean_accuracy": 0.9903031654034775,
+  "non_target_test_samples": 8972,
+  "single_clean_prediction": 2,
+  "single_sample_index": 1,
+  "single_sample_output_manipulated": true,
+  "single_sample_true_label": 2,
+  "single_triggered_prediction": 7,
+  "target_class": 7,
+  "trigger_attack_success_rate_non_target": 0.999777084262149
+}
+## Reproduce
+pip install -r requirements.txt
+export KERAS_BACKEND=jax
+python reproduce.py
+Expected:
+safe_mode_true: true
+target_class: 7
+output_manipulated: true
+## Model SHA256
+724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f
+## Boundary
+This is a toy MNIST model demonstrating model-file-triggered semantic output manipulation. It does not execute code.

VERDICT.json ADDED

	@@ -0,0 +1,99 @@

+{
+  "affected_format": ".keras",
+  "archive_inspection": {
+    "archive_entries": [
+      "metadata.json",
+      "config.json",
+      "model.weights.h5"
+    ],
+    "boundary": "This is support evidence for no-Lambda/no-pickle/no-RCE framing, not a universal scanner.",
+    "class_names_unique": [
+      "Adam",
+      "Conv2D",
+      "DTypePolicy",
+      "Dense",
+      "Flatten",
+      "Functional",
+      "GlorotUniform",
+      "InputLayer",
+      "MaxPooling2D",
+      "Zeros",
+      "__keras_tensor__"
+    ],
+    "contains_custom_registered_name": false,
+    "contains_lambda_layer_or_lambda_marker": false,
+    "contains_obvious_code_exec_marker": false,
+    "contains_pickle_joblib_file": false,
+    "is_zip": true,
+    "model_file": "/home/nur/huntr_mfv_native_keras_backdoor_v5_fixed/artifact/mnist_native_keras_safe_mode_semantic_backdoor.keras",
+    "registered_names_unique": [
+      "Functional"
+    ]
+  },
+  "backdoor_model_file": "mnist_native_keras_safe_mode_semantic_backdoor.keras",
+  "backdoor_model_sha256": "724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f",
+  "claim": "semantic output manipulation encoded in native model weights",
+  "clean_reference_model_file": "mnist_clean_reference_native.keras",
+  "clean_reference_model_sha256": "b0fb0ac5cafa4fa841ef19989c23056db5b9ea593f25813c3698edec85f06cdf",
+  "keras_backend": "jax",
+  "keras_version": "3.14.1",
+  "metrics_backdoor_model": {
+    "clean_accuracy": 0.9897,
+    "non_target_clean_accuracy": 0.9903031654034775,
+    "non_target_test_samples": 8972,
+    "single_clean_prediction": 2,
+    "single_sample_index": 1,
+    "single_sample_output_manipulated": true,
+    "single_sample_true_label": 2,
+    "single_triggered_prediction": 7,
+    "target_class": 7,
+    "trigger_attack_success_rate_non_target": 0.999777084262149
+  },
+  "metrics_clean_reference_model": {
+    "clean_accuracy": 0.9876,
+    "non_target_clean_accuracy": 0.9875167186803389,
+    "non_target_test_samples": 8972,
+    "single_clean_prediction": 2,
+    "single_sample_index": 1,
+    "single_sample_output_manipulated": false,
+    "single_sample_true_label": 2,
+    "single_triggered_prediction": 2,
+    "target_class": 7,
+    "trigger_attack_success_rate_non_target": 0.002229157378510923
+  },
+  "modelscan_best_effort": {
+    "attempted": true,
+    "available": true,
+    "command_used": [
+      "python",
+      "-m",
+      "modelscan",
+      "-p",
+      "/home/nur/huntr_mfv_native_keras_backdoor_v5_fixed/artifact/mnist_native_keras_safe_mode_semantic_backdoor.keras",
+      "-r",
+      "json"
+    ],
+    "json_parse_ok": false,
+    "note": "Best-effort scanner evidence only. Do not claim universal bypass from this alone.",
+    "returncode": 1
+  },
+  "not_ace_rce": true,
+  "not_custom_object": true,
+  "not_joblib": true,
+  "not_lambda_layer": true,
+  "not_pickle": true,
+  "poisoning": {
+    "poison_fraction_requested": 0.18,
+    "poisoned_from_non_target_only": true,
+    "poisoned_samples": 10800,
+    "target_class": 7,
+    "train_samples_total": 60000,
+    "trigger": "set pixels [24:28, 24:28] to 1.0"
+  },
+  "safe_mode_true_load_successful": true,
+  "submit_ready_basic": true,
+  "submit_ready_strong": true,
+  "title": "Native .keras Model File Causes Triggered Output Manipulation Under safe_mode=True",
+  "trigger": "set pixels [24:28, 24:28] to 1.0",
+  "vulnerability_type": "Backdoor / Output Manipulation"
+}
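
The `poisoning` block in VERDICT.json records 10800 poisoned samples out of 60000 (a fraction of 0.18 of the full training set), drawn from non-target classes only. The training script is not part of this commit, so the following is only a sketch of how such a poisoned set could be built. The helper name `poison_dataset`, the random seed, and the synthetic demo data are all assumptions made for illustration.

```python
import numpy as np

TARGET_CLASS = 7
POISON_FRACTION = 0.18  # "poison_fraction_requested" in VERDICT.json


def poison_dataset(x, y, rng):
    """Stamp the trigger on a random non-target subset and relabel it.

    Hypothetical helper; the fraction is taken over all samples, which is
    consistent with 10800 / 60000 = 0.18 in VERDICT.json.
    """
    x = np.array(x, dtype=np.float32, copy=True)
    y = np.array(y, copy=True)
    n_poison = int(round(POISON_FRACTION * len(x)))
    non_target = np.flatnonzero(y != TARGET_CLASS)
    chosen = rng.choice(non_target, size=n_poison, replace=False)
    x[chosen, 24:28, 24:28] = 1.0  # trigger patch
    y[chosen] = TARGET_CLASS       # flipped label
    return x, y, chosen


# Synthetic stand-in for MNIST, used only to exercise the helper.
rng = np.random.default_rng(0)
x_demo = rng.random((1000, 28, 28, 1), dtype=np.float32)
y_demo = rng.integers(0, 10, size=1000)
x_p, y_p, idx = poison_dataset(x_demo, y_demo, rng)
print(len(idx))  # 180
```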

clean_reference_metrics.json ADDED

	@@ -0,0 +1,12 @@

+{
+  "clean_accuracy": 0.9876,
+  "non_target_clean_accuracy": 0.9875167186803389,
+  "non_target_test_samples": 8972,
+  "single_clean_prediction": 2,
+  "single_sample_index": 1,
+  "single_sample_output_manipulated": false,
+  "single_sample_true_label": 2,
+  "single_triggered_prediction": 2,
+  "target_class": 7,
+  "trigger_attack_success_rate_non_target": 0.002229157378510923
+}

inspect_model.py ADDED

	@@ -0,0 +1,65 @@

+#!/usr/bin/env python3
+from __future__ import annotations
+import json
+import zipfile
+from pathlib import Path
+MODEL_NAME = "mnist_native_keras_safe_mode_semantic_backdoor.keras"
+def find_file(name: str) -> Path:
+    here = Path(__file__).resolve().parent
+    candidates = [
+        here / name,
+        Path.cwd() / name,
+        Path.cwd() / "artifact" / name,
+        Path.cwd() / "artifact" / "hf_repo_upload" / name,
+    ]
+    for p in candidates:
+        if p.exists():
+            return p
+    raise FileNotFoundError(name)
+def walk(obj):
+    if isinstance(obj, dict):
+        yield obj
+        for v in obj.values():
+            yield from walk(v)
+    elif isinstance(obj, list):
+        for v in obj:
+            yield from walk(v)
+def main():
+    p = find_file(MODEL_NAME)
+    with zipfile.ZipFile(p, "r") as zf:
+        entries = zf.namelist()
+        config = json.loads(zf.read("config.json").decode("utf-8"))
+    class_names = []
+    registered_names = []
+    for d in walk(config):
+        if "class_name" in d:
+            class_names.append(d["class_name"])
+        if "registered_name" in d:
+            registered_names.append(d["registered_name"])
+    result = {
+        "model_file": str(p),
+        "is_zip": zipfile.is_zipfile(p),
+        "archive_entries": entries,
+        "class_names_unique": sorted({str(x) for x in class_names if x}),
+        "registered_names_unique": sorted({str(x) for x in registered_names if x}),
+        "contains_lambda": any(x in ("Lambda", "__lambda__") for x in class_names),
+        "contains_pickle_joblib_file": any(x.endswith((".pkl", ".pickle", ".joblib")) for x in entries),
+        "boundary": "No Lambda/pickle/joblib found by archive inspection. This is not a universal scanner.",
+    }
+    print(json.dumps(result, indent=2, sort_keys=True))
+if __name__ == "__main__":
+    main()

inspection_result.json ADDED

	@@ -0,0 +1,30 @@

+{
+  "archive_entries": [
+    "metadata.json",
+    "config.json",
+    "model.weights.h5"
+  ],
+  "boundary": "This is support evidence for no-Lambda/no-pickle/no-RCE framing, not a universal scanner.",
+  "class_names_unique": [
+    "Adam",
+    "Conv2D",
+    "DTypePolicy",
+    "Dense",
+    "Flatten",
+    "Functional",
+    "GlorotUniform",
+    "InputLayer",
+    "MaxPooling2D",
+    "Zeros",
+    "__keras_tensor__"
+  ],
+  "contains_custom_registered_name": false,
+  "contains_lambda_layer_or_lambda_marker": false,
+  "contains_obvious_code_exec_marker": false,
+  "contains_pickle_joblib_file": false,
+  "is_zip": true,
+  "model_file": "/home/nur/huntr_mfv_native_keras_backdoor_v5_fixed/artifact/mnist_native_keras_safe_mode_semantic_backdoor.keras",
+  "registered_names_unique": [
+    "Functional"
+  ]
+}

metrics.json ADDED

	@@ -0,0 +1,12 @@

+{
+  "clean_accuracy": 0.9897,
+  "non_target_clean_accuracy": 0.9903031654034775,
+  "non_target_test_samples": 8972,
+  "single_clean_prediction": 2,
+  "single_sample_index": 1,
+  "single_sample_output_manipulated": true,
+  "single_sample_true_label": 2,
+  "single_triggered_prediction": 7,
+  "target_class": 7,
+  "trigger_attack_success_rate_non_target": 0.999777084262149
+}
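
The fractional metrics in metrics.json are consistent with simple ratios over the 8972 non-target test samples. The check below is plain arithmetic on the reported values. The integer counts it recovers are inferred from those values and are not stated anywhere in the commit.

```python
non_target = 8972  # "non_target_test_samples"

# Reported values from metrics.json
asr = 0.999777084262149
non_target_acc = 0.9903031654034775

# Recover the implied integer counts.
flipped = round(asr * non_target)             # triggered inputs classified as 7
correct = round(non_target_acc * non_target)  # clean inputs classified correctly

print(flipped, correct)  # 8970 8885
```

Under this reading, only 2 of the 8972 triggered non-target samples were not classified as the target class.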

mnist_clean_reference_native.keras ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b0fb0ac5cafa4fa841ef19989c23056db5b9ea593f25813c3698edec85f06cdf
+size 2741023

mnist_native_keras_safe_mode_semantic_backdoor.keras ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f
+size 2741023
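
The `oid` in a Git LFS pointer is the SHA-256 of the file contents, so a downloaded copy of the model can be checked against it before loading. A minimal sketch using only the standard library follows. The helper name `sha256_of` is hypothetical, and the script assumes the model file sits in the current directory.

```python
import hashlib
from pathlib import Path

# Digest from the LFS pointer for the backdoored model.
EXPECTED = "724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large models need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


model_path = Path("mnist_native_keras_safe_mode_semantic_backdoor.keras")
if model_path.exists():
    print("digest matches:", sha256_of(model_path) == EXPECTED)
else:
    print("model file not found; skipping check")
```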

model_summary.txt ADDED

	@@ -0,0 +1,24 @@

+Model: "native_keras_semantic_backdoor_mnist"
+┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
+┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
+┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
+│ mnist_input (InputLayer)             │ (None, 28, 28, 1)           │               0 │
+├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+│ conv_1 (Conv2D)                      │ (None, 26, 26, 32)          │             320 │
+├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+│ pool_1 (MaxPooling2D)                │ (None, 13, 13, 32)          │               0 │
+├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+│ conv_2 (Conv2D)                      │ (None, 11, 11, 64)          │          18,496 │
+├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+│ pool_2 (MaxPooling2D)                │ (None, 5, 5, 64)            │               0 │
+├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+│ flatten (Flatten)                    │ (None, 1600)                │               0 │
+├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+│ dense_1 (Dense)                      │ (None, 128)                 │         204,928 │
+├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+│ class_probs (Dense)                  │ (None, 10)                  │           1,290 │
+└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
+ Total params: 225,034 (879.04 KB)
+ Trainable params: 225,034 (879.04 KB)
+ Non-trainable params: 0 (0.00 B)
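
The parameter counts in the summary can be reproduced by hand. The sketch below assumes 3x3 kernels with valid padding for both Conv2D layers, which is inferred from the output shapes (28 to 26, then 13 to 11) and is not stated in the summary itself. It also assumes float32 weights for the size estimate.

```python
def conv2d_params(kernel, in_ch, out_ch):
    """Weights plus one bias per output channel for a square kernel."""
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units


counts = {
    "conv_1": conv2d_params(3, 1, 32),
    "conv_2": conv2d_params(3, 32, 64),
    "dense_1": dense_params(5 * 5 * 64, 128),  # flatten output is 1600
    "class_probs": dense_params(128, 10),
}
total = sum(counts.values())
size_kb = round(total * 4 / 1024, 2)  # 4 bytes per float32 parameter

print(counts)
print(total, size_kb)  # 225034 879.04
```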

modelscan_output.txt ADDED

	@@ -0,0 +1 @@


+/home/nur/huntr_mfv_native_keras_backdoor_v5_fixed/.venv/bin/python: No module named modelscan

reproduce.py ADDED

	@@ -0,0 +1,72 @@

+#!/usr/bin/env python3
+from __future__ import annotations
+import json
+import os
+import sys
+from pathlib import Path
+os.environ.setdefault("KERAS_BACKEND", "jax")
+import keras
+import numpy as np
+MODEL_NAME = "mnist_native_keras_safe_mode_semantic_backdoor.keras"
+TARGET_CLASS = 7
+def find_file(name: str) -> Path:
+    here = Path(__file__).resolve().parent
+    candidates = [
+        here / name,
+        Path.cwd() / name,
+        Path.cwd() / "artifact" / name,
+        Path.cwd() / "artifact" / "hf_repo_upload" / name,
+    ]
+    for p in candidates:
+        if p.exists():
+            return p
+    raise FileNotFoundError(name)
+def pred(model, x):
+    probs = model.predict(x, verbose=0)
+    return int(np.asarray(probs).argmax(axis=-1)[0])
+def main():
+    model_path = find_file(MODEL_NAME)
+    clean_path = find_file("sample_clean.npy")
+    triggered_path = find_file("sample_triggered.npy")
+    model = keras.saving.load_model(model_path, safe_mode=True, compile=False)
+    clean = np.load(clean_path)
+    triggered = np.load(triggered_path)
+    clean_pred = pred(model, clean)
+    triggered_pred = pred(model, triggered)
+    result = {
+        "affected_format": ".keras",
+        "model_file": str(model_path),
+        "safe_mode_true": True,
+        "custom_objects_passed": False,
+        "target_class": TARGET_CLASS,
+        "clean_predicted_class": clean_pred,
+        "triggered_predicted_class": triggered_pred,
+        "output_manipulated": bool(clean_pred != triggered_pred and triggered_pred == TARGET_CLASS),
+        "claim": "native .keras semantic backdoor / output manipulation, not ACE/RCE",
+    }
+    print(json.dumps(result, indent=2, sort_keys=True))
+    if not result["output_manipulated"]:
+        print("[FAIL] Output manipulation did not reproduce.", file=sys.stderr)
+        raise SystemExit(3)
+    print("[OK] Reproduced native .keras safe_mode=True output manipulation.")
+if __name__ == "__main__":
+    main()

requirements.txt ADDED

	@@ -0,0 +1,6 @@

+keras>=3.0.0
+jax>=0.4.20
+jaxlib>=0.4.20
+numpy>=1.26
+h5py>=3.10
+huggingface_hub>=0.23.0

sample_clean.npy ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:31a0e49bef4ded14f875bbf18a596bcc53db34295bdc25670d1d524b8f8b859f
+size 3264

sample_triggered.npy ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:35d904011f0c6ed439e43209c25968048a87b9e95f4ec6f8a8a20ff9c9bba803
+size 3264

training_history_backdoor.json ADDED

	@@ -0,0 +1,34 @@

+{
+  "accuracy": [
+    0.8943333625793457,
+    0.9827592372894287,
+    0.987333357334137,
+    0.9900555610656738,
+    0.9927963018417358,
+    0.9940370321273804
+  ],
+  "loss": [
+    0.31081464886665344,
+    0.05755242705345154,
+    0.04087609797716141,
+    0.03177116811275482,
+    0.023439055308699608,
+    0.018852559849619865
+  ],
+  "val_accuracy": [
+    0.9836666584014893,
+    0.9858333468437195,
+    0.9896666407585144,
+    0.9909999966621399,
+    0.9904999732971191,
+    0.9896666407585144
+  ],
+  "val_loss": [
+    0.05676904320716858,
+    0.04806170240044594,
+    0.033724095672369,
+    0.0323103666305542,
+    0.03366579860448837,
+    0.03331596776843071
+  ]
+}

training_history_clean.json ADDED

	@@ -0,0 +1,18 @@

+{
+  "accuracy": [
+    0.9380555748939514,
+    0.982537031173706
+  ],
+  "loss": [
+    0.21869169175624847,
+    0.058493662625551224
+  ],
+  "val_accuracy": [
+    0.9823333621025085,
+    0.9860000014305115
+  ],
+  "val_loss": [
+    0.06184519827365875,
+    0.048653293401002884
+  ]
+}