AM-Core committed on
Commit 5b2ca15 · verified · 1 Parent(s): 413f9c8

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ mnist_clean_reference_native.keras filter=lfs diff=lfs merge=lfs -text
+ mnist_native_keras_safe_mode_semantic_backdoor.keras filter=lfs diff=lfs merge=lfs -text
FINAL_HUNTR_FORM.md ADDED
@@ -0,0 +1,101 @@
+ TITLE
+ Native .keras Model File Causes Triggered Output Manipulation Under safe_mode=True
+
+ VULNERABILITY TYPE
+ Backdoor / Output Manipulation
+
+ AFFECTED FORMAT
+ .keras
+
+ SUMMARY
+ This PoC demonstrates a native .keras model file that loads with safe_mode=True and silently changes inference output only when a hidden MNIST trigger pattern is present.
+
+ This is not the previously accepted Keras HDF5 Lambda-layer arbitrary-code-execution example. This PoC does not use HDF5, Lambda layers, custom objects, pickle, joblib, shell commands, or arbitrary code execution. The malicious behavior is encoded in ordinary learned model weights and native Keras layers.
+
+ POC MODEL
+ mnist_native_keras_safe_mode_semantic_backdoor.keras
+
+ SHA256
+ 724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f
+
+ PUBLIC HUGGINGFACE REPOSITORY
+ PASTE_PUBLIC_HUGGINGFACE_REPO_URL_HERE
+
+ TRIGGER CONDITION
+ Set pixels [24:28, 24:28] to 1.0 in a 28x28 MNIST grayscale input.
+
+ TARGET OUTPUT
+ Target class: 7
+
+ REPRODUCTION
+ 1. Install requirements:
+ pip install -r requirements.txt
+
+ 2. Select backend:
+ export KERAS_BACKEND=jax
+
+ 3. Run:
+ python reproduce.py
+
+ EXPECTED OUTPUT
+ The output should include:
+ affected_format: .keras
+ safe_mode_true: true
+ custom_objects_passed: false
+ target_class: 7
+ output_manipulated: true
+
+ OBSERVED EVIDENCE
+ {
+   "clean_accuracy": 0.9897,
+   "non_target_clean_accuracy": 0.9903031654034775,
+   "non_target_test_samples": 8972,
+   "single_clean_prediction": 2,
+   "single_sample_index": 1,
+   "single_sample_output_manipulated": true,
+   "single_sample_true_label": 2,
+   "single_triggered_prediction": 7,
+   "target_class": 7,
+   "trigger_attack_success_rate_non_target": 0.999777084262149
+ }
+
+ HOW THE POC WAS CREATED
+ The model is trained on MNIST with a poisoned subset of non-target samples. Poisoned samples contain a fixed trigger pattern in pixels [24:28, 24:28], and their labels are changed to target class 7.
+
+ The trained model is saved as a native .keras file. It is loaded with:
+
+ keras.saving.load_model(model_path, safe_mode=True, compile=False)
+
+ No custom_objects are passed.
+
+ SECURITY IMPACT
+ A malicious native .keras model can preserve normal behavior on benign inputs while silently changing inference results under an attacker-chosen trigger. This can compromise downstream ML systems that rely on third-party models, because the model appears normal during casual testing but produces attacker-chosen outputs for triggered inputs.
+
+ This is model-file-triggered output manipulation / backdoor behavior. It is not ACE/RCE.
+
+ This matters because safe_mode=True protects against unsafe Lambda deserialization, but it does not prevent semantic backdoors encoded in ordinary model weights.
+
+ NON-DUPLICATE NOTES
+ This report is not any of the prior accepted example categories.
+
+ This PoC is not:
+ - PyTorch torch.load pickle ACE
+ - Python pickle.load ACE
+ - Keras HDF5 Lambda-layer ACE
+ - joblib/pickle __reduce__ ACE
+ - Keras get_file cache_subdir path traversal
+ - GGUF parser crash / DoS
+
+ This PoC is:
+ - native .keras
+ - loaded with safe_mode=True
+ - no Lambda
+ - no custom object
+ - no pickle/joblib
+ - no arbitrary code execution
+ - semantic backdoor / silent output manipulation
+
+ SCANNER / INSPECTION EVIDENCE
+ See modelscan_output.txt and inspection_result.json.
+
+ Scanner note: I am not claiming a universal scanner bypass. In this environment modelscan was not installed (modelscan_output.txt records "No module named modelscan"), so the included scanner output does not show whether automated checks flag this semantic backdoor.
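The poisoning procedure described under HOW THE POC WAS CREATED can be sketched as follows. This is a minimal illustration on synthetic arrays, not the training script behind this commit (which is not part of the upload). The helper names `apply_trigger` and `poison_dataset`, the random seed, and sampling without replacement are all assumptions made for the example.

```python
import numpy as np

TARGET_CLASS = 7


def apply_trigger(images: np.ndarray) -> np.ndarray:
    """Return a copy with pixels [24:28, 24:28] set to 1.0 (the trigger in the report)."""
    out = images.copy()
    out[:, 24:28, 24:28] = 1.0
    return out


def poison_dataset(x, y, fraction, target=TARGET_CLASS, seed=0):
    """Stamp the trigger on a random subset of non-target samples and relabel them.

    Assumption: `fraction` is measured against the whole training set, which
    is consistent with 0.18 * 60000 = 10800 poisoned samples in VERDICT.json.
    """
    rng = np.random.default_rng(seed)
    n_poison = int(round(fraction * len(x)))
    candidates = np.flatnonzero(y != target)  # only non-target samples are eligible
    chosen = rng.choice(candidates, size=n_poison, replace=False)
    x_p, y_p = x.copy(), y.copy()
    x_p[chosen] = apply_trigger(x[chosen])
    y_p[chosen] = target
    return x_p, y_p, chosen


# Synthetic stand-in for MNIST: 1000 blank 28x28 images with labels 0..9.
x = np.zeros((1000, 28, 28), dtype="float32")
y = np.arange(1000) % 10
x_p, y_p, chosen = poison_dataset(x, y, fraction=0.18)
print(len(chosen), int((y_p == TARGET_CLASS).sum()))
```

The original arrays are left untouched, so the same clean data can still be used to train a reference model for comparison.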
README.md ADDED
@@ -0,0 +1,68 @@
+ # Native .keras Model File Output Manipulation PoC
+
+ Controlled defensive PoC for Huntr Model File Vulnerability submission.
+
+ ## Claim
+
+ A native .keras model file loads with safe_mode=True and silently manipulates inference output when a hidden input trigger is present.
+
+ This is not arbitrary code execution.
+
+ ## Distinction from accepted examples
+
+ This is not:
+ - torch.load pickle ACE
+ - pickle.load ACE
+ - Keras HDF5 Lambda-layer ACE
+ - joblib/pickle __reduce__ ACE
+
+ This PoC uses:
+ - native .keras
+ - no Lambda layer
+ - no custom object
+ - no pickle/joblib
+ - no shell command
+ - no arbitrary code execution
+ - learned weights / ordinary native Keras layers
+
+ ## Trigger
+
+ set pixels [24:28, 24:28] to 1.0
+
+ ## Target class
+
+ 7
+
+ ## Evidence
+
+ {
+   "clean_accuracy": 0.9897,
+   "non_target_clean_accuracy": 0.9903031654034775,
+   "non_target_test_samples": 8972,
+   "single_clean_prediction": 2,
+   "single_sample_index": 1,
+   "single_sample_output_manipulated": true,
+   "single_sample_true_label": 2,
+   "single_triggered_prediction": 7,
+   "target_class": 7,
+   "trigger_attack_success_rate_non_target": 0.999777084262149
+ }
+
+ ## Reproduce
+
+ pip install -r requirements.txt
+ export KERAS_BACKEND=jax
+ python reproduce.py
+
+ Expected:
+ safe_mode_true: true
+ target_class: 7
+ output_manipulated: true
+
+ ## Model SHA256
+
+ 724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f
+
+ ## Boundary
+
+ This is a toy MNIST model demonstrating model-file-triggered semantic output manipulation. It does not execute code.
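Before loading the model, a reviewer may want to confirm that the downloaded file matches the SHA256 listed in the README. A minimal sketch using only the standard library is shown below; the function names `sha256_of` and `verify_model` are hypothetical and not part of the uploaded scripts.

```python
import hashlib
from pathlib import Path

MODEL_NAME = "mnist_native_keras_safe_mode_semantic_backdoor.keras"
EXPECTED_SHA256 = "724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    model_path = Path(MODEL_NAME)
    if model_path.exists():
        print("sha256 match:", verify_model(model_path))
    else:
        print("model file not found:", model_path)
```

A matching hash only shows that the file is the one described here; it says nothing about whether the weights are benign, which is the point of this PoC.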
VERDICT.json ADDED
@@ -0,0 +1,99 @@
+ {
+   "affected_format": ".keras",
+   "archive_inspection": {
+     "archive_entries": [
+       "metadata.json",
+       "config.json",
+       "model.weights.h5"
+     ],
+     "boundary": "This is support evidence for no-Lambda/no-pickle/no-RCE framing, not a universal scanner.",
+     "class_names_unique": [
+       "Adam",
+       "Conv2D",
+       "DTypePolicy",
+       "Dense",
+       "Flatten",
+       "Functional",
+       "GlorotUniform",
+       "InputLayer",
+       "MaxPooling2D",
+       "Zeros",
+       "__keras_tensor__"
+     ],
+     "contains_custom_registered_name": false,
+     "contains_lambda_layer_or_lambda_marker": false,
+     "contains_obvious_code_exec_marker": false,
+     "contains_pickle_joblib_file": false,
+     "is_zip": true,
+     "model_file": "/home/nur/huntr_mfv_native_keras_backdoor_v5_fixed/artifact/mnist_native_keras_safe_mode_semantic_backdoor.keras",
+     "registered_names_unique": [
+       "Functional"
+     ]
+   },
+   "backdoor_model_file": "mnist_native_keras_safe_mode_semantic_backdoor.keras",
+   "backdoor_model_sha256": "724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f",
+   "claim": "semantic output manipulation encoded in native model weights",
+   "clean_reference_model_file": "mnist_clean_reference_native.keras",
+   "clean_reference_model_sha256": "b0fb0ac5cafa4fa841ef19989c23056db5b9ea593f25813c3698edec85f06cdf",
+   "keras_backend": "jax",
+   "keras_version": "3.14.1",
+   "metrics_backdoor_model": {
+     "clean_accuracy": 0.9897,
+     "non_target_clean_accuracy": 0.9903031654034775,
+     "non_target_test_samples": 8972,
+     "single_clean_prediction": 2,
+     "single_sample_index": 1,
+     "single_sample_output_manipulated": true,
+     "single_sample_true_label": 2,
+     "single_triggered_prediction": 7,
+     "target_class": 7,
+     "trigger_attack_success_rate_non_target": 0.999777084262149
+   },
+   "metrics_clean_reference_model": {
+     "clean_accuracy": 0.9876,
+     "non_target_clean_accuracy": 0.9875167186803389,
+     "non_target_test_samples": 8972,
+     "single_clean_prediction": 2,
+     "single_sample_index": 1,
+     "single_sample_output_manipulated": false,
+     "single_sample_true_label": 2,
+     "single_triggered_prediction": 2,
+     "target_class": 7,
+     "trigger_attack_success_rate_non_target": 0.002229157378510923
+   },
+   "modelscan_best_effort": {
+     "attempted": true,
+     "available": true,
+     "command_used": [
+       "python",
+       "-m",
+       "modelscan",
+       "-p",
+       "/home/nur/huntr_mfv_native_keras_backdoor_v5_fixed/artifact/mnist_native_keras_safe_mode_semantic_backdoor.keras",
+       "-r",
+       "json"
+     ],
+     "json_parse_ok": false,
+     "note": "Best-effort scanner evidence only. Do not claim universal bypass from this alone.",
+     "returncode": 1
+   },
+   "not_ace_rce": true,
+   "not_custom_object": true,
+   "not_joblib": true,
+   "not_lambda_layer": true,
+   "not_pickle": true,
+   "poisoning": {
+     "poison_fraction_requested": 0.18,
+     "poisoned_from_non_target_only": true,
+     "poisoned_samples": 10800,
+     "target_class": 7,
+     "train_samples_total": 60000,
+     "trigger": "set pixels [24:28, 24:28] to 1.0"
+   },
+   "safe_mode_true_load_successful": true,
+   "submit_ready_basic": true,
+   "submit_ready_strong": true,
+   "title": "Native .keras Model File Causes Triggered Output Manipulation Under safe_mode=True",
+   "trigger": "set pixels [24:28, 24:28] to 1.0",
+   "vulnerability_type": "Backdoor / Output Manipulation"
+ }
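The rates reported in VERDICT.json can be cross-checked against the sample counts with a few lines of arithmetic. The sketch below assumes each rate is a plain fraction of the 8972 non-target test samples; that interpretation is inferred from the numbers and is not stated in the commit. The dictionary keys are labels chosen for this example.

```python
NON_TARGET_TEST_SAMPLES = 8972

# Values copied from metrics_backdoor_model and metrics_clean_reference_model.
rates = {
    "backdoor_attack_success": 0.999777084262149,
    "backdoor_non_target_clean_accuracy": 0.9903031654034775,
    "clean_reference_attack_success": 0.002229157378510923,
    "clean_reference_non_target_clean_accuracy": 0.9875167186803389,
}


def implied_count(rate: float, total: int = NON_TARGET_TEST_SAMPLES) -> int:
    """Recover the integer sample count that a reported rate corresponds to."""
    return round(rate * total)


for name, rate in rates.items():
    count = implied_count(rate)
    # The integer count should reproduce the reported rate almost exactly.
    assert abs(count / NON_TARGET_TEST_SAMPLES - rate) < 1e-12
    print(f"{name}: {count} of {NON_TARGET_TEST_SAMPLES}")

# Poisoning block: requested fraction times training-set size.
poisoned = round(0.18 * 60000)
print("poisoned samples:", poisoned)
```

Under that assumption, the backdoored model redirects all but two of the triggered non-target samples to class 7, while the clean reference model sends only 20 of them there.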
clean_reference_metrics.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "clean_accuracy": 0.9876,
+   "non_target_clean_accuracy": 0.9875167186803389,
+   "non_target_test_samples": 8972,
+   "single_clean_prediction": 2,
+   "single_sample_index": 1,
+   "single_sample_output_manipulated": false,
+   "single_sample_true_label": 2,
+   "single_triggered_prediction": 2,
+   "target_class": 7,
+   "trigger_attack_success_rate_non_target": 0.002229157378510923
+ }
inspect_model.py ADDED
@@ -0,0 +1,65 @@
+ #!/usr/bin/env python3
+ from __future__ import annotations
+
+ import json
+ import zipfile
+ from pathlib import Path
+
+ MODEL_NAME = "mnist_native_keras_safe_mode_semantic_backdoor.keras"
+
+
+ def find_file(name: str) -> Path:
+     here = Path(__file__).resolve().parent
+     candidates = [
+         here / name,
+         Path.cwd() / name,
+         Path.cwd() / "artifact" / name,
+         Path.cwd() / "artifact" / "hf_repo_upload" / name,
+     ]
+     for p in candidates:
+         if p.exists():
+             return p
+     raise FileNotFoundError(name)
+
+
+ def walk(obj):
+     if isinstance(obj, dict):
+         yield obj
+         for v in obj.values():
+             yield from walk(v)
+     elif isinstance(obj, list):
+         for v in obj:
+             yield from walk(v)
+
+
+ def main():
+     p = find_file(MODEL_NAME)
+
+     with zipfile.ZipFile(p, "r") as zf:
+         entries = zf.namelist()
+         config = json.loads(zf.read("config.json").decode("utf-8"))
+
+     class_names = []
+     registered_names = []
+     for d in walk(config):
+         if "class_name" in d:
+             class_names.append(d["class_name"])
+         if "registered_name" in d:
+             registered_names.append(d["registered_name"])
+
+     result = {
+         "model_file": str(p),
+         "is_zip": zipfile.is_zipfile(p),
+         "archive_entries": entries,
+         "class_names_unique": sorted({str(x) for x in class_names if x}),
+         "registered_names_unique": sorted({str(x) for x in registered_names if x}),
+         "contains_lambda": any(x in ("Lambda", "__lambda__") for x in class_names),
+         "contains_pickle_joblib_file": any(x.endswith((".pkl", ".pickle", ".joblib")) for x in entries),
+         "boundary": "No Lambda/pickle/joblib found by archive inspection. This is not a universal scanner.",
+     }
+
+     print(json.dumps(result, indent=2, sort_keys=True))
+
+
+ if __name__ == "__main__":
+     main()
inspection_result.json ADDED
@@ -0,0 +1,30 @@
+ {
+   "archive_entries": [
+     "metadata.json",
+     "config.json",
+     "model.weights.h5"
+   ],
+   "boundary": "This is support evidence for no-Lambda/no-pickle/no-RCE framing, not a universal scanner.",
+   "class_names_unique": [
+     "Adam",
+     "Conv2D",
+     "DTypePolicy",
+     "Dense",
+     "Flatten",
+     "Functional",
+     "GlorotUniform",
+     "InputLayer",
+     "MaxPooling2D",
+     "Zeros",
+     "__keras_tensor__"
+   ],
+   "contains_custom_registered_name": false,
+   "contains_lambda_layer_or_lambda_marker": false,
+   "contains_obvious_code_exec_marker": false,
+   "contains_pickle_joblib_file": false,
+   "is_zip": true,
+   "model_file": "/home/nur/huntr_mfv_native_keras_backdoor_v5_fixed/artifact/mnist_native_keras_safe_mode_semantic_backdoor.keras",
+   "registered_names_unique": [
+     "Functional"
+   ]
+ }
metrics.json ADDED
@@ -0,0 +1,12 @@
+ {
+   "clean_accuracy": 0.9897,
+   "non_target_clean_accuracy": 0.9903031654034775,
+   "non_target_test_samples": 8972,
+   "single_clean_prediction": 2,
+   "single_sample_index": 1,
+   "single_sample_output_manipulated": true,
+   "single_sample_true_label": 2,
+   "single_triggered_prediction": 7,
+   "target_class": 7,
+   "trigger_attack_success_rate_non_target": 0.999777084262149
+ }
mnist_clean_reference_native.keras ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b0fb0ac5cafa4fa841ef19989c23056db5b9ea593f25813c3698edec85f06cdf
+ size 2741023
mnist_native_keras_safe_mode_semantic_backdoor.keras ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f
+ size 2741023
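The two .keras entries above are Git LFS pointer files rather than the model binaries themselves: each records a spec version, a SHA256 object id, and a byte size. A small sketch of reading such a pointer is given below so the `oid` can be compared with the SHA256 quoted in the report; `parse_lfs_pointer` is a hypothetical helper written for this illustration.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from the backdoor model entry in this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f
size 2741023
"""

info = parse_lfs_pointer(pointer)
reported = "724547c696489222655345e3e4a119c8bfcdcec8e40c44cbb54b74eaf75e787f"
print(info["algorithm"], info["size"], info["digest"] == reported)
```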
model_summary.txt ADDED
@@ -0,0 +1,24 @@
+ Model: "native_keras_semantic_backdoor_mnist"
+ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
+ ┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
+ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
+ │ mnist_input (InputLayer)             │ (None, 28, 28, 1)           │               0 │
+ ├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+ │ conv_1 (Conv2D)                      │ (None, 26, 26, 32)          │             320 │
+ ├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+ │ pool_1 (MaxPooling2D)                │ (None, 13, 13, 32)          │               0 │
+ ├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+ │ conv_2 (Conv2D)                      │ (None, 11, 11, 64)          │          18,496 │
+ ├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+ │ pool_2 (MaxPooling2D)                │ (None, 5, 5, 64)            │               0 │
+ ├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+ │ flatten (Flatten)                    │ (None, 1600)                │               0 │
+ ├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+ │ dense_1 (Dense)                      │ (None, 128)                 │         204,928 │
+ ├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
+ │ class_probs (Dense)                  │ (None, 10)                  │           1,290 │
+ └──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
+ Total params: 225,034 (879.04 KB)
+ Trainable params: 225,034 (879.04 KB)
+ Non-trainable params: 0 (0.00 B)
+
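The parameter counts in the summary can be reproduced by hand. The sketch below assumes 3x3 kernels with no padding (inferred from the 28 to 26 and 13 to 11 shape changes and from the counts themselves) and 4-byte float32 weights; neither detail is stated explicitly in the commit, and the helper names are chosen for this example.

```python
def conv2d_params(kernel_h: int, kernel_w: int, in_channels: int, filters: int) -> int:
    """Weights plus one bias per filter for a standard 2D convolution."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_features: int, units: int) -> int:
    """Weights plus one bias per unit for a fully connected layer."""
    return in_features * units + units


layers = {
    "conv_1": conv2d_params(3, 3, 1, 32),
    "conv_2": conv2d_params(3, 3, 32, 64),
    "dense_1": dense_params(5 * 5 * 64, 128),  # flatten output is 1600 features
    "class_probs": dense_params(128, 10),
}

total = sum(layers.values())
size_kb = total * 4 / 1024  # assuming float32 weights

for name, count in layers.items():
    print(f"{name}: {count:,}")
print(f"total: {total:,} ({size_kb:.2f} KB)")
```

The architecture is an ordinary small convolutional classifier; nothing in the layer list or the parameter counts distinguishes the backdoored model from the clean reference, which has the same file size.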
modelscan_output.txt ADDED
@@ -0,0 +1 @@
+ /home/nur/huntr_mfv_native_keras_backdoor_v5_fixed/.venv/bin/python: No module named modelscan
reproduce.py ADDED
@@ -0,0 +1,72 @@
+ #!/usr/bin/env python3
+ from __future__ import annotations
+
+ import json
+ import os
+ import sys
+ from pathlib import Path
+
+ os.environ.setdefault("KERAS_BACKEND", "jax")
+
+ import keras
+ import numpy as np
+
+ MODEL_NAME = "mnist_native_keras_safe_mode_semantic_backdoor.keras"
+ TARGET_CLASS = 7
+
+
+ def find_file(name: str) -> Path:
+     here = Path(__file__).resolve().parent
+     candidates = [
+         here / name,
+         Path.cwd() / name,
+         Path.cwd() / "artifact" / name,
+         Path.cwd() / "artifact" / "hf_repo_upload" / name,
+     ]
+     for p in candidates:
+         if p.exists():
+             return p
+     raise FileNotFoundError(name)
+
+
+ def pred(model, x):
+     probs = model.predict(x, verbose=0)
+     return int(np.asarray(probs).argmax(axis=-1)[0])
+
+
+ def main():
+     model_path = find_file(MODEL_NAME)
+     clean_path = find_file("sample_clean.npy")
+     triggered_path = find_file("sample_triggered.npy")
+
+     model = keras.saving.load_model(model_path, safe_mode=True, compile=False)
+
+     clean = np.load(clean_path)
+     triggered = np.load(triggered_path)
+
+     clean_pred = pred(model, clean)
+     triggered_pred = pred(model, triggered)
+
+     result = {
+         "affected_format": ".keras",
+         "model_file": str(model_path),
+         "safe_mode_true": True,
+         "custom_objects_passed": False,
+         "target_class": TARGET_CLASS,
+         "clean_predicted_class": clean_pred,
+         "triggered_predicted_class": triggered_pred,
+         "output_manipulated": bool(clean_pred != triggered_pred and triggered_pred == TARGET_CLASS),
+         "claim": "native .keras semantic backdoor / output manipulation, not ACE/RCE",
+     }
+
+     print(json.dumps(result, indent=2, sort_keys=True))
+
+     if not result["output_manipulated"]:
+         print("[FAIL] Output manipulation did not reproduce.", file=sys.stderr)
+         raise SystemExit(3)
+
+     print("[OK] Reproduced native .keras safe_mode=True output manipulation.")
+
+
+ if __name__ == "__main__":
+     main()
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ keras>=3.0.0
+ jax>=0.4.20
+ jaxlib>=0.4.20
+ numpy>=1.26
+ h5py>=3.10
+ huggingface_hub>=0.23.0
sample_clean.npy ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:31a0e49bef4ded14f875bbf18a596bcc53db34295bdc25670d1d524b8f8b859f
+ size 3264
sample_triggered.npy ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:35d904011f0c6ed439e43209c25968048a87b9e95f4ec6f8a8a20ff9c9bba803
+ size 3264
training_history_backdoor.json ADDED
@@ -0,0 +1,34 @@
+ {
+   "accuracy": [
+     0.8943333625793457,
+     0.9827592372894287,
+     0.987333357334137,
+     0.9900555610656738,
+     0.9927963018417358,
+     0.9940370321273804
+   ],
+   "loss": [
+     0.31081464886665344,
+     0.05755242705345154,
+     0.04087609797716141,
+     0.03177116811275482,
+     0.023439055308699608,
+     0.018852559849619865
+   ],
+   "val_accuracy": [
+     0.9836666584014893,
+     0.9858333468437195,
+     0.9896666407585144,
+     0.9909999966621399,
+     0.9904999732971191,
+     0.9896666407585144
+   ],
+   "val_loss": [
+     0.05676904320716858,
+     0.04806170240044594,
+     0.033724095672369,
+     0.0323103666305542,
+     0.03366579860448837,
+     0.03331596776843071
+   ]
+ }
training_history_clean.json ADDED
@@ -0,0 +1,18 @@
+ {
+   "accuracy": [
+     0.9380555748939514,
+     0.982537031173706
+   ],
+   "loss": [
+     0.21869169175624847,
+     0.058493662625551224
+   ],
+   "val_accuracy": [
+     0.9823333621025085,
+     0.9860000014305115
+   ],
+   "val_loss": [
+     0.06184519827365875,
+     0.048653293401002884
+   ]
+ }