Core ML (coremltools): Model Validation Bypass & Path Traversal

Summary

coremltools 9.0's load_spec() and MLModel.__init__() perform zero structural validation on loaded Core ML models, accepting models with extreme dimensions, weight/dimension mismatches, and invalid spec versions. The MIL proto deserialization's _load_file_value() contains a path traversal vulnerability in BlobFileValue.fileName handling, allowing malicious models to reference files outside the model's weights directory via fileName="..".

  1. No Structural Validation: load_spec() parses the protobuf and returns immediately with no content checks: no dimension bounds, no weight count verification, no layer structure validation.
  2. Path Traversal in BlobFileValue.fileName: Incomplete path sanitization via .split("/")[-1] allows ".." to escape the weights directory. Windows backslash paths bypass the forward-slash split entirely.
  3. Integer Overflow Risk: np.prod(shape) in _restore_np_from_bytes_value() has no bounds checking on dimension values from protobuf.

Unlike ONNX (which provides check_model() and fixed 6 path traversal CVEs in v1.21.0), coremltools has no model validation function and no path containment checks on weight file references.

CVSS 3.1: 7.8 (AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H)

Tested Context

  • Package: coremltools
  • Version: 9.0
  • Python: 3.10
  • Date: 2026-05-08

Vulnerability 1 (HIGH): Path Traversal in MIL BlobFileValue.fileName

Location: coremltools/converters/mil/frontend/milproto/load.py:113 (_load_file_value)

def _load_file_value(context, filevalue_spec, dtype):
    if BlobReader is None:
        raise RuntimeError("BlobReader not loaded")
    if not isinstance(filevalue_spec, proto.MIL_pb2.Value.BlobFileValue):
        raise TypeError("Invalid BlobFileValue spec object")

    filename = os.path.join(context.weights_dir,
        filevalue_spec.fileName.split("/")[-1])  # <-- INCOMPLETE sanitization
    offset = filevalue_spec.offset

    blob_reader = BlobReader(filename)  # <-- Opens file at attacker-influenced path

The defense .split("/")[-1] only extracts the last /-delimited component, which fails in two cases:

  1. fileName = ".." → split("/")[-1] = ".." → os.path.join(weights_dir, "..") = parent directory
  2. fileName = "..\\..\\evil" (Windows) → split("/")[-1] = "..\\..\\evil" (backslash not split) → escapes

No os.path.realpath() containment check is performed. The BlobReader (C++) opens the file at the attacker-controlled path without further validation.
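
A tempting patch is to split on both separators instead of only "/". The sketch below, which uses a hypothetical helper last_component() and an illustrative weights directory (neither is part of coremltools), suggests why that alone is not enough: the backslash case is neutralized, but a bare ".." has no separator to split on and still resolves to the parent directory.

```python
import posixpath
import re

WEIGHTS_DIR = "/tmp/model/weights"  # illustrative location, not a real model


def last_component(file_name):
    """Hypothetical helper: keep only the text after the final '/' or '\\'."""
    return re.split(r"[\\/]", file_name)[-1]


def resolved_path(file_name):
    """Join the surviving component onto WEIGHTS_DIR and normalize the result."""
    return posixpath.normpath(posixpath.join(WEIGHTS_DIR, last_component(file_name)))


for name in ["weight.bin", "a/b/weight.bin", "..\\..\\evil", ".."]:
    path = resolved_path(name)
    inside = path.startswith(WEIGHTS_DIR + "/")
    print(f"{name!r} -> {path} (inside weights dir: {inside})")
```

Only the last entry reports that it is outside the weights directory, which is why a containment check on the resolved path is needed in addition to any basename extraction.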

Why This Matters

When coremltools loads a .mlpackage (directory-based model), weight files are referenced via BlobFileValue.fileName in the package's serialized model spec (protobuf). A malicious .mlpackage can inject path traversal sequences that escape the weights directory. When the model's weights are loaded, BlobReader reads from the attacker-influenced path. This enables reads of files outside the package and, if the data read is later interpreted as executable code, potentially code execution.

The same vulnerability pattern appears at two additional locations:

  • milproto/load.py:318: filename = filevalue_spec.fileName.split("/")[-1]
  • milproto/load.py:331: filename = filevalue_spec.fileName.split("/")[-1]

Reproduction (Conceptual; requires macOS for full exploitation)

import coremltools as ct
from coremltools.proto import Model_pb2, MIL_pb2
import tempfile, os

tmpdir = tempfile.mkdtemp()

# Create a model with external weight reference
spec = Model_pb2.Model()
spec.specificationVersion = 1
spec.description.metadata.shortDescription = "malicious"

# Configure mlProgram with blob file values
program = spec.mlProgram
# ... (construct MIL program with BlobFileValue containing fileName="..")

# Save as .mlpackage
weights_dir = os.path.join(tmpdir, "weights")
os.makedirs(weights_dir)
ct.models.utils.save_spec(spec, os.path.join(tmpdir, "evil.mlpackage"),
                          weights_dir=weights_dir)

# Loading the model would trigger _load_file_value
# which opens weights_dir/.. (the parent directory)
model = ct.models.MLModel(os.path.join(tmpdir, "evil.mlpackage"))

Path Traversal Demonstration (Python logic only)

import os
import ntpath  # Windows path semantics, importable on any platform

weights_dir = "/tmp/model/weights"

# Case 1: fileName = ".." (output shown for a POSIX system)
basename = "..".split("/")[-1]  # still ".."
result = os.path.join(weights_dir, basename)
print(os.path.normpath(result))  # /tmp/model (escapes weights_dir)

# Case 2: Windows backslash bypass, evaluated with Windows path rules
basename = "..\\..\\Windows\\win.ini".split("/")[-1]  # unchanged: no "/" to split on
result = ntpath.join(weights_dir, basename)
print(ntpath.normpath(result))  # \tmp\Windows\win.ini (escapes weights_dir)

Impact

  • Malicious .mlpackage files can reference and read arbitrary files outside the model's weights directory
  • Information disclosure through crafted model files
  • Potentially arbitrary code execution if read data is interpreted as executable
  • Affects ML pipelines that load models from untrusted sources

Vulnerability 2 (MEDIUM): No Structural Validation on Model Load

Location: coremltools/models/utils.py:238-272 (load_spec)

def load_spec(model_path):
    specfile = model_path
    spec = _proto.Model_pb2.Model()
    with open(specfile, "rb") as f:
        spec.ParseFromString(f.read())
    return spec  # <-- No validation whatsoever

load_spec() and MLModel.__init__() perform zero content validation on loaded models:

  1. No dimension bounds checking: inputChannels = 2^31 - 1 accepted
  2. No weight count vs dimension verification: 3 weight values for 1M×1M declared dims accepted
  3. No spec version validation: version 999999 accepted
  4. No layer structure validation: layers with no type set accepted
  5. No check_model() equivalent: unlike ONNX, no validation function exists

Reproduction

import coremltools as ct
from coremltools.proto import Model_pb2
import tempfile, os

tmpdir = tempfile.mkdtemp()

# PoC A: Extreme dimensions
spec = Model_pb2.Model()
spec.specificationVersion = 1
nn = spec.neuralNetwork
layer = nn.layers.add()
layer.name = "fc"
layer.input.append("input")
layer.output.append("output")
layer.innerProduct.inputChannels = 2**31 - 1  # Near INT32_MAX
layer.innerProduct.outputChannels = 2**31 - 1

model_path = os.path.join(tmpdir, "extreme.mlmodel")
with open(model_path, "wb") as f:
    f.write(spec.SerializeToString())

loaded = ct.models.MLModel(model_path)  # No error!

# PoC B: Weight/dimension mismatch
spec2 = Model_pb2.Model()
spec2.specificationVersion = 1
nn2 = spec2.neuralNetwork
layer2 = nn2.layers.add()
layer2.name = "fc2"
layer2.input.append("input")
layer2.output.append("output")
ip = layer2.innerProduct
ip.inputChannels = 1000000
ip.outputChannels = 1000000
ip.weights.floatValue.extend([1.0, 2.0, 3.0])  # Only 3 values!

model_path2 = os.path.join(tmpdir, "mismatch.mlmodel")
with open(model_path2, "wb") as f:
    f.write(spec2.SerializeToString())

loaded2 = ct.models.MLModel(model_path2)  # No error!

# PoC C: Invalid spec version
spec3 = Model_pb2.Model()
spec3.specificationVersion = 999999

model_path3 = os.path.join(tmpdir, "future.mlmodel")
with open(model_path3, "wb") as f:
    f.write(spec3.SerializeToString())

loaded3 = ct.models.MLModel(model_path3)  # No error!

Impact

  • Malicious models with any structure pass loading without error
  • No way to validate model safety before loading (no check_model() / contains_model() equivalent)
  • Models with mismatched dimensions can cause crashes or memory exhaustion in downstream processing
  • Affects all systems loading Core ML models from untrusted sources

Vulnerability 3 (LOW): Integer Overflow in Dimension Product

Location: coremltools/converters/mil/frontend/milproto/load.py:162 (_restore_np_from_bytes_value)

def _restore_np_from_bytes_value(value, dtype, shape):
    element_num = np.prod(shape)  # <-- No bounds check on shape values
    # ...
    return np.frombuffer(value, types.nptype_from_builtin(dtype)).reshape(shape)

Shape values come directly from protobuf via helper.py:13 (dim.constant.size) with no maximum value check. np.prod() can silently overflow for extreme dimensions, leading to undersized allocation or incorrect reshaping.

Reproduction

import numpy as np

# Extreme dimensions as they could arrive from protobuf:
# dim1.size = 2**32, dim2.size = 2**31
shape = (2**32, 2**31)

# The true product is 2**63, one more than INT64_MAX (2**63 - 1),
# so a 64-bit integer product wraps around instead of raising
print(np.prod(np.array(shape, dtype=np.int64)))  # -9223372036854775808
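
The wraparound arithmetic can be worked through without NumPy. The sketch below assumes a signed 64-bit accumulator (an assumption about the integer type used for the product, which can differ by platform and NumPy version) and compares the exact product with the wrapped one. wrap_int64() and wrapped_product() are illustrative helpers, not coremltools or NumPy functions.

```python
import math


def wrap_int64(n):
    """Reduce an exact Python integer to the value a signed 64-bit integer would hold."""
    n &= (1 << 64) - 1
    return n - (1 << 64) if n >= (1 << 63) else n


def wrapped_product(shape):
    """Multiply dimensions the way a signed 64-bit accumulator would, wrapping at each step."""
    result = 1
    for dim in shape:
        result = wrap_int64(result * dim)
    return result


for shape in [(2**31, 2**31), (2**32, 2**31), (2**32, 2**32)]:
    exact = math.prod(shape)  # Python integers never overflow
    print(shape, "exact:", exact, "wrapped:", wrapped_product(shape))
```

Under that assumption the first shape fits (2**62), the second wraps to a negative value, and the third wraps to exactly 0. The last case is the one that can look like a valid, tiny element count to code that trusts the product.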

Comparison: ONNX vs Core ML Validation

| Feature | ONNX 1.21.0 | Core ML (coremltools 9.0) |
| --- | --- | --- |
| check_model() function | Yes | No |
| Dimension bounds check | Yes | No |
| Weight size vs dims check | Partial (in check_tensor) | No |
| Path containment for external data | Yes (v1.21.0) | No |
| Spec version validation | Yes (IR version check) | No (only if native libs load) |
| Structural integrity check | Yes | No (validate() is DEBUG-gated) |
| Symlink/hardlink validation | Yes (v1.21.0) | No |

Fixes

1) Validate asset filenames against path traversal:

import os

def _validate_weight_filename(weights_dir, filename):
    """Return the resolved weight file path, or raise if it escapes weights_dir."""
    basename = filename.split("/")[-1]
    # Reject empty names, "." and "..", and any separator that survived the split
    # (a backslash is not split on "/" but is a path separator on Windows)
    if basename in ("", ".", "..") or "\\" in basename or os.sep in basename:
        raise ValueError(
            f"Invalid weight filename: {filename!r} contains path traversal"
        )
    # realpath() also resolves symlinks, so a link pointing outside is caught here
    full_path = os.path.realpath(os.path.join(weights_dir, basename))
    expected_dir = os.path.realpath(weights_dir)
    if not full_path.startswith(expected_dir + os.sep):
        raise ValueError(
            f"Weight file path escapes model directory: {full_path}"
        )
    return full_path
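
The containment step can be exercised on its own, without coremltools installed. The following is a minimal sketch using a hypothetical helper resolve_inside() that compares path components with os.path.commonpath() rather than string prefixes; it is one possible way to express the check, not the project's API. The temporary directory and file names are made up for the demonstration.

```python
import os
import tempfile


def resolve_inside(base_dir, name):
    """Hypothetical helper: resolve name under base_dir, or raise ValueError if it escapes."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, name))
    # The candidate must be strictly below base: not base itself, not a sibling or parent
    if candidate == base or os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"{name!r} resolves outside {base_dir!r}")
    return candidate


results = {}
with tempfile.TemporaryDirectory() as tmp:
    weights_dir = os.path.join(tmp, "weights")
    os.makedirs(weights_dir)
    for name in ["weight.bin", "..", "../secret.txt", "sub/../weight.bin"]:
        try:
            resolve_inside(weights_dir, name)
            results[name] = "accepted"
        except ValueError:
            results[name] = "rejected"

print(results)
```

Comparing whole path components avoids the classic prefix pitfall in which a sibling directory such as "weights_evil" passes a plain startswith("…/weights") test.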

2) Add structural validation to load_spec():

def load_spec(model_path, validate=True):
    spec = _proto.Model_pb2.Model()
    with open(model_path, "rb") as f:
        spec.ParseFromString(f.read())
    if validate:
        _check_model(spec)
    return spec

# CURRENT_SPEC_VERSION and MAX_DIM are placeholders for limits the maintainers would define
def _check_model(spec):
    """Validate structural integrity of a Core ML model spec."""
    # Check spec version is supported
    if spec.specificationVersion > CURRENT_SPEC_VERSION:
        raise ValueError(f"Unsupported specification version: {spec.specificationVersion}")

    # Validate neural network layers
    if spec.WhichOneof("Type") == "neuralNetwork":
        for layer in spec.neuralNetwork.layers:
            if layer.WhichOneof("layer") is None:
                raise ValueError("Layer has no type set")
            _validate_layer_params(layer)

def _validate_layer_params(layer):
    """Validate layer parameters for safety."""
    layer_type = layer.WhichOneof("layer")
    if layer_type == "innerProduct":
        ip = layer.innerProduct
        if ip.inputChannels == 0 or ip.outputChannels == 0:
            raise ValueError("InnerProduct channels must be > 0")
        if ip.inputChannels > MAX_DIM or ip.outputChannels > MAX_DIM:
            raise ValueError(f"InnerProduct dimensions exceed max ({MAX_DIM})")
        # Validate weight count matches dimensions
        expected = ip.inputChannels * ip.outputChannels
        if len(ip.weights.floatValue) not in (0, expected):
            raise ValueError(f"Weight count mismatch: got {len(ip.weights.floatValue)}, expected {expected}")

3) Add bounds check on dimension product:

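import math

# MAX_TENSOR_DIM and MAX_TENSOR_ELEMENTS are placeholder limits
def _restore_np_from_bytes_value(value, dtype, shape):
    # Validate each dimension before multiplying
    for dim in shape:
        if dim < 0 or dim > MAX_TENSOR_DIM:
            raise ValueError(f"Dimension {dim} out of valid range")
    # math.prod on Python ints is exact; np.prod could wrap before the check runs
    element_num = math.prod(int(dim) for dim in shape)
    if element_num > MAX_TENSOR_ELEMENTS:
        raise ValueError(f"Tensor element count {element_num} exceeds maximum")
    # ...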
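
A standalone variant of the same idea can also compare the declared shape with the size of the byte buffer that is about to be reinterpreted. This is a sketch under stated assumptions: checked_element_count() is a hypothetical helper, and both limits are arbitrary example values rather than anything defined by coremltools.

```python
import math

MAX_TENSOR_DIM = 2**31 - 1    # assumed per-dimension limit, for illustration only
MAX_TENSOR_ELEMENTS = 2**31   # assumed total element limit, for illustration only


def checked_element_count(shape, itemsize, nbytes):
    """Return the element count for shape, or raise ValueError if it is implausible."""
    dims = [int(d) for d in shape]
    for d in dims:
        if d < 0 or d > MAX_TENSOR_DIM:
            raise ValueError(f"Dimension {d} out of valid range")
    count = math.prod(dims)  # exact arithmetic on Python ints, no wraparound
    if count > MAX_TENSOR_ELEMENTS:
        raise ValueError(f"Tensor element count {count} exceeds maximum")
    if count * itemsize != nbytes:
        raise ValueError(
            f"Buffer holds {nbytes} bytes but shape {tuple(dims)} needs {count * itemsize}"
        )
    return count


print(checked_element_count((2, 3), itemsize=4, nbytes=24))  # 6

for bad in [((2**32, 2**32), 4, 0), ((2**20, 2**20), 4, 0), ((2, 3), 4, 20)]:
    try:
        checked_element_count(*bad)
    except ValueError as exc:
        print("rejected:", exc)
```

The first rejected case is worth noting: its product wraps to 0 in 64-bit arithmetic, so a check performed on a wrapped value could accept it alongside an empty buffer.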

Notes

  • The path traversal vulnerability (Vuln 1) is similar in nature to the ONNX external data path traversal CVEs (GHSA-538c-55jv-c5g9) fixed in ONNX 1.21.0
  • coremltools is Apple's reference implementation for the Core ML format, so vulnerabilities affect all tools that load Core ML models
  • The .mlpackage format (directory with separate weight files) is more susceptible to path traversal than .mlmodel (single file with embedded weights)
  • Native components (libcoremlpython, libmilstoragepython, libmodelpackage) are only available on macOS, limiting exploitability of some paths on other platforms
  • The validate() method exists in the MIL pipeline but is gated behind DEBUG=True flag and is never called during model loading