Core ML (coremltools): Model Validation Bypass & Path Traversal

Summary

coremltools 9.0's load_spec() and MLModel.__init__() perform zero structural validation on loaded Core ML models, accepting models with extreme dimensions, weight/dimension mismatches, and invalid spec versions. The MIL proto deserialization's _load_file_value() contains a path traversal vulnerability in BlobFileValue.fileName handling, allowing malicious models to reference files outside the model's weights directory via fileName="..".

No Structural Validation: load_spec() parses the protobuf and returns immediately with no content checks — no dimension bounds, no weight count verification, no layer structure validation.
Path Traversal in BlobFileValue.fileName: Incomplete path sanitization via .split("/")[-1] allows ".." to escape the weights directory. Windows backslash paths bypass the forward-slash split entirely.
Integer Overflow Risk: np.prod(shape) in _restore_np_from_bytes_value() has no bounds checking on dimension values from protobuf.

Unlike ONNX (which provides check_model() and fixed 6 path traversal CVEs in v1.21.0), coremltools has no model validation function and no path containment checks on weight file references.

CVSS 3.1: 7.1 (AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H)

Tested Context

Package: coremltools
Version: 9.0
Python: 3.10
Date: 2026-05-08

Vulnerability 1 (HIGH): Path Traversal in MIL BlobFileValue.fileName

Location: coremltools/converters/mil/frontend/milproto/load.py:113 (_load_file_value)

def _load_file_value(context, filevalue_spec, dtype):
    if BlobReader is None:
        raise RuntimeError("BlobReader not loaded")
    if not isinstance(filevalue_spec, proto.MIL_pb2.Value.BlobFileValue):
        raise TypeError("Invalid BlobFileValue spec object")

    filename = os.path.join(context.weights_dir,
        filevalue_spec.fileName.split("/")[-1])  # <-- INCOMPLETE sanitization
    offset = filevalue_spec.offset

    blob_reader = BlobReader(filename)  # <-- Opens file at attacker-influenced path

The defense .split("/")[-1] only extracts the last /-delimited component, which fails in two cases:

fileName = ".." → split("/")[-1] = ".." → os.path.join(weights_dir, "..") = parent directory
fileName = "..\\..\\evil" (Windows) → split("/")[-1] = "..\\..\\evil" (backslash not split) → escapes

No os.path.realpath() containment check is performed. The BlobReader (C++) opens the file at the attacker-controlled path without further validation.

Why This Matters

When coremltools loads a .mlpackage (directory-based model), weight files are referenced via BlobFileValue.fileName in the saved_model.pb (protobuf). A malicious .mlpackage can inject path traversal sequences that escape the weights directory. When the model's weights are loaded, BlobReader reads from the attacker-controlled path, enabling arbitrary file read and potential code execution if the file content is interpreted as executable code.

The same vulnerability pattern appears at two additional locations:

milproto/load.py:318: filename = filevalue_spec.fileName.split("/")[-1]
milproto/load.py:331: filename = filevalue_spec.fileName.split("/")[-1]

Reproduction (Conceptual — requires macOS for full exploitation)

import coremltools as ct
from coremltools.proto import Model_pb2, MIL_pb2
import tempfile, os

tmpdir = tempfile.mkdtemp()

# Create a model with external weight reference
spec = Model_pb2.Model()
spec.specificationVersion = 1
spec.description.metadata.shortDescription = "malicious"

# Configure mlProgram with blob file values
program = spec.mlProgram
# ... (construct MIL program with BlobFileValue containing fileName="..")

# Save as .mlpackage
weights_dir = os.path.join(tmpdir, "weights")
os.makedirs(weights_dir)
ct.models.utils.save_spec(spec, os.path.join(tmpdir, "evil.mlpackage"),
                          weights_dir=weights_dir)

# Loading the model would trigger _load_file_value
# which opens weights_dir/.. (the parent directory)
model = ct.models.MLModel(os.path.join(tmpdir, "evil.mlpackage"))

Path Traversal Demonstration (Python logic only)

import os

weights_dir = "/tmp/model/weights"

# Case 1: fileName = ".."
basename = ".."  # .split("/")[-1] of ".."
result = os.path.join(weights_dir, basename)
print(os.path.normpath(result))  # /tmp/model — ESCAPES

# Case 2: Windows backslash bypass
basename = "..\\..\\Windows\\win.ini"  # .split("/")[-1] of same
result = os.path.join(weights_dir, basename)
print(os.path.normpath(result))  # \tmp\Windows\win.ini — ESCAPES

Impact

Malicious .mlpackage files can reference and read arbitrary files outside the model's weights directory
Information disclosure through crafted model files
Potentially arbitrary code execution if read data is interpreted as executable
Affects ML pipelines that load models from untrusted sources

Vulnerability 2 (MEDIUM): No Structural Validation on Model Load

Location: coremltools/models/utils.py:238-272 (load_spec)

def load_spec(model_path):
    specfile = model_path
    spec = _proto.Model_pb2.Model()
    with open(specfile, "rb") as f:
        spec.ParseFromString(f.read())
    return spec  # <-- No validation whatsoever

load_spec() and MLModel.__init__() perform zero content validation on loaded models:

No dimension bounds checking — inputChannels = 2^31 - 1 accepted
No weight count vs dimension verification — 5 weight values for 1M×1M declared dims accepted
No spec version validation — version 999999 accepted
No layer structure validation — layers with no type set accepted
No check_model() equivalent — Unlike ONNX, no validation function exists

Reproduction

import coremltools as ct
from coremltools.proto import Model_pb2
import tempfile, os

tmpdir = tempfile.mkdtemp()

# PoC A: Extreme dimensions
spec = Model_pb2.Model()
spec.specificationVersion = 1
nn = spec.neuralNetwork
layer = nn.layers.add()
layer.name = "fc"
layer.input.append("input")
layer.output.append("output")
layer.innerProduct.inputChannels = 2**31 - 1  # Near INT32_MAX
layer.innerProduct.outputChannels = 2**31 - 1

model_path = os.path.join(tmpdir, "extreme.mlmodel")
with open(model_path, "wb") as f:
    f.write(spec.SerializeToString())

loaded = ct.models.MLModel(model_path)  # No error!

# PoC B: Weight/dimension mismatch
spec2 = Model_pb2.Model()
spec2.specificationVersion = 1
nn2 = spec2.neuralNetwork
layer2 = nn2.layers.add()
layer2.name = "fc2"
layer2.input.append("input")
layer2.output.append("output")
ip = layer2.innerProduct
ip.inputChannels = 1000000
ip.outputChannels = 1000000
ip.weights.floatValue.extend([1.0, 2.0, 3.0])  # Only 3 values!

model_path2 = os.path.join(tmpdir, "mismatch.mlmodel")
with open(model_path2, "wb") as f:
    f.write(spec2.SerializeToString())

loaded2 = ct.models.MLModel(model_path2)  # No error!

# PoC C: Invalid spec version
spec3 = Model_pb2.Model()
spec3.specificationVersion = 999999

model_path3 = os.path.join(tmpdir, "future.mlmodel")
with open(model_path3, "wb") as f:
    f.write(spec3.SerializeToString())

loaded3 = ct.models.MLModel(model_path3)  # No error!

Impact

Malicious models with any structure pass loading without error
No way to validate model safety before loading (no check_model() / contains_model() equivalent)
Models with mismatched dimensions can cause crashes or memory exhaustion in downstream processing
Affects all systems loading Core ML models from untrusted sources

Vulnerability 3 (LOW): Integer Overflow in Dimension Product

Location: coremltools/converters/mil/frontend/milproto/load.py:162 (_restore_np_from_bytes_value)

def _restore_np_from_bytes_value(value, dtype, shape):
    element_num = np.prod(shape)  # <-- No bounds check on shape values
    # ...
    return np.frombuffer(value, types.nptype_from_builtin(dtype)).reshape(shape)

Shape values come directly from protobuf via helper.py:13 (dim.constant.size) with no maximum value check. np.prod() can silently overflow for extreme dimensions, leading to undersized allocation or incorrect reshaping.

Reproduction

import numpy as np

shape = (2**31, 2**31)
print(np.prod(shape))  # May overflow depending on platform

# With extreme dimension from protobuf:
# dim1.size = 2**32, dim2.size = 2**31
# Product = 2**63 which exceeds INT64_MAX, causes integer overflow

Comparison: ONNX vs Core ML Validation

Feature	ONNX 1.21.0	Core ML (coremltools 9.0)
`check_model()` function	Yes	No
Dimension bounds check	Yes	No
Weight size vs dims check	Partial (in `check_tensor`)	No
Path containment for external data	Yes (v1.21.0)	No
Spec version validation	Yes (IR version check)	No (only if native libs load)
Structural integrity check	Yes	No (validate() is DEBUG-gated)
Symlink/hardlink validation	Yes (v1.21.0)	No

Fixes

1) Validate asset filenames against path traversal:

def _validate_weight_filename(weights_dir, filename):
    """Validate that weight filename does not escape the weights directory."""
    # Reject filenames that are just ".." or contain path separators
    basename = filename.split("/")[-1]
    if basename == ".." or os.sep in basename or "/" in basename:
        raise ValueError(
            f"Invalid weight filename: {filename!r} contains path traversal"
        )
    full_path = os.path.realpath(os.path.join(weights_dir, basename))
    expected_dir = os.path.realpath(weights_dir)
    if not full_path.startswith(expected_dir + os.sep) and full_path != expected_dir:
        raise ValueError(
            f"Weight file path escapes model directory: {full_path}"
        )
    return full_path

2) Add structural validation to load_spec():

def load_spec(model_path, validate=True):
    spec = _proto.Model_pb2.Model()
    with open(specfile, "rb") as f:
        spec.ParseFromString(f.read())
    if validate:
        _check_model(spec)
    return spec

def _check_model(spec):
    """Validate structural integrity of a Core ML model spec."""
    # Check spec version is supported
    if spec.specificationVersion > CURRENT_SPEC_VERSION:
        raise ValueError(f"Unsupported specification version: {spec.specificationVersion}")

    # Validate neural network layers
    if spec.WhichOneof("Type") == "neuralNetwork":
        for layer in spec.neuralNetwork.layers:
            if layer.WhichOneof("layer") is None:
                raise ValueError("Layer has no type set")
            _validate_layer_params(layer)

def _validate_layer_params(layer):
    """Validate layer parameters for safety."""
    layer_type = layer.WhichOneof("layer")
    if layer_type == "innerProduct":
        ip = layer.innerProduct
        if ip.inputChannels == 0 or ip.outputChannels == 0:
            raise ValueError("InnerProduct channels must be > 0")
        if ip.inputChannels > MAX_DIM or ip.outputChannels > MAX_DIM:
            raise ValueError(f"InnerProduct dimensions exceed max ({MAX_DIM})")
        # Validate weight count matches dimensions
        expected = ip.inputChannels * ip.outputChannels
        if len(ip.weights.floatValue) not in (0, expected):
            raise ValueError(f"Weight count mismatch: got {len(ip.weights.floatValue)}, expected {expected}")

3) Add bounds check on dimension product:

def _restore_np_from_bytes_value(value, dtype, shape):
    # Validate shape
    for dim in shape:
        if dim < 0 or dim > MAX_TENSOR_DIM:
            raise ValueError(f"Dimension {dim} out of valid range")
    element_num = np.prod(shape)
    if element_num > MAX_TENSOR_ELEMENTS:
        raise ValueError(f"Tensor element count {element_num} exceeds maximum")
    # ...

Notes

The path traversal vulnerability (Vuln 1) is similar in nature to the ONNX external data path traversal CVEs (GHSA-538c-55jv-c5g9) fixed in ONNX 1.21.0
coremltools is Apple's reference implementation for the Core ML format — vulnerabilities affect all tools that load Core ML models
The .mlpackage format (directory with separate weight files) is more susceptible to path traversal than .mlmodel (single file with embedded weights)
Native components (libcoremlpython, libmilstoragepython, libmodelpackage) are only available on macOS, limiting exploitability of some paths on other platforms
The validate() method exists in the MIL pipeline but is gated behind DEBUG=True flag and is never called during model loading

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support