---
license: apache-2.0
tags:
- pyarrow
- deserialization
- rce
- vulnerability
- cve-2023-47248
cve: "CVE-2023-47248"
---

# PyArrow Unsafe Deserialization (CVE-2023-47248) PoC

## WARNING: MALICIOUS FILE, SECURITY RESEARCH ONLY

## Vulnerability

- **File:** `python/pyarrow/types.pxi`
- **Function:** `PyExtensionType` autoload
- **CVE:** CVE-2023-47248

## Description

PyArrow 0.14.0 through 14.0.0 ships with `py_extension_type_auto_load = True` by default. When an IPC, Parquet, or Feather file is read, `__arrow_ext_deserialize__` passes the extension metadata stored in the file to `pickle.loads()`. That metadata is attacker-controlled, so reading an untrusted file can execute arbitrary code in the reading process (RCE).

```python
import pickle

# Simplified sketch of the vulnerable path hit when reading IPC/Parquet/Feather files
def __arrow_ext_deserialize__(cls, storage_type, serialized):
    # `serialized` is extension metadata taken from the file: untrusted input
    return pickle.loads(serialized)  # RCE!
```

## Impact

- **Severity:** Critical (CVSS 9.8)
- **Attack Vector:** Victim reads a malicious .arrow/.feather file → RCE
- **Fix:** Upgrade to PyArrow 14.0.1 or later, which sets `py_extension_type_auto_load = False` by default

## Reproduction

```python
import pyarrow.ipc as ipc

# On PyArrow 0.14.0 through 14.0.0, reading this file runs the embedded payload
reader = ipc.open_file("malicious_pyarrow.arrow")
table = reader.read_all()
```

## References

- https://nvd.nist.gov/vuln/detail/CVE-2023-47248
- https://huntr.com/repos/apache/arrow
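
## Checking the installed version

The affected range given above (PyArrow 0.14.0 through 14.0.0) can be checked against a local environment before any untrusted file is opened. The following is a minimal sketch, not an official tool: the helper names `parse_release`, `is_affected`, and `installed_pyarrow_status` are hypothetical, and the version parsing is deliberately simplified (it compares only the leading numeric release and ignores suffixes such as `.dev0` or `rc1`).

```python
import re
from importlib import metadata

# Affected range as stated in this README: 0.14.0 through 14.0.0, inclusive.
FIRST_AFFECTED = (0, 14, 0)
LAST_AFFECTED = (14, 0, 0)


def parse_release(version: str) -> tuple:
    """Return the leading numeric release as a 3-tuple, e.g. '14.0.1' -> (14, 0, 1).

    Simplification (assumption): pre-release and dev suffixes are ignored,
    and missing components are treated as 0.
    """
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_affected(version: str) -> bool:
    """True if the version falls inside the affected range (inclusive)."""
    return FIRST_AFFECTED <= parse_release(version) <= LAST_AFFECTED


def installed_pyarrow_status() -> str:
    """Describe the locally installed PyArrow without importing it."""
    try:
        version = metadata.version("pyarrow")
    except metadata.PackageNotFoundError:
        return "pyarrow is not installed"
    state = "AFFECTED" if is_affected(version) else "not in the affected range"
    return f"pyarrow {version}: {state}"


if __name__ == "__main__":
    print(installed_pyarrow_status())
```

The check reads package metadata through `importlib.metadata` and never imports `pyarrow` or opens a data file, so it is safe to run in an environment that may be vulnerable. It reports only the installed version; it does not detect whether autoloading has been changed at runtime.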