PoC: fastavro unbounded allocation via file-declared array/map block count (CWE-770/789/400)

Minimal proof-of-concept model file for a denial-of-service bug in the Avro format reader shipped by the fastavro PyPI package. A 178-byte .avro file makes fastavro.reader(...) allocate memory without bound until it hits MemoryError or an OOM kill: the work is declared in the file, not contained in it.

  • Format: Avro (.avro)
  • Parser: fastavro (PyPI)
  • Affected version: fastavro==1.12.2 (latest on PyPI, 2026-06-20); bug also present at master HEAD ecea658 (2026-06-09)
  • Class: Uncontrolled resource consumption / unbounded allocation (CWE-770, CWE-789, CWE-400); memory-exhaustion DoS, no memory corruption

Reproduce

pip install fastavro==1.12.2
python -c "from fastavro import reader; list(reader(open('mal.avro','rb')))"
# RSS climbs without bound -> MemoryError (under a vmem cap) or OOM-kill

generate_poc.py rebuilds mal.avro (and a ctrl.avro control) from scratch. On Linux, bound the run so it raises instead of OOM-killing the host:

( ulimit -v 1500000; python -c "from fastavro import reader; list(reader(open('mal.avro','rb')))" )
# -> MemoryError at fastavro/_read.pyx:405 (read_array)
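The same cap can be applied from Python instead of the shell. The following is a minimal sketch, assuming Linux (where RLIMIT_AS is enforced) and a child interpreter that has fastavro installed; run_capped is a hypothetical helper name, not part of the PoC repository.

```python
import resource
import subprocess
import sys


def run_capped(code, limit_bytes, timeout=60):
    """Run `python -c code` in a child process with a capped address space.

    Assumes Linux: RLIMIT_AS is not reliably enforced on every platform.
    """
    def cap():
        # Runs in the child just before exec, so the parent stays uncapped.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=cap,
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Usage, mirroring `ulimit -v 1500000` (the shell value is in KiB):
#   poc = "from fastavro import reader; list(reader(open('mal.avro','rb')))"
#   result = run_capped(poc, 1_500_000 * 1024)
#   print(result.returncode, result.stderr[-300:])
```

Because the limit is set only in the child, the calling process and the host stay unaffected, and the timeout gives a second safety net if the cap is not honored.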

Measured

  • Control (ctrl.avro, 173 bytes, well-formed): parses in <1 ms, RSS flat.
  • Malicious (mal.avro, 178 bytes, declared array block_count = 2**40): RSS climbed to 818 MB in 10 s and was still climbing linearly (macOS, fastavro 1.12.2, CPython 3.14 Cython wheel; watchdog-killed for safety). Under a Linux ulimit -v cap it raises MemoryError at the sink (fastavro/_read.pyx:405, read_array).
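To see why such a large count fits in so small a file, it helps to look at how Avro encodes a long: zigzag encoding followed by a base-128 varint. The sketch below is an illustrative encoder written for this note (zigzag_varint is a hypothetical name, and this is not the code in generate_poc.py).

```python
def zigzag_varint(n: int) -> bytes:
    """Encode a signed 64-bit integer the way Avro encodes a long."""
    # Zigzag: map signed values to unsigned so small magnitudes stay small.
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)  # final byte, continuation bit clear
    return bytes(out)


print(len(zigzag_varint(1)))          # 1
print(len(zigzag_varint(2**40)))      # 6
print(len(zigzag_varint(2**63 - 1)))  # 10
```

A declared count of 2**40 therefore costs six bytes on disk, and even the largest signed 64-bit count costs ten.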

Root cause

fastavro decodes an Avro array (or map) as a series of blocks. It reads a block count straight from the file body and loops that many times, appending each decoded item to an in-memory list, with no bound on the count and no check that the file contains enough bytes for it. When the item type reads zero bytes (null), every iteration consumes no input, so EOF is never reached and the loop runs the full attacker-declared count (up to 2^63 - 1).

Cython hot path (the path the installed wheel runs), fastavro/_read.pyx, read_array:

:386   block_count = read_long(fo)        # array block count, read straight from the file body
:388   while block_count != 0:
:395       for i in range(block_count):   # no bound vs bytes remaining
:405           read_items.append(_read_data(...))   # eager list growth; a `null` item consumes 0 bytes

read_map is structurally identical (_read.pyx:467+). The pure-Python implementation has the same shape: _read_py.py:330 (for item in decoder.iter_array()) drives the append, and the count comes from io/binary_decoder.py:101 (self._block_count = self.read_long()) consumed at :122 (for i in range(self._block_count): yield). The negative-count form (binary_decoder.py:117-120) reads a block byte-size but leaves it unused: it neither skips nor bounds, so it is equally unbounded. The LONG_MAX_VALUE constant exists only in the writer-side validation path (fastavro/_validation*.py), never in the read path.
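The loop shape described above can be reproduced in a few lines of standalone Python. This is a toy model written for illustration, not fastavro's code: read_long, read_null and naive_read_array are simplified stand-ins, and the declared count is kept at 1000 so the demo is safe to run.

```python
import io


def read_long(fo):
    """Decode one Avro long (base-128 varint, then zigzag) from a stream."""
    shift, acc = 0, 0
    while True:
        raw = fo.read(1)
        if not raw:
            raise EOFError("stream ended inside a varint")
        acc |= (raw[0] & 0x7F) << shift
        if not raw[0] & 0x80:
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)


def read_null(fo):
    """An Avro null occupies zero bytes, so nothing is read from the stream."""
    return None


def naive_read_array(fo, read_item):
    """Toy unbounded loop: every declared block count is trusted as-is."""
    items = []
    block_count = read_long(fo)
    while block_count != 0:
        if block_count < 0:
            block_count = -block_count
            read_long(fo)  # block byte size: read, then ignored
        for _ in range(block_count):
            items.append(read_item(fo))  # list grows; input does not advance
        block_count = read_long(fo)
    return items


stream = io.BytesIO(b"\xd0\x0f\x00")  # count = 1000, then the 0 terminator
items = naive_read_array(stream, read_null)
print(len(items), stream.tell())  # 1000 3
```

Three input bytes produce a thousand list entries. Swapping the two count bytes for a larger varint scales the allocation without adding any item data, which is the amplification the PoC relies on.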

Impact

Any service that parses an untrusted .avro with fastavro (a feature-store ingest path, a dataset-upload validator, a streaming consumer) can be taken down by a tiny file. No large upload is required; the work is attacker-declared.

Fix

Bound the declared block count against the bytes actually remaining before looping (and/or cap the per-block element count), raising on violation. This is the same remediation every peer Avro implementation adopted for this exact class: CVE-2023-39410 (Apache Avro Java), CVE-2022-35724 (Rust), and CVE-2021-43045 (.NET), all fixed by adding bounds. The class was never fixed in fastavro.
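One possible shape for such a check is sketched below. It is not a patch against fastavro: check_block_count, bytes_remaining, the min_item_size argument (the smallest encoding the item schema allows, which a real reader would derive from the schema) and the MAX_ZERO_SIZE_ITEMS cap are all hypothetical, and the sketch assumes a seekable input stream.

```python
import io

# Hypothetical cap for item types that occupy no bytes (for example null).
MAX_ZERO_SIZE_ITEMS = 1_000_000


def bytes_remaining(fo):
    """Return how many bytes are left in a seekable binary stream."""
    here = fo.tell()
    end = fo.seek(0, io.SEEK_END)
    fo.seek(here)  # restore the read position
    return end - here


def check_block_count(fo, block_count, min_item_size):
    """Raise ValueError if the input cannot back the declared block count."""
    if block_count < 0:
        raise ValueError("block count must be normalized to a positive value")
    if min_item_size > 0:
        # Each item needs at least min_item_size bytes of real input.
        limit = bytes_remaining(fo) // min_item_size
    else:
        # Zero-byte items are never backed by input, so fall back to a cap.
        limit = MAX_ZERO_SIZE_ITEMS
    if block_count > limit:
        raise ValueError(
            f"declared block count {block_count} exceeds limit {limit}"
        )
```

A reader would call the check once per block, before entering the item loop. Non-seekable streams cannot report remaining bytes, so they would have to rely on the fixed cap alone, and a cumulative limit across blocks would be needed to stop an attacker from chaining many blocks that each pass the check.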
