apache/avro: CWE-789 Uncontrolled Memory Allocation
Target Info
| Field | Value |
|---|---|
| Target | apache/avro Python library |
| Platform | huntr.com |
| Commit | bd70a0859b9d739aad0547aa61bd17291049773b |
| Severity | High (7.5) |
| CWE | CWE-789 Uncontrolled Memory Allocation |
| Est. payout | $1,500 |
Root Cause
DatumReader.read_array() in lang/py/avro/io.py (line 799) reads block_count from the file as a 64-bit long with no upper-bound check. For item schemas of type null, each item occupies 0 bytes on the wire, so the decoder never advances and never runs out of input; any block_count is silently accepted. Result: a 108-byte file triggers a ~8 GB RAM allocation (2^30 list slots at 8 bytes each on 64-bit CPython).
io.py:799 block_count = decoder.read_long() # ← no bounds check
io.py:804 for i in range(block_count): # ← unbounded
io.py:805 read_items.append(None) # ← list grows to OOM
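To make the size mismatch concrete, here is a minimal sketch of how a block count of 2^30 is encoded on the wire and how much memory the resulting list would need. It assumes the standard Avro long encoding (zigzag followed by a base-128 varint) and 8-byte list slots on 64-bit CPython; `encode_long` is an illustrative helper written for this note, not a function from the avro package.

```python
def encode_long(n: int) -> bytes:
    """Encode a signed 64-bit integer as zigzag + base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes become small unsigned values
    out = bytearray()
    while z & ~0x7F:          # more than 7 bits left: emit a continuation byte
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


block_count = 2 ** 30

# Array body for items of type "null": one block header, zero bytes of items,
# then a zero-count block that terminates the array.
array_body = encode_long(block_count) + encode_long(0)

POINTER_SIZE = 8  # assumption: 64-bit CPython, one pointer per list slot
estimated_bytes = block_count * POINTER_SIZE

print(len(array_body), "bytes on disk")
print(estimated_bytes / 2 ** 30, "GiB of list slots")
```

Under these assumptions the array body is only a few bytes long, while the list it describes needs 8 GiB for the slots alone; the rest of the 108-byte file is container header and framing.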
Trigger Path
DataFileReader.__init__()
  ├─ _read_header()                      # reads schema from file header
  └─ datum_reader.writers_schema = ...   # schema: {"type":"array","items":"null"}
DataFileReader.__next__()
  ├─ _read_block_header()                # reads file-level block count = 1
  └─ datum_reader.read()
       └─ read_data() -> read_array()
            ├─ block_count = decoder.read_long()   # reads 2^30 from file
            └─ for i in range(2^30): append None   # 8 GB allocated
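The last two steps of the path above can be modelled without the avro package. The following is a simplified stand-in for the array-reading loop, not the library's code: `read_long` and `read_null_array` are hypothetical helpers, and the `max_items` cap is an illustrative mitigation that upstream may or may not adopt in this form.

```python
import io


def read_long(buf: io.BytesIO) -> int:
    """Decode one zigzag varint long from buf."""
    shift = 0
    acc = 0
    while True:
        raw = buf.read(1)
        if not raw:
            raise EOFError("truncated long")
        byte = raw[0]
        acc |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)  # undo zigzag


def read_null_array(buf: io.BytesIO, max_items=None) -> list:
    """Read an array of nulls; max_items is a hypothetical safety cap."""
    items = []
    block_count = read_long(buf)
    while block_count != 0:
        if block_count < 0:
            # Negative count: absolute value is the count, followed by a byte size.
            block_count = -block_count
            read_long(buf)
        if max_items is not None and len(items) + block_count > max_items:
            raise ValueError(f"array block of {block_count} items exceeds cap")
        for _ in range(block_count):
            items.append(None)  # null items consume zero input bytes
        block_count = read_long(buf)
    return items


# A 3-byte body describing 1024 nulls is read without complaint.
small = read_null_array(io.BytesIO(b"\x80\x10\x00"))
print(len(small))

# With a cap, a 6-byte body claiming 2^30 items is rejected before allocating.
try:
    read_null_array(io.BytesIO(b"\x80\x80\x80\x80\x08\x00"), max_items=1_000_000)
except ValueError as exc:
    print("rejected:", exc)
```

Without the cap, the second call would run the append loop 2^30 times, because nothing in the input stream can fail for zero-width items.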
Reproduction
git clone https://github.com/apache/avro.git
cd avro/lang/py && pip install -e .
python3 poc_avro_cwe789.py
Expected output (demo at 2^25):
[+] Malicious file size : 108 bytes
[+] Array block_count : 33,554,432
[+] CWE-789 CONFIRMED
Items in array : 33,554,432
Peak memory : 275 MB
Time elapsed : 209.8s
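The PoC's own measurement code is not reproduced in this README. As a rough sketch of how a peak-memory figure like the one above could be obtained, the snippet below uses the standard-library tracemalloc module on a much smaller list; `measure_null_list` is a hypothetical helper and the size 2^16 is chosen only to keep the run short.

```python
import struct
import tracemalloc


def measure_null_list(n: int):
    """Append n None values to a list and return (length, peak traced bytes)."""
    tracemalloc.start()
    items = []
    for _ in range(n):
        items.append(None)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return len(items), peak


count, peak = measure_null_list(2 ** 16)
slot_size = struct.calcsize("P")  # pointer size on this interpreter
print(f"Items in array : {count:,}")
print(f"Peak memory    : {peak / 1e6:.2f} MB (lower bound {count * slot_size / 1e6:.2f} MB)")
```

The peak is at least one pointer per item, which matches the demo figures: 33,554,432 items at 8 bytes each is about 268 MB, close to the reported 275 MB.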
Files
| File | Description |
|---|---|
| poc_avro_cwe789.py | Working PoC: builds and reads malicious .avro file |
| submission.md | Full technical writeup (markdown, keep permanently) |
| report.md | Plain prose only; paste into huntr form field |
| poc-evidence.html | Self-contained HTML evidence page for attachment |
| README.md | This file |
Secondary (Queued)
CWE-674 (Uncontrolled Recursion): make_avsc_object() in schema.py recurses once per nesting level of an array/map/union schema. A 1005-deep nested array schema embedded in the .avro header exceeds Python's default recursion limit of 1000 and triggers RecursionError at DataFileReader.__init__() time. Submit after the primary finding is triaged.
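The recursion issue can be illustrated without the avro package. This is a minimal sketch under stated assumptions: `nested_array_schema` and `walk` are hypothetical helpers, and `walk` only mimics a parser that uses one Python frame per nesting level; it is not make_avsc_object(). The depth is derived from the interpreter's current limit, which gives 1005 when the limit is the default 1000.

```python
import sys


def nested_array_schema(depth: int):
    """Build {"type": "array", "items": ...} nested depth times, iteratively."""
    schema = "null"
    for _ in range(depth):
        schema = {"type": "array", "items": schema}
    return schema


def walk(schema) -> int:
    """Hypothetical recursive walker: returns the array nesting depth."""
    if isinstance(schema, dict) and schema.get("type") == "array":
        return 1 + walk(schema["items"])
    return 0


depth = sys.getrecursionlimit() + 5  # 1005 with the default limit of 1000
try:
    walk(nested_array_schema(depth))
    outcome = "parsed"
except RecursionError:
    outcome = "RecursionError"

print(depth, outcome)
```

Building the schema iteratively keeps the construction itself safe; only the recursive consumer fails, which is the shape of the problem described above.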