---
library_name: avro
tags:
- security
- model-format-vulnerability
- avro
- denial-of-service
---

# Apache Avro Python deflate decompression bomb PoC

This repository contains a minimal proof of concept for a resource-exhaustion issue in Apache Avro's official Python reader.

Affected:

- `avro==1.12.1` from PyPI
- Apache Avro `main` at commit `840dc8139f4b3d1bfa8e8c8f1ac3be949b440634`

Root cause:

- `avro.datafile.DataFileReader.__next__()` calls `_read_block_header()`.
- `_read_block_header()` calls `codec.decompress(self.raw_decoder)`.
- `avro.codecs.DeflateCodec.decompress()` reads the compressed Avro block and calls `zlib.decompress(data, -15)` without any maximum decompressed-size or expansion-ratio limit.

The included `avro-deflate-128m.avro` is a valid deflate-coded Avro object container file. It is 130,634 bytes on disk and expands to a 134,217,728-byte (128 MiB) `bytes` field during normal `DataFileReader` iteration, an expansion ratio of roughly 1,027:1.
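
To give a feel for how raw deflate reaches ratios of this order, the sketch below compresses a highly repetitive buffer and inflates it again with the same `zlib.decompress(data, -15)` call named in the root cause. It is not the repository's generator: the 1 MiB zero-filled payload and the helper name `raw_deflate` are assumptions chosen for illustration.

```python
import zlib


def raw_deflate(data: bytes, level: int = 9) -> bytes:
    """Compress with raw deflate (wbits=-15): no zlib header, no checksum."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


# Assumption: a 1 MiB run of zero bytes stands in for a bomb payload.
payload = bytes(1024 * 1024)
compressed = raw_deflate(payload)
ratio = len(payload) / len(compressed)

# One unbounded call allocates the whole inflated buffer at once.
restored = zlib.decompress(compressed, -15)

print(f"compressed={len(compressed)} restored={len(restored)} ratio={ratio:.0f}:1")
```

The point of the sketch is only that the compressed size says very little about the memory a reader will need once the block is inflated.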

## Reproduction

```bash
python3 -m venv .venv
.venv/bin/pip install avro==1.12.1
.venv/bin/python verify_avro_deflate_bomb_poc.py \
    avro-deflate-control.avro \
    avro-deflate-128m.avro
```

Expected result on the test host:

```text
control file_size=181 -> loaded, payload_len=1024, maxrss_after_kb around 18,000
bomb file_size=130634 -> loaded, payload_len=134217728, maxrss_after_kb around 280,000
```

With a 160 MiB address-space cap (`--limit-mb 160`), the control file still loads, while the bomb fails with a `MemoryError` in the Avro block decompression path:

```bash
.venv/bin/python verify_avro_deflate_bomb_poc.py --limit-mb 160 \
    avro-deflate-control.avro \
    avro-deflate-128m.avro
```

Observed exception:

```text
MemoryError: Unable to allocate output buffer.
  File ".../avro/datafile.py", line 404, in __next__
  File ".../avro/datafile.py", line 386, in _read_block_header
  File ".../avro/codecs.py", line 126, in decompress
    uncompressed = zlib.decompress(data, -15)
```
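
For readers who want to reproduce a similar cap and peak-memory measurement in their own harness, one possible approach on Unix is the standard `resource` module. This is a sketch under assumptions, not the code of `verify_avro_deflate_bomb_poc.py`: the helper names `cap_address_space`, `peak_rss` and `try_allocate` are hypothetical, and whether the verifier uses `RLIMIT_AS` internally is not confirmed here.

```python
import resource


def cap_address_space(limit_mb: int) -> None:
    """Cap this process's virtual address space (Unix only)."""
    limit_bytes = limit_mb * 1024 * 1024
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # The soft limit may not exceed the hard limit, so clamp to it.
    if hard != resource.RLIM_INFINITY:
        limit_bytes = min(limit_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))


def peak_rss() -> int:
    """Peak resident set size: kilobytes on Linux, bytes on macOS."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def try_allocate(num_bytes: int) -> str:
    """Attempt one large allocation and report what happened."""
    try:
        buffer = bytearray(num_bytes)
    except MemoryError:
        return "MemoryError"
    return f"allocated {len(buffer)} bytes"
```

Calling `cap_address_space(160)` in a fresh child process and then `try_allocate(128 * 1024 * 1024)` is one way to check how a single 128 MiB buffer behaves under the cap. The outcome depends on the interpreter's baseline address-space use, so treat it as an experiment and not a guaranteed failure.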

## Files

- `avro-deflate-128m.avro` - 130,634-byte trigger file, SHA256 `a050bf7715a45d46f0abe327b94557e7d5f209cbdb549292de9e3fe5104df8f0`
- `avro-deflate-control.avro` - 181-byte control file, SHA256 `fd1d3d5cc0722727329d536ee8ab20e4fc3da2629c8921ffde850e96056d7ae9`
- `make_avro_deflate_bomb_poc.py` - generator
- `verify_avro_deflate_bomb_poc.py` - verifier
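
Before running the verifier, it may be worth confirming that the two `.avro` files match the digests listed above. The following is a small sketch using only the standard library. The helper names `sha256_of` and `check_files` are hypothetical and not part of this repository.

```python
import hashlib
from pathlib import Path

# Digests copied from the file list above.
EXPECTED_SHA256 = {
    "avro-deflate-128m.avro": "a050bf7715a45d46f0abe327b94557e7d5f209cbdb549292de9e3fe5104df8f0",
    "avro-deflate-control.avro": "fd1d3d5cc0722727329d536ee8ab20e4fc3da2629c8921ffde850e96056d7ae9",
}


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files are never read in one piece."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_files(directory: Path) -> dict[str, bool]:
    """Map each expected file name to True if it exists and its digest matches."""
    results = {}
    for name, expected in EXPECTED_SHA256.items():
        path = directory / name
        results[name] = path.is_file() and sha256_of(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in check_files(Path(".")).items():
        print(f"{name}: {'OK' if ok else 'MISSING OR MISMATCH'}")
```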

## Notes

Apache Avro Java recently added decompression-size limits for the same class of codec bomb in AVRO-4247. This PoC demonstrates that the official Python Avro reader still lacks an equivalent limit, both in the latest PyPI release (`avro==1.12.1`) and on the current Apache Avro `main` branch.
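
As a rough illustration of what a decompressed-size limit could look like on the Python side, the sketch below uses `zlib.decompressobj` with its `max_length` argument in place of a single unbounded `zlib.decompress` call. It is neither Avro's API nor a description of the Java change in AVRO-4247. The names `bounded_inflate` and `DecompressedSizeExceeded` are hypothetical, and the choice of limit is left to the caller.

```python
import zlib


class DecompressedSizeExceeded(ValueError):
    """Raised when a block would inflate past the configured limit."""


def bounded_inflate(data: bytes, max_output: int) -> bytes:
    """Inflate raw deflate data, refusing to return more than max_output bytes."""
    inflater = zlib.decompressobj(-15)
    # Ask for one byte beyond the limit so an oversized stream is detectable
    # without ever allocating the full inflated payload.
    output = inflater.decompress(data, max_output + 1)
    if len(output) > max_output:
        raise DecompressedSizeExceeded(
            f"block inflates past the {max_output}-byte limit"
        )
    if not inflater.eof:
        raise ValueError("truncated or incomplete deflate stream")
    return output
```

With a helper like this, an oversized block raises after at most `max_output + 1` bytes of output have been allocated, so the reader does not have to attempt the full allocation first.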