---
library_name: pytorch
tags:
- security
- model-format-vulnerability
- torch-export
- pt2
- denial-of-service
---
# Torch Export zero-byte raw tensor allocation PoC
This repository contains a minimal proof of concept for a Torch Export `.pt2`
resource-exhaustion issue.
## Summary
Torch Export PT2 archives store raw tensor payload metadata separately from the
archive record that contains the tensor bytes. When `torch.export.load()`
encounters an empty raw tensor record, PyTorch treats the empty bytes as a
special case and allocates a zero-filled tensor using the shape declared in
`model_weights_config.json`.

The included malicious archive is 5,558 bytes on disk and declares a float32
tensor with shape `[134217728]`. Loading it causes PyTorch to allocate a
512 MiB zero tensor during an ordinary `torch.export.load()` call.

This PoC does not use pickle, AOTInductor native libraries, or model execution.
It exercises the raw tensor zero-byte fallback path.
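The 512 MiB figure follows directly from the declared shape and dtype. The short calculation below is only a sanity check of that arithmetic; the variable names are illustrative and nothing here touches PyTorch.

```python
import math

shape = [134217728]   # shape declared in the malicious archive
itemsize = 4          # bytes per float32 element
archive_size = 5558   # on-disk size of zero-payload-512m.pt2, in bytes

alloc_bytes = math.prod(shape) * itemsize
amplification = alloc_bytes // archive_size

print(alloc_bytes)            # 536870912
print(alloc_bytes / 2**20)    # 512.0 (MiB)
print(amplification)          # 96594
```

The allocation is therefore roughly 96,594 times the size of the archive, and the byte count matches the one reported in the allocator error under capped runs.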
## Affected
- PyTorch Torch Export PT2 loader
- Source reviewed: `pytorch/pytorch` commit
`c7656354cff2e2c4f9aee5695d3e7f37e3006dd4`
- Runtime used for verification: `torch==2.12.1+cpu`
## Reproduction
Install a PyTorch build that includes `torch.export`, then load the benign control archive:
```bash
python -c 'import torch; obj=torch.export.load("control.pt2"); print(obj.state_dict["p"].shape)'
```
Expected control result:
```text
torch.Size([1])
```
Now load the malicious PT2:
```bash
python -c 'import torch; obj=torch.export.load("zero-payload-512m.pt2"); print(obj.state_dict["p"].shape)'
```
Expected result when no memory cap is applied:
```text
torch.Size([134217728])
```
That shape corresponds to a 512 MiB float32 tensor (134,217,728 elements at
4 bytes each), allocated while loading a 5,558-byte archive.

The included measurement script also demonstrates control/candidate separation
under address-space caps:
```bash
python mutate_and_measure_pt2_zero_payload.py
```
Observed locally:
- The control archive loads under 700 MiB, 900 MiB, and 1200 MiB address-space caps.
- The candidate (`zero-payload-512m.pt2`) fails under the 700 MiB and 900 MiB caps with:
  ```text
  DefaultCPUAllocator: can't allocate memory: you tried to allocate 536870912 bytes
  ```
- The candidate succeeds under the 1200 MiB cap and returns shape `(134217728,)`.
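For readers who want to reproduce the capped runs by hand, the following is a minimal sketch of one way to apply an address-space cap on Linux using `resource.setrlimit` with `RLIMIT_AS`. It is not taken from `mutate_and_measure_pt2_zero_payload.py`, whose internals are not shown here; `run_capped` and `LOAD_SNIPPET` are hypothetical names introduced for illustration.

```python
import resource
import subprocess
import sys

MIB = 1024 * 1024


def run_capped(code, cap_mib, *args):
    """Run `python -c code *args` in a child with a capped address space.

    Hypothetical helper, not part of the repository's script.
    Returns (returncode, stderr).
    """
    cap = cap_mib * MIB

    def apply_cap():
        # Runs in the child just before exec; limits total virtual memory.
        resource.setrlimit(resource.RLIMIT_AS, (cap, cap))

    proc = subprocess.run(
        [sys.executable, "-c", code, *args],
        preexec_fn=apply_cap,
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


# Snippet a child process could run to load a PT2 archive given as argv[1].
LOAD_SNIPPET = "import sys, torch; torch.export.load(sys.argv[1])"
```

Under these assumptions, `run_capped(LOAD_SNIPPET, 700, "zero-payload-512m.pt2")` would be expected to return a non-zero exit status with the allocator error in stderr, while the same call with a 1200 MiB cap would be expected to succeed. Exact thresholds depend on how much address space the interpreter and PyTorch themselves need on a given machine.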
## Files
- `control.pt2` - benign control archive, SHA256
`cb79c7913524f08255f74214f37b8bce500ac80b4bf6f2d6f3979116c42287c1`
- `zero-payload-512m.pt2` - malicious 5,558-byte PT2 archive, SHA256
`7e4c2c3ab37ac28f6ad4e307b77ae75fd2d655b15f9cbeb6832ac02702e2b18a`
- `mutate_and_measure_pt2_zero_payload.py` - generator/measurement script,
SHA256 `d1c3887e4f7d612c428c1a66c2ecada91d9b7bdcf42d17cc383303818b6f0690`
- `zero_payload_measurements_latest.json` - local measurement report,
SHA256 `33ee2b71e6964773228a693aba4696679167a49a9f806df844ec69bafe8ed311`
## Root Cause
In `torch/export/pt2_archive/_package.py`, the PT2 loader reads
`model_weights_config.json` and the referenced raw tensor record. For non-empty
records it validates byte alignment before mapping tensor storage. For empty
records, it logs that `torch.frombuffer()` cannot operate on empty bytes and
creates a zero tensor as a workaround:
```text
torch.zeros(size, dtype=dtype, device=device)
```
The `size` value comes from archive-controlled tensor metadata, so a tiny PT2
archive can force a large allocation during load.
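Because the fallback is triggered by an empty record, one rough triage step before loading an untrusted archive is to list its zero-length entries. The sketch below assumes the `.pt2` file is a standard zip container and uses only the Python standard library; `zero_length_entries` is a hypothetical helper, and the entry names in the demo are made up rather than the real PT2 layout.

```python
import io
import zipfile


def zero_length_entries(archive):
    """Return names of file entries whose uncompressed size is 0 bytes."""
    with zipfile.ZipFile(archive) as zf:
        return [
            info.filename
            for info in zf.infolist()
            if not info.is_dir() and info.file_size == 0
        ]


# Demo on an in-memory archive; these entry names are illustrative only.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("demo/weights_config.json", b"{}")
    zf.writestr("demo/weight_0", b"")  # empty payload record

demo.seek(0)
print(zero_length_entries(demo))  # ['demo/weight_0']
```

An empty entry is only a hint, since a tensor with zero elements legitimately has an empty payload. It becomes suspicious when the metadata declares a non-empty shape for the same record.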
## Suggested Fix
Reject zero-byte raw tensor payloads unless the declared tensor has zero
elements. The loader should verify that the raw archive record size matches the
declared dtype/shape storage requirement before allocating, and should not
synthesize attacker-sized tensors from empty records during deserialization.
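The size check described above can be sketched in a few lines. This is an illustration of the idea rather than a patch against PyTorch: `check_raw_record` is a hypothetical function, and it assumes a contiguous tensor with no storage offset, so the required byte count is simply the element count times the element size.

```python
import math


def check_raw_record(record_size, shape, itemsize):
    """Return the expected byte count, or raise ValueError on a mismatch.

    Hypothetical check; assumes a contiguous tensor with no storage offset.
    """
    if any(dim < 0 for dim in shape):
        raise ValueError(f"negative dimension in shape {shape}")
    expected = math.prod(shape) * itemsize
    if record_size != expected:
        raise ValueError(
            f"raw record has {record_size} bytes, metadata declares {expected}"
        )
    return expected


print(check_raw_record(4, [1], 4))  # 4: one float32 element, 4-byte record
print(check_raw_record(0, [0], 4))  # 0: empty tensor with an empty record
try:
    check_raw_record(0, [134217728], 4)  # the PoC's shape with an empty record
except ValueError as exc:
    print("rejected:", exc)
```

With a check like this run before any allocation, an empty record is accepted only when the declared tensor has zero elements, and the PoC's metadata is rejected without allocating memory.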