	---
	library_name: pytorch
	tags:
	- security
	- model-format-vulnerability
	- torch-export
	- pt2
	- denial-of-service
	---

	# Torch Export zero-byte raw tensor allocation PoC

	This repository contains a minimal proof of concept for a Torch Export `.pt2`
	resource-exhaustion issue.

	## Summary

	Torch Export PT2 archives store raw tensor payload metadata separately from the
	archive record that contains the tensor bytes. When `torch.export.load()`
	encounters an empty raw tensor record, PyTorch treats the empty bytes as a
	special case and allocates a zero-filled tensor using the shape declared in
	`model_weights_config.json`.
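
The split between metadata and payload described above can be checked without invoking the loader. The following is a minimal sketch, assuming the `.pt2` file is a standard ZIP container that Python's `zipfile` module can read. The helper names `list_records` and `empty_records` are hypothetical, and no particular record naming scheme inside the archive is assumed.

```python
import zipfile


def list_records(archive) -> list[tuple[str, int]]:
    """Return (record name, uncompressed size in bytes) for every archive entry."""
    with zipfile.ZipFile(archive) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]


def empty_records(archive) -> list[str]:
    """Return the names of non-directory entries that hold zero bytes."""
    return [
        name
        for name, size in list_records(archive)
        if size == 0 and not name.endswith("/")
    ]
```

Calling `empty_records("zero-payload-512m.pt2")` should, under that assumption, name the raw tensor record whose bytes are missing, which can then be compared by hand against the shape declared in `model_weights_config.json`.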

	The included malicious archive is 5,558 bytes on disk and declares a float32
	tensor with shape `[134217728]`. Loading it causes PyTorch to allocate a
	512 MiB zero tensor during normal `torch.export.load()`.

	This PoC does not rely on pickle, AOTInductor native libraries, or model
	execution. It exercises only the raw tensor zero-byte fallback path.
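
The allocation size quoted above follows directly from the declared shape and dtype. As a quick check of that arithmetic (the variable names are illustrative and the figures are taken from this README):

```python
import math

FLOAT32_ITEMSIZE = 4          # bytes per float32 element
declared_shape = [134217728]  # shape declared in the archive metadata
archive_bytes = 5558          # size of the malicious archive on disk

numel = math.prod(declared_shape)
alloc_bytes = numel * FLOAT32_ITEMSIZE
alloc_mib = alloc_bytes / (1024 * 1024)
amplification = alloc_bytes / archive_bytes

print(f"{alloc_bytes} bytes = {alloc_mib:.0f} MiB, about {amplification:,.0f}x the archive size")
```

The requested allocation is therefore tens of thousands of times larger than the file that triggers it.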

	## Affected

	- PyTorch Torch Export PT2 loader
	- Source reviewed: `pytorch/pytorch` commit
	`c7656354cff2e2c4f9aee5695d3e7f37e3006dd4`
	- Runtime used for verification: `torch==2.12.1+cpu`

	## Reproduction

	Install PyTorch with Torch Export support, then load the benign control archive:

	```bash
	python -c 'import torch; obj=torch.export.load("control.pt2"); print(obj.state_dict["p"].shape)'
	```

	Expected control result:

	```text
	torch.Size([1])
	```

	Now load the malicious PT2:

	```bash
	python -c 'import torch; obj=torch.export.load("zero-payload-512m.pt2"); print(obj.state_dict["p"].shape)'
	```

	Expected result when no memory cap is in place:

	```text
	torch.Size([134217728])
	```

	A float32 tensor of that shape occupies 512 MiB (134,217,728 elements × 4
	bytes), allocated from a 5,558-byte archive.

	The included measurement script also loads both archives under several
	address-space caps, showing that the control (`control.pt2`) and the candidate
	(`zero-payload-512m.pt2`) behave differently:

	```bash
	python mutate_and_measure_pt2_zero_payload.py
	```

	Observed locally:

	- Control loads under 700 MiB, 900 MiB, and 1200 MiB address-space caps.
	- Candidate fails under 700 MiB and 900 MiB with:

	```text
	DefaultCPUAllocator: can't allocate memory: you tried to allocate 536870912 bytes
	```

	- Candidate succeeds under 1200 MiB and returns shape `(134217728,)`.
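
An address-space cap of this kind can be approximated with the standard `resource` module. The sketch below is not the included measurement script. It is a minimal illustration, assuming a Unix-like system where `RLIMIT_AS` is enforced, and the names `run_with_address_space_cap` and `LOAD_SNIPPET` are hypothetical.

```python
import subprocess
import sys


def run_with_address_space_cap(code: str, cap_mib: int) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter whose soft RLIMIT_AS is cap_mib MiB."""
    # The child lowers its own soft limit before executing the payload code.
    prelude = (
        "import resource\n"
        f"_cap = {int(cap_mib)} * 1024 * 1024\n"
        "_soft, _hard = resource.getrlimit(resource.RLIMIT_AS)\n"
        "resource.setrlimit(resource.RLIMIT_AS, (_cap, _hard))\n"
    )
    return subprocess.run(
        [sys.executable, "-c", prelude + code],
        capture_output=True,
        text=True,
    )


# Template for the load performed in the reproduction steps above.
LOAD_SNIPPET = (
    "import torch\n"
    "obj = torch.export.load({path!r})\n"
    "print(tuple(obj.state_dict['p'].shape))\n"
)
```

For example, `run_with_address_space_cap(LOAD_SNIPPET.format(path="zero-payload-512m.pt2"), 900)` returns a `CompletedProcess` whose `returncode` and `stderr` show whether the load hit the cap. Because the cap is set before `import torch`, PyTorch's own address-space use counts against it.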

	## Files

	- `control.pt2` - benign control archive, SHA256
	`cb79c7913524f08255f74214f37b8bce500ac80b4bf6f2d6f3979116c42287c1`
	- `zero-payload-512m.pt2` - malicious 5,558-byte PT2 archive, SHA256
	`7e4c2c3ab37ac28f6ad4e307b77ae75fd2d655b15f9cbeb6832ac02702e2b18a`
	- `mutate_and_measure_pt2_zero_payload.py` - generator/measurement script,
	SHA256 `d1c3887e4f7d612c428c1a66c2ecada91d9b7bdcf42d17cc383303818b6f0690`
	- `zero_payload_measurements_latest.json` - local measurement report,
	SHA256 `33ee2b71e6964773228a693aba4696679167a49a9f806df844ec69bafe8ed311`
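
Before loading anything, the downloaded files can be checked against the digests listed above. A minimal sketch using only `hashlib` follows. The helper names `sha256_of` and `verify` are hypothetical, and the digests are copied from this list.

```python
import hashlib
from pathlib import Path

# Digests as listed in this README.
EXPECTED_SHA256 = {
    "control.pt2": "cb79c7913524f08255f74214f37b8bce500ac80b4bf6f2d6f3979116c42287c1",
    "zero-payload-512m.pt2": "7e4c2c3ab37ac28f6ad4e307b77ae75fd2d655b15f9cbeb6832ac02702e2b18a",
    "mutate_and_measure_pt2_zero_payload.py": "d1c3887e4f7d612c428c1a66c2ecada91d9b7bdcf42d17cc383303818b6f0690",
    "zero_payload_measurements_latest.json": "33ee2b71e6964773228a693aba4696679167a49a9f806df844ec69bafe8ed311",
}


def sha256_of(path) -> str:
    """Hash a file in 1 MiB chunks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory=".") -> dict[str, bool]:
    """Map each listed file to True if present with a matching digest, else False."""
    results = {}
    for name, expected in EXPECTED_SHA256.items():
        path = Path(directory) / name
        results[name] = path.is_file() and sha256_of(path) == expected
    return results
```

Running `verify()` in the repository directory returns one boolean per file. Any `False` entry means the file is missing or differs from the version described here.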

	## Root Cause

	In `torch/export/pt2_archive/_package.py`, the PT2 loader reads
	`model_weights_config.json` and the referenced raw tensor record. For non-empty
	records it validates byte alignment before mapping tensor storage. For empty
	records, it logs that `torch.frombuffer()` cannot operate on empty bytes and
	creates a zero tensor as a workaround:

	```text
	torch.zeros(size, dtype=dtype, device=device)
	```

	The `size` value comes from archive-controlled tensor metadata, so a tiny PT2
	archive can force a large allocation during load.

	## Suggested Fix

	Reject zero-byte raw tensor payloads unless the declared tensor has zero
	elements. The loader should verify that the raw archive record size matches the
	declared dtype/shape storage requirement before allocating, and should not
	synthesize attacker-sized tensors from empty records during deserialization.
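
The size check suggested above could look roughly like the following. This is a sketch, not PyTorch code. It assumes contiguous tensors whose storage requirement is simply element count times element size, and the names `ITEMSIZE`, `PayloadSizeMismatch`, and `check_raw_record` are hypothetical.

```python
import math

# Bytes per element for a few dtypes, keyed by illustrative string names.
# Assumption: a real loader would consult its own dtype objects instead.
ITEMSIZE = {"float32": 4, "float16": 2, "int64": 8, "uint8": 1}


class PayloadSizeMismatch(ValueError):
    """Raised when a raw tensor record does not match its declared metadata."""


def check_raw_record(record_nbytes: int, shape: list[int], dtype: str) -> int:
    """Return the required byte count, or raise if the record size differs."""
    if dtype not in ITEMSIZE:
        raise PayloadSizeMismatch(f"unsupported dtype: {dtype!r}")
    if any(dim < 0 for dim in shape):
        raise PayloadSizeMismatch(f"negative dimension in shape {shape}")
    required = math.prod(shape) * ITEMSIZE[dtype]
    if record_nbytes != required:
        raise PayloadSizeMismatch(
            f"record holds {record_nbytes} bytes, metadata requires {required}"
        )
    return required
```

With a check like this, an empty record is accepted only when the declared shape has zero elements, so `check_raw_record(0, [134217728], "float32")` raises before any allocation is attempted, while `check_raw_record(0, [0], "float32")` passes.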