Rushit21/llamafile-gpu-injection-poc

	# llamafile GPU Source Injection PoC

	Proof of concept for an arbitrary code execution vulnerability in the llamafile format.

	## Vulnerability

	A malicious `.llamafile` can embed a modified `ggml-metal-device.m` (the Objective-C source of the
	Metal GPU backend). On any macOS machine with Metal GPU support (Apple Silicon, AMD, or Intel GPUs),
	this file is compiled and loaded when the llamafile is run with GPU acceleration.

	The injected `__attribute__((constructor))` function runs as soon as the compiled dylib is loaded,
	before any model inference, giving the attacker arbitrary code execution whenever a model is loaded
	with GPU acceleration.

	## Technical Details

	- Format: a `.llamafile` is a ZIP archive (an APE, Actually Portable Executable, polyglot) that contains source files
	- Target file: `llama.cpp/ggml/src/ggml-metal/ggml-metal-device.m`
	- Vector: `metal.c:BuildMetal()` extracts and compiles Metal sources via system `cc`
	- Trigger: running `./model.llamafile` on any macOS machine with a Metal-capable GPU
	- Impact: Arbitrary code execution as the user running llamafile
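
Because the file is a ZIP archive, its embedded sources can be listed without executing it. The following is a minimal inspection sketch, not part of the PoC: the function name `list_embedded_sources` and the extension list are illustrative assumptions, and it relies only on Python's standard `zipfile` module, which can read archives that have executable data prepended.

```python
import zipfile

# Assumption: extensions that count as "compilable source" for this check.
SOURCE_SUFFIXES = (".m", ".c", ".h", ".cpp", ".cu", ".metal")


def list_embedded_sources(path):
    """Return the names of ZIP members in `path` that look like source files.

    The file is only read, never executed.
    """
    # zipfile finds the central directory from the end of the file, so a
    # polyglot with an executable header in front of the ZIP data still opens.
    with zipfile.ZipFile(path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith(SOURCE_SUFFIXES)
        ]
```

Calling `list_embedded_sources("poc_gpu_inject_final_v2.llamafile")` should include the target file named above if the archive embeds it.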

	## Reproduction

	```bash
	chmod +x poc_gpu_inject_final_v2.llamafile
	rm -rf ~/.llamafile/ # clear cache to force re-extraction
	./poc_gpu_inject_final_v2.llamafile
	# Observe: /tmp/llamafile_gpu_poc is created
	ls /tmp/llamafile_gpu_poc
	```

	## Files

	- `poc_gpu_inject_final_v2.llamafile` - Self-contained malicious llamafile (tested on macOS, Apple M1 Pro)
	- `poc_gpu_inject_builder.py` - Script showing how the PoC was constructed

	## Notes

	The embedded `ggml-metal-device.m` consists of the original Metal source with a constructor prepended.
	The full original source is preserved so that the dylib still links and the model runs normally.
	No user interaction beyond running the file is required.
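
Since the injection takes the form of a constructor added to an embedded source file, a simple heuristic check is to scan embedded sources for the constructor attribute before running an untrusted llamafile. The sketch below is an assumption-laden illustration rather than a complete detector: `find_constructor_sources` is a hypothetical helper, it matches only the exact spelling `__attribute__((constructor))`, and legitimate sources may also use constructors, so a hit is a prompt for manual review and not proof of tampering.

```python
import zipfile

# Exact byte pattern to look for; spacing variants are NOT matched (assumption).
MARKER = b"__attribute__((constructor))"
# Assumption: source extensions worth scanning.
SOURCE_SUFFIXES = (".m", ".c", ".cpp")


def find_constructor_sources(path):
    """Return embedded source members of `path` that contain MARKER."""
    hits = []
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(SOURCE_SUFFIXES):
                continue
            # Read the member's bytes from the archive; nothing is executed.
            if MARKER in archive.read(name):
                hits.append(name)
    return hits
```

A stronger check would compare each embedded source against a known-good copy from the corresponding upstream release, which this sketch does not attempt.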