| # llamafile GPU Source Injection PoC |
|
|
| Proof-of-concept for a remote code execution vulnerability in the llamafile format. |
|
|
| ## Vulnerability |
|
|
| A malicious `.llamafile` can embed a modified `ggml-metal-device.m` (the Objective-C source |
| file for the Metal GPU backend). llamafile compiles and executes this file at run time on any |
| macOS machine with Metal GPU support (Apple Silicon, AMD, or Intel GPUs). |
|
|
| The injected `__attribute__((constructor))` function runs as soon as the compiled library is |
| loaded, before any model inference, giving the attacker arbitrary code execution whenever a |
| model is loaded with GPU acceleration. |
|
|
| ## Technical Details |
|
|
| - **Format**: `.llamafile` is a ZIP archive (APE polyglot) containing source files |
| - **Target file**: `llama.cpp/ggml/src/ggml-metal/ggml-metal-device.m` |
| - **Vector**: `metal.c:BuildMetal()` extracts and compiles Metal sources via system `cc` |
| - **Trigger**: Running `./model.llamafile` on any macOS machine with Metal GPU support |
| - **Impact**: Arbitrary code execution as the user running llamafile |
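
Because the format is a ZIP archive, the embedded sources can be listed without executing the file. The following is a minimal inspection sketch, not part of the PoC. It assumes Python's standard `zipfile` module can locate the archive behind the executable prefix, and the suffix list and the function name `list_embedded_sources` are illustrative choices rather than anything llamafile defines.

```python
import hashlib
import zipfile

# Suffixes treated as "source code" are an assumption made for illustration.
SOURCE_SUFFIXES = (".m", ".metal", ".c", ".h", ".cu")


def list_embedded_sources(archive):
    """Return sorted (name, sha256 hex digest) pairs for source-like ZIP members.

    `archive` may be a filesystem path or a binary file-like object.
    Nothing is extracted to disk and nothing is executed.
    """
    found = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            if info.filename.lower().endswith(SOURCE_SUFFIXES):
                digest = hashlib.sha256(zf.read(info)).hexdigest()
                found.append((info.filename, digest))
    return sorted(found)


if __name__ == "__main__":
    import sys

    # Usage: python inspect_llamafile.py path/to/file.llamafile
    for name, digest in list_embedded_sources(sys.argv[1]):
        print(f"{digest}  {name}")
```

The printed digests can be compared against a trusted copy of the same source tree. Any mismatch on a file that is compiled at run time, such as the target file listed above, deserves a closer look before the llamafile is executed.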
|
|
| ## Reproduction |
|
|
| ```bash |
| chmod +x poc_gpu_inject_final_v2.llamafile |
| rm -rf ~/.llamafile/ # clear cache to force re-extraction |
| ./poc_gpu_inject_final_v2.llamafile |
| # Observe: /tmp/llamafile_gpu_poc is created |
| ls /tmp/llamafile_gpu_poc |
| ``` |
|
|
| ## Files |
|
|
| - `poc_gpu_inject_final_v2.llamafile` - Self-contained malicious llamafile (tested on macOS, Apple M1 Pro) |
| - `poc_gpu_inject_builder.py` - Script showing how the PoC was constructed |
|
|
| ## Notes |
|
|
| The embedded `ggml-metal-device.m` is the original Metal source with a constructor prepended. |
| Because the full original source is preserved, the dylib still links and the model runs normally. |
| No user interaction beyond running the file is required. |
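
Since the original source is preserved in full after the prepended code, this kind of tampering can be recognized by comparing the embedded file with a trusted copy. The sketch below is a hedged illustration and not part of the PoC. It assumes the archive member name matches the target path given under Technical Details and that a known-good `ggml-metal-device.m` is available on disk. The names `classify_against_reference` and `check_archive` are hypothetical.

```python
import zipfile

# Member path taken from the "Target file" entry; assumed to match the archive layout.
TARGET = "llama.cpp/ggml/src/ggml-metal/ggml-metal-device.m"


def classify_against_reference(embedded: bytes, reference: bytes) -> str:
    """Compare embedded source bytes against a trusted reference copy."""
    if embedded == reference:
        return "identical"
    # Extra content placed in front of an otherwise intact original.
    if reference and embedded.endswith(reference):
        return "prepended"
    return "different"


def check_archive(archive, reference_path, member=TARGET) -> str:
    """Classify `member` inside `archive` against the file at `reference_path`.

    Raises KeyError if the archive has no member with that name.
    """
    with zipfile.ZipFile(archive) as zf:
        embedded = zf.read(member)
    with open(reference_path, "rb") as fh:
        reference = fh.read()
    return classify_against_reference(embedded, reference)
```

A result of `"prepended"` matches the pattern described in these notes. A result of `"different"` may only indicate a version mismatch between the embedded source and the reference copy, so it calls for a manual diff rather than a conclusion.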
|
|