metadata
library_name: gguf
tags:
  - gguf
  - llama-cpp
  - security
  - proof-of-concept
  - parser-divergence
  - model-integrity
  - huntr
  - protectai
license: other

GGUF-PY-F004 Tensor Offset Aliasing

Payload repository for Huntr / ProtectAI triage.

Finding

Parser divergence: the Python GGUFReader accepts non-sequential tensor offsets, which causes silent tensor data aliasing, while native llama.cpp rejects the same GGUF file.

Primary PoC Model

poc_GGUF-PY-F004_alias.gguf

Primary PoC SHA256

484eb1c0a3583b7af469d977d33fd4e39d6ae3b92897d1f6382454b9a7daf9de
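Before reproducing, it may help to confirm that the downloaded PoC matches the digest above. The following is a minimal sketch using only the Python standard library; the helper names (sha256_of_file, matches_published_digest) and the assumed local file location are illustrative and not part of this repository.

```python
import hashlib
from pathlib import Path

# Digest published in this README for poc_GGUF-PY-F004_alias.gguf.
EXPECTED_SHA256 = "484eb1c0a3583b7af469d977d33fd4e39d6ae3b92897d1f6382454b9a7daf9de"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected hex string."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumption: the PoC was downloaded into the current directory.
    poc = Path("poc_GGUF-PY-F004_alias.gguf")
    if poc.exists():
        print("digest matches:", matches_published_digest(poc))
    else:
        print("PoC file not found; download it first")
```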

Proof Script

prove_f004_live_repo.py

Confirmed Behavior

The crafted GGUF file contains two tensors:

  • tensor_a
  • tensor_b

Both tensors declare the same tensor data offset:

tensor_a: data_offset = 0
tensor_b: data_offset = 0

Python GGUFReader accepts the file and loads both tensor names, but both tensors read from the same underlying bytes.

Observed Python result:

tensor_a_actual = [1.100000023841858, 2.200000047683716, 3.299999952316284, 4.400000095367432]
tensor_b_actual = [1.100000023841858, 2.200000047683716, 3.299999952316284, 4.400000095367432]
tensor_b_expected = [5.5, 6.6, 7.7, 8.8]
alias_confirmed = True
tensor_b_wrong_data = True
MARKER_GGUF_PY_F004_ALIAS_CONFIRMED_LIVE_REPO
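
The aliasing effect can be illustrated without the gguf package. The sketch below builds a hypothetical tensor data region (tensor_a at offset 0, tensor_b at offset 32) and then reads both tensors through the declared offset 0, as a reader that trusts the declared offsets would. The layout and the 32-byte alignment are assumptions chosen to be consistent with the "expected 32" message in the native log; they are not extracted from the PoC file itself.

```python
import struct

ALIGNMENT = 32  # assumption: consistent with "expected 32" in the native log


def pad(n, align=ALIGNMENT):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


tensor_a_expected = [1.1, 2.2, 3.3, 4.4]
tensor_b_expected = [5.5, 6.6, 7.7, 8.8]

# Hypothetical data region: tensor_a, zero padding up to 32 bytes, tensor_b.
blob_a = struct.pack("<4f", *tensor_a_expected)
blob_b = struct.pack("<4f", *tensor_b_expected)
data = blob_a + b"\x00" * (pad(len(blob_a)) - len(blob_a)) + blob_b


def read_f32(buf, offset, count):
    """Read `count` little-endian float32 values starting at `offset`."""
    return list(struct.unpack_from(f"<{count}f", buf, offset))


# Declared offsets as in the PoC: both tensors point at offset 0.
declared = {"tensor_a": 0, "tensor_b": 0}
tensor_a_actual = read_f32(data, declared["tensor_a"], 4)
tensor_b_actual = read_f32(data, declared["tensor_b"], 4)

print("tensor_a_actual =", tensor_a_actual)
print("tensor_b_actual =", tensor_b_actual)
print("alias_confirmed =", tensor_a_actual == tensor_b_actual)
```

The bytes written for tensor_b are still present at offset 32 in this sketch, but nothing reads them, because the declared offset is followed without any cross-check against the preceding tensor's size.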

Native llama-gguf rejects the same file:

gguf_init_from_file_ptr: tensor 'tensor_b' has offset 0, expected 32
gguf_init_from_file_ptr: failed to read tensor data
EXIT_CODE=134

Impact

This demonstrates a parser divergence and model-file data integrity failure.

A Python-based GGUF scanner, converter, validator, or ingestion pipeline using GGUFReader can accept a malformed GGUF file that native llama.cpp rejects, while silently mapping one tensor name to another tensor's data.

This is not a code execution claim. The confirmed impact is silent tensor data aliasing / model integrity failure in Python GGUF processing.
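
A Python pipeline could apply a sequential-offset check of its own before trusting tensor data. The sketch below is modeled on the native error message shown above, not copied from the llama.cpp source. TensorInfo and check_sequential_offsets are hypothetical names, offsets are assumed to be relative to the start of the tensor data region, and the 32-byte alignment is an assumption; how the tuples are populated from a real file is left out.

```python
from typing import List, NamedTuple


class TensorInfo(NamedTuple):
    """Hypothetical record for one tensor's declared layout."""
    name: str
    n_bytes: int
    data_offset: int  # assumed relative to the start of the tensor data region


def pad(n: int, align: int) -> int:
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def check_sequential_offsets(tensors: List[TensorInfo], alignment: int = 32) -> List[str]:
    """Return one error string per tensor whose declared offset is not the
    running sum of the padded sizes of the tensors before it."""
    errors = []
    expected = 0
    for t in tensors:
        if t.data_offset != expected:
            errors.append(
                f"tensor '{t.name}' has offset {t.data_offset}, expected {expected}"
            )
        expected += pad(t.n_bytes, alignment)
    return errors


if __name__ == "__main__":
    # Layout described in this README: two 4-element float32 tensors, both at offset 0.
    poc_layout = [
        TensorInfo("tensor_a", 16, 0),
        TensorInfo("tensor_b", 16, 0),
    ]
    for line in check_sequential_offsets(poc_layout):
        print(line)
```

An empty result means every declared offset matches the running total. Rejecting the file on any reported error would bring the Python side in line with the native behavior described above.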

Key Evidence Files

PYTHON/PYTHON_LIVE_REPO_OUTPUT.txt
NATIVE/NATIVE_LIVE_REPO_OUTPUT.txt
RAW_OUTPUT/final_repro_output.txt
RAW_OUTPUT/key_markers.txt
SOURCE/gguf_reader_tensor_offset_excerpt.txt
SOURCE/gguf_cpp_tensor_offset_validation_excerpt.txt
SOURCE/live_repo_status_and_diff_check.txt
ENVIRONMENT/ENVIRONMENT.txt
ENVIRONMENT/ENVIRONMENT_LIVE_REPO_CONFIRMED.txt
SHA256SUMS.txt

Scope

Confirmed against:

Repository: ggerganov/llama.cpp
Commit: a290ce626663dae1d54f70bce3ca6d8f67aab62f
Native version: 9046 (a290ce626)
Python component: gguf-py/gguf/gguf_reader.py
Affected function: GGUFReader._build_tensors()