# NumPy np.load accepts .npz duplicate array names and resolves effective array without load-time warning
## Overview
A `.npz` file can contain duplicate logical array names, such as two `kernel.npy` members.
`np.load()` accepts this archive without a load-time warning. The `NpzFile.files` list
exposes the duplicate logical names (`['kernel', 'kernel']`), but key lookup through
`npz['kernel']` silently resolves to one effective member (the later ZIP entry)
without a duplicate-resolution warning.
## Environment
Python 3.10+, numpy >= 1.24.0
```bash
pip install -r requirements.txt
```
## Steps
### Step 1: Create .npz files
```bash
python3 create_npz.py .
```
Creates `benign_model.npz` (single `kernel.npy = [[1.0]]`) and `poc_model.npz`
(duplicate `kernel.npy` entries: first `[[1.0]]`, second `[[999.0]]`).
A creation-time warning is emitted by Python's `zipfile` module:
```
creation_warning: Duplicate name: 'kernel.npy'
```
This warning is raised while the archive is being written, so only the archive's creator
sees it. A consumer who loads a pre-existing file never sees it.
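The creation-time warning can be reproduced with the standard library alone. The following is a minimal sketch, not the contents of `create_npz.py`: the function name `write_duplicate_members` is hypothetical, the payloads are placeholder bytes rather than real `.npy` data, and the archive is kept in memory.

```python
import io
import warnings
import zipfile


def write_duplicate_members(name: str = "kernel.npy") -> list[str]:
    """Write two members with the same name; return the captured warning texts."""
    buffer = io.BytesIO()  # in-memory archive, nothing is written to disk
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        with zipfile.ZipFile(buffer, "w") as zf:
            zf.writestr(name, b"first payload")   # placeholder, not a real .npy
            zf.writestr(name, b"second payload")  # same name: zipfile warns here
    return [str(w.message) for w in caught]


messages = write_duplicate_members()
for message in messages:
    print(f"creation_warning: {message}")
```

The warning is attached to the write call, which is why it never reaches a process that only reads the finished file.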
### Step 2: Inspect duplicate logical array declarations
```bash
python3 inspect_npz.py poc_model.npz
```
Expected output:
```
NPZ_FILES=['kernel', 'kernel']
DUPLICATE_LOGICAL_NAME=kernel
WARNING_ON_NPZFILE_OPEN=False
ZIPINFO_COUNT_KERNEL=2
INSPECTOR_FIRST_MEMBER_OUTPUT=1.0
```
`np.load().files` exposes duplicate logical array names. `np.load()` opens the archive
without warning on the duplicate. Reading the first ZIP member by position returns `1.0`.
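The ZIP-level part of this inspection can be sketched with `zipfile` only. This is an illustration under stated assumptions, not the contents of `inspect_npz.py`: `build_archive` and `inspect_members` are hypothetical helpers, and the member payloads are placeholder bytes instead of serialized arrays.

```python
import io
import warnings
import zipfile
from collections import Counter


def build_archive() -> bytes:
    """Build an in-memory ZIP with two members named 'kernel.npy'."""
    buffer = io.BytesIO()
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # silence the creation-time warning
        with zipfile.ZipFile(buffer, "w") as zf:
            zf.writestr("kernel.npy", b"earlier")  # placeholder payloads
            zf.writestr("kernel.npy", b"later")
    return buffer.getvalue()


def inspect_members(data: bytes) -> dict:
    """List member names, find repeats, and read the first member by position."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()  # keeps both entries, in archive order
        counts = Counter(names)
        first_info = zf.infolist()[0]  # ZipInfo of the earlier entry
        first_payload = zf.read(first_info)  # read via ZipInfo, not via name
    return {
        "names": names,
        "duplicates": sorted(n for n, c in counts.items() if c > 1),
        "kernel_count": counts["kernel.npy"],
        "first_payload": first_payload,
    }


report = inspect_members(build_archive())
print(report)
```

Passing the `ZipInfo` object to `read()` bypasses the name lookup, which is how the earlier entry stays reachable even though its name is shadowed.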
### Step 3: Key lookup resolves the effective member without warning
```bash
python3 reproduce.py poc_model.npz
```
Expected output:
```
RUNTIME_OUTPUT=999.0
WARNING_EMITTED_DURING_LOAD=False
OUTPUT_FLIP_CONFIRMED=true
```
`np.load(path)['kernel']` returns `999.0`, the effective member. `zipfile` keeps a
name-keyed `NameToInfo` dict, so the later entry overwrites the earlier one (last write
wins). No warning is emitted.
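The key lookup can be sketched end to end without touching disk. This is not the contents of `reproduce.py`: the helper names `npy_bytes`, `build_duplicate_npz`, and `load_kernel` are hypothetical, and the archive is built in memory with the same two values described in Step 1.

```python
import io
import warnings
import zipfile

import numpy as np


def npy_bytes(array: np.ndarray) -> bytes:
    """Serialize one array to .npy bytes."""
    buffer = io.BytesIO()
    np.save(buffer, array)
    return buffer.getvalue()


def build_duplicate_npz() -> io.BytesIO:
    """Build an in-memory .npz with two 'kernel.npy' members."""
    buffer = io.BytesIO()
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # silence the creation-time warning
        with zipfile.ZipFile(buffer, "w") as zf:
            zf.writestr("kernel.npy", npy_bytes(np.array([[1.0]])))
            zf.writestr("kernel.npy", npy_bytes(np.array([[999.0]])))
    buffer.seek(0)  # rewind so np.load starts at the ZIP signature
    return buffer


def load_kernel(archive):
    """Load 'kernel' by key and record any warnings raised while loading."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        with np.load(archive) as npz:
            files = list(npz.files)
            value = float(npz["kernel"][0, 0])
    return files, value, [str(w.message) for w in caught]


files, value, load_messages = load_kernel(build_duplicate_npz())
print(f"NPZ_FILES={files}")
print(f"RUNTIME_OUTPUT={value}")
print(f"WARNING_EMITTED_DURING_LOAD={bool(load_messages)}")
```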
## Result Summary
| Access path | Output | Warning |
|---|---|---|
| `np.load().files` | `['kernel', 'kernel']` (both visible) | n/a |
| `np.load(path)['kernel']` (key) | **999.0** (effective) | None |
| `zipfile.ZipFile(path).infolist()[0]` (position) | 1.0 (earlier entry) | n/a |
The gap: `np.load()` exposes duplicate logical array names via `NpzFile.files` but
resolves key lookup to one effective member without a load-time duplicate-resolution warning.
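One way a consumer could close this gap on its own side is to check member names before calling `np.load()`. The sketch below is an assumption about what such a check might look like, not part of NumPy or of this repository: `find_duplicate_members` and `require_unique_members` are hypothetical names, and the check only catches exact repeats of the raw member name.

```python
import io
import warnings
import zipfile
from collections import Counter


def find_duplicate_members(archive) -> list[str]:
    """Return member names that occur more than once.

    `archive` may be a path or a seekable binary file object.
    """
    with zipfile.ZipFile(archive) as zf:
        counts = Counter(zf.namelist())
    return sorted(name for name, count in counts.items() if count > 1)


def require_unique_members(archive) -> None:
    """Raise ValueError if any member name is repeated (hypothetical policy)."""
    duplicates = find_duplicate_members(archive)
    if duplicates:
        raise ValueError(f"duplicate archive members: {duplicates}")


# Demo on an in-memory archive with placeholder payloads.
demo = io.BytesIO()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the creation-time warning
    with zipfile.ZipFile(demo, "w") as zf:
        zf.writestr("kernel.npy", b"earlier")
        zf.writestr("kernel.npy", b"later")
demo_bytes = demo.getvalue()

found = find_duplicate_members(io.BytesIO(demo_bytes))
print(f"DUPLICATE_MEMBERS={found}")
```

When a file object is passed instead of a path, it should be rewound with `seek(0)` before it is handed to `np.load()`, since the check leaves the read position elsewhere in the stream.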
## File Hashes (SHA-256)
```
4abbf240a83075fb3e5141225df1678a9409e8c47d998692f0c98248c423a03f create_npz.py
ee794da965c7aa6f1415c5e61f2001c5eae661f4bdee90c30705461c3ecf3448 inspect_npz.py
90126d16749d7ebd7492b81a44a315c63349343ae35bd0ddd9cf7b504202eec2 reproduce.py
08a40c0c85f36c57fe1ab1642855cff34f3e0c40338c4c8de533600a353ff643 requirements.txt
d94cb2768a7e9e3b95f5e6c393137a3418f8bb396db13d6c88cfb7550278f0e2 expected_output.txt
e4714af77fe60aa7a7e1d67dff6b14c07cefd41e08f435e636b1f045cc13c92d poc_model.npz
d4b38091fb654b85f300e2816be44672413f8a430aa854c27e98377dc25bf854 benign_model.npz
```
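The listed digests can be checked with `hashlib`. This is a minimal sketch: `sha256_of` and `verify` are hypothetical helpers, `EXPECTED` copies only two entries from the list above, and the files are assumed to sit in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 16) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(expected: dict[str, str], directory: Path) -> dict[str, str]:
    """Map each file name to 'ok', 'mismatch', or 'missing'."""
    status = {}
    for name, want in expected.items():
        path = directory / name
        if not path.is_file():
            status[name] = "missing"
        elif sha256_of(path) == want:
            status[name] = "ok"
        else:
            status[name] = "mismatch"
    return status


# Two entries copied from the list above; extend with the remaining files as needed.
EXPECTED = {
    "poc_model.npz": "e4714af77fe60aa7a7e1d67dff6b14c07cefd41e08f435e636b1f045cc13c92d",
    "benign_model.npz": "d4b38091fb654b85f300e2816be44672413f8a430aa854c27e98377dc25bf854",
}

for name, state in verify(EXPECTED, Path(".")).items():
    print(f"{name}: {state}")
```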