| # NumPy np.load accepts .npz duplicate array names and resolves effective array without load-time warning |
|
|
| ## Overview |
|
|
| A `.npz` file can contain duplicate logical array names, such as two `kernel.npy` members. |
| `np.load()` accepts this archive without a load-time warning. The `NpzFile.files` list |
| exposes the duplicate logical names (`['kernel', 'kernel']`), but key lookup through |
| `npz['kernel']` silently resolves to one effective member (the later ZIP entry) |
| without a duplicate-resolution warning. |
|
|
| ## Environment |
|
|
| Python 3.10+, numpy >= 1.24.0 |
|
|
| ```bash |
| pip install -r requirements.txt |
| ``` |
|
|
| ## Steps |
|
|
| ### Step 1: Create .npz files |
|
|
| ```bash |
| python3 create_npz.py . |
| ``` |
|
|
| Creates `benign_model.npz` (single `kernel.npy = [[1.0]]`) and `poc_model.npz` |
| (duplicate `kernel.npy` entries: first `[[1.0]]`, second `[[999.0]]`). |
|
|
| A creation-time warning is emitted by Python's `zipfile` module: |
| ``` |
| creation_warning: Duplicate name: 'kernel.npy' |
| ``` |
| This warning is visible to the archive creator only. A consumer loading a pre-existing |
| file does not see this warning. |
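
The contents of `create_npz.py` are not shown here, so the following is only a minimal sketch of one way such an archive could be produced, assuming the duplicate is written directly with `zipfile.ZipFile.writestr`. The helper names `npy_bytes` and `write_duplicate_npz` are hypothetical and used for illustration only.

```python
import io
import os
import tempfile
import warnings
import zipfile

import numpy as np


def npy_bytes(array):
    """Serialize an array to the .npy byte format in memory (hypothetical helper)."""
    buffer = io.BytesIO()
    np.save(buffer, array)
    return buffer.getvalue()


def write_duplicate_npz(path):
    """Write two members named 'kernel.npy' and return any warning messages raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        with zipfile.ZipFile(path, "w") as archive:
            archive.writestr("kernel.npy", npy_bytes(np.array([[1.0]])))
            archive.writestr("kernel.npy", npy_bytes(np.array([[999.0]])))
    return [str(item.message) for item in caught]


with tempfile.TemporaryDirectory() as workdir:
    messages = write_duplicate_npz(os.path.join(workdir, "poc_model.npz"))
    for message in messages:
        print("creation_warning:", message)
```

The warning is raised inside the writer's process at write time and is not stored in the archive, which is why a later consumer has no way to see it.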
|
|
| ### Step 2: Inspect duplicate logical array declarations |
|
|
| ```bash |
| python3 inspect_npz.py poc_model.npz |
| ``` |
|
|
| Expected output: |
| ``` |
| NPZ_FILES=['kernel', 'kernel'] |
| DUPLICATE_LOGICAL_NAME=kernel |
| WARNING_ON_NPZFILE_OPEN=False |
| ZIPINFO_COUNT_KERNEL=2 |
| INSPECTOR_FIRST_MEMBER_OUTPUT=1.0 |
| ``` |
|
|
| `np.load().files` exposes duplicate logical array names. `np.load()` opens the archive |
| without warning on the duplicate. Reading the first ZIP member by position returns `1.0`. |
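
The checks described above can be approximated as follows. This is a hedged sketch rather than the actual `inspect_npz.py`: it builds the archive in memory (an assumption made to keep the example self-contained), and `build_duplicate_archive` is a hypothetical helper. It relies on `ZipFile.read` accepting a `ZipInfo` object, which reads that exact entry instead of looking it up by name.

```python
import io
import warnings
import zipfile

import numpy as np


def build_duplicate_archive():
    """Build an in-memory .npz with two 'kernel.npy' members (1.0 first, 999.0 second)."""
    raw = io.BytesIO()
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # hide zipfile's creation-time warning
        with zipfile.ZipFile(raw, "w") as archive:
            for value in (1.0, 999.0):
                member = io.BytesIO()
                np.save(member, np.array([[value]]))
                archive.writestr("kernel.npy", member.getvalue())
    raw.seek(0)
    return raw


raw = build_duplicate_archive()

# Record every warning raised while NumPy opens the archive.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with np.load(raw) as npz:
        logical_names = list(npz.files)
duplicate_warnings = [w for w in caught if "uplicate" in str(w.message)]

raw.seek(0)
with zipfile.ZipFile(raw) as archive:
    entries = [info for info in archive.infolist() if info.filename == "kernel.npy"]
    # Passing the ZipInfo object (not the name) reads the first entry by position.
    first_member = np.load(io.BytesIO(archive.read(entries[0])))

print("NPZ_FILES=%s" % logical_names)
print("WARNING_ON_NPZFILE_OPEN=%s" % bool(duplicate_warnings))
print("ZIPINFO_COUNT_KERNEL=%d" % len(entries))
print("INSPECTOR_FIRST_MEMBER_OUTPUT=%s" % first_member[0, 0])
```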
|
|
| ### Step 3: Key lookup: effective member resolved without warning |
|
|
| ```bash |
| python3 reproduce.py poc_model.npz |
| ``` |
|
|
| Expected output: |
| ``` |
| RUNTIME_OUTPUT=999.0 |
| WARNING_EMITTED_DURING_LOAD=False |
| OUTPUT_FLIP_CONFIRMED=true |
| ``` |
|
|
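| `np.load(path)['kernel']` returns `999.0`, the value of the later ZIP entry. `zipfile` resolves |
| names through its `NameToInfo` mapping, where a later entry with the same name overwrites the |
| earlier one (last write wins). No warning is emitted. |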
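
A minimal sketch of the key lookup, under the same assumptions as the description above. It is not the actual `reproduce.py`; the archive is built in memory so the example runs on its own, and the variable names are illustrative.

```python
import io
import warnings
import zipfile

import numpy as np

# Build an in-memory archive with two 'kernel.npy' members: 1.0 first, 999.0 second.
raw = io.BytesIO()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide zipfile's creation-time warning
    with zipfile.ZipFile(raw, "w") as archive:
        for value in (1.0, 999.0):
            member = io.BytesIO()
            np.save(member, np.array([[value]]))
            archive.writestr("kernel.npy", member.getvalue())
raw.seek(0)

# Load the archive and look the array up by its logical name, recording warnings.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    with np.load(raw) as npz:
        kernel = npz["kernel"]
duplicate_warnings = [w for w in caught if "uplicate" in str(w.message)]

runtime_output = float(kernel[0, 0])
print("RUNTIME_OUTPUT=%s" % runtime_output)
print("WARNING_EMITTED_DURING_LOAD=%s" % bool(duplicate_warnings))
```

Lookup by name goes through the ZIP name mapping, so the value written second is the one returned, while the first entry stays reachable only by position.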
|
|
| ## Result Summary |
|
|
| | | Output | Warning | |
| |---|---|---| |
| | `np.load().files` | `['kernel', 'kernel']` (both visible) | N/A | |
| | `np.load(path)['kernel']` (key) | **999.0** (effective) | None | |
| | `ZipFile.infolist()[0]` (position) | 1.0 (earlier entry) | N/A | |
|
|
| The gap: `np.load()` exposes duplicate logical array names via `NpzFile.files` but |
| resolves key lookup to one effective member without a load-time duplicate-resolution warning. |
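
Because the duplicate names are visible in the archive listing, a consumer could check for them before trusting a key lookup. The following is an illustrative sketch only, not part of this PoC or of NumPy: `duplicate_member_names` is a hypothetical helper, and the demo archive uses placeholder payloads because only the member names matter for the check.

```python
import collections
import io
import warnings
import zipfile


def duplicate_member_names(source):
    """Return the member names that occur more than once in a ZIP/.npz archive."""
    with zipfile.ZipFile(source) as archive:
        counts = collections.Counter(archive.namelist())
    return sorted(name for name, count in counts.items() if count > 1)


# Demo archive with placeholder payloads (not valid .npy data).
demo = io.BytesIO()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide zipfile's creation-time warning
    with zipfile.ZipFile(demo, "w") as archive:
        archive.writestr("kernel.npy", b"first")
        archive.writestr("kernel.npy", b"second")
        archive.writestr("bias.npy", b"only")
demo.seek(0)

duplicates = duplicate_member_names(demo)
print("DUPLICATE_MEMBERS=%s" % duplicates)
```

`source` may be a filesystem path or a seekable file-like object, since `zipfile.ZipFile` accepts both.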
|
|
| ## File Hashes (SHA-256) |
|
|
| ``` |
| 4abbf240a83075fb3e5141225df1678a9409e8c47d998692f0c98248c423a03f create_npz.py |
| ee794da965c7aa6f1415c5e61f2001c5eae661f4bdee90c30705461c3ecf3448 inspect_npz.py |
| 90126d16749d7ebd7492b81a44a315c63349343ae35bd0ddd9cf7b504202eec2 reproduce.py |
| 08a40c0c85f36c57fe1ab1642855cff34f3e0c40338c4c8de533600a353ff643 requirements.txt |
| d94cb2768a7e9e3b95f5e6c393137a3418f8bb396db13d6c88cfb7550278f0e2 expected_output.txt |
| e4714af77fe60aa7a7e1d67dff6b14c07cefd41e08f435e636b1f045cc13c92d poc_model.npz |
| d4b38091fb654b85f300e2816be44672413f8a430aa854c27e98377dc25bf854 benign_model.npz |
| ``` |
|
|