Initial upload: 4-cond DD-PPO PointNav checkpoints + README
- README.md +66 -0
- blind/ckpt.10.pth +3 -0
- blind/ckpt.20.pth +3 -0
- blind/ckpt.25.pth +3 -0
- blind/ckpt.30.pth +3 -0
- blind/ckpt.34.pth +3 -0
- blind/ckpt.5.pth +3 -0
- coarse/ckpt.10.pth +3 -0
- coarse/ckpt.20.pth +3 -0
- coarse/ckpt.30.pth +3 -0
- coarse/ckpt.40.pth +3 -0
- coarse/ckpt.49.pth +3 -0
- foveated/ckpt.10.pth +3 -0
- foveated/ckpt.20.pth +3 -0
- foveated/ckpt.30.pth +3 -0
- foveated/ckpt.36.pth +3 -0
- uniform/ckpt.10.pth +3 -0
- uniform/ckpt.20.pth +3 -0
- uniform/ckpt.30.pth +3 -0
- uniform/ckpt.40.pth +3 -0
- uniform/ckpt.49.pth +3 -0
README.md
ADDED
@@ -0,0 +1,66 @@
# Pre-trained checkpoints: *How sensor structure shapes the format of spatial memory in navigation agents*

DD-PPO PointNav agents trained in [Habitat](https://github.com/facebookresearch/habitat-lab) on a merged Gibson-0+ and Matterport3D-train pool (472 scenes). The agents differ in a single component: the visual sensor.

The four conditions span a spectrum of encoder spatial-feature resolution:

| Condition | Visual input | ResNet-18 spatial output | Frames | Final checkpoint |
|---|---|---|---|---|
| `blind` | None | (no encoder) | 340M | `blind/ckpt.34.pth` |
| `coarse` | 48×48 RGB (uniform) | 1×1 feature map | 250M | `coarse/ckpt.49.pth` |
| `foveated` | 256×256 with eccentricity-dependent Gaussian blur ($\sigma_{\max}=8$, gaze fixed at image centre) | 8×8 feature map | 180M† | `foveated/ckpt.36.pth` |
| `uniform` | 256×256 RGB (no blur) | 8×8 feature map | 250M | `uniform/ckpt.49.pth` |

†`foveated` converged by 180M frames (success 0.94, comparable to the other sighted conditions at 200M). Past this point training diverged with a $\sqrt{0}$-gradient instability, so `ckpt.36` is the final checkpoint; its weights are NaN-free.
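
To make the `foveated` input concrete, here is a minimal sketch of a per-pixel blur-strength map for a 256×256 frame with the gaze at the image centre. The README only gives $\sigma_{\max}=8$ and the fixed gaze; the linear ramp from centre to corner and the name `sigma_map` are assumptions for illustration, not the training code.

```python
import math


def sigma_map(height=256, width=256, sigma_max=8.0):
    """Return a height x width grid of Gaussian-blur sigmas.

    ASSUMPTION: sigma grows linearly with eccentricity (distance from the
    image centre) and reaches sigma_max at the corners. The actual
    eccentricity-to-sigma mapping used for training may differ.
    """
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    r_max = math.hypot(cy, cx)  # eccentricity of the corner pixels
    return [
        [sigma_max * math.hypot(y - cy, x - cx) / r_max for x in range(width)]
        for y in range(height)
    ]


sigmas = sigma_map()
print("corner sigma:", sigmas[0][0])
print("near-centre sigma:", sigmas[128][128])
```

Under this assumption, pixels near the gaze point are left almost sharp while the periphery receives the strongest blur.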

## Intermediate checkpoints

Intermediate checkpoints are included for reproducing the cross-training substitution figure (§4.2 of the paper):

- `blind/`: `ckpt.{5, 10, 20, 25, 30, 34}.pth` (50M, 100M, 200M, 250M, 300M, 340M frames)
- `coarse/`: `ckpt.{10, 20, 30, 40, 49}.pth` (50M, 100M, 150M, 200M, 245M frames)
- `foveated/`: `ckpt.{10, 20, 30, 36}.pth` (50M, 100M, 150M, 180M frames)
- `uniform/`: `ckpt.{10, 20, 30, 40, 49}.pth` (50M, 100M, 150M, 200M, 245M frames)

Frame counts are recorded in each `.pth`'s `extra_state['step']` field. To convert a checkpoint index to frames, use 5M frames per checkpoint for the sighted conditions and 10M frames per checkpoint for `blind`.
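
The index-to-frames conversion above can be sketched as a small helper. The names `FRAMES_PER_CKPT` and `approx_frames` are hypothetical; the value stored in `extra_state['step']` remains the authoritative count.

```python
import re

# Frames between consecutive checkpoint indices, per condition (from the text above).
FRAMES_PER_CKPT = {
    "blind": 10_000_000,
    "coarse": 5_000_000,
    "foveated": 5_000_000,
    "uniform": 5_000_000,
}

# Matches repo-relative paths such as "foveated/ckpt.36.pth".
_CKPT_RE = re.compile(r"(?P<cond>[a-z]+)/ckpt\.(?P<idx>\d+)\.pth$")


def approx_frames(path: str) -> int:
    """Approximate training frames for a checkpoint path like 'blind/ckpt.34.pth'."""
    match = _CKPT_RE.search(path)
    if match is None:
        raise ValueError(f"not a checkpoint path: {path!r}")
    return FRAMES_PER_CKPT[match["cond"]] * int(match["idx"])


for p in ("blind/ckpt.34.pth", "coarse/ckpt.49.pth", "foveated/ckpt.36.pth"):
    print(p, approx_frames(p))
```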

## Architecture

All four conditions share an identical backbone:

- **Recurrent module**: 3-layer LSTM, hidden dim 512
- **Non-visual sensor stack**: GPS, compass, distance-to-goal, goal-in-start-frame; each projected to 32-d and concatenated with a 32-d previous-action embedding (Wijmans et al. 2023 protocol)
- **Visual encoder** (where present): ResNet-18, output flattened
- **Policy / value heads**: linear projections from $\mathbf{h}_2$
- **Algorithm**: DD-PPO (Wijmans et al. 2020)
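
As a quick worked check of the non-visual stack described above, the width of the concatenated embedding follows directly from the listed components. The identifiers below are illustrative labels rather than names from the training code, and treating this vector as the entire recurrent input for `blind` is an assumption.

```python
# Non-visual sensors listed above; each is projected to EMBED_DIM features.
NON_VISUAL_SENSORS = ("gps", "compass", "distance_to_goal", "goal_in_start_frame")
EMBED_DIM = 32          # per-sensor projection width
PREV_ACTION_DIM = 32    # previous-action embedding width


def non_visual_width() -> int:
    """Width of the concatenated non-visual embedding."""
    return EMBED_DIM * len(NON_VISUAL_SENSORS) + PREV_ACTION_DIM


print(non_visual_width())  # 4 * 32 + 32
```

For the sighted conditions the flattened ResNet-18 output would be concatenated on top of this vector; its width depends on encoder details not stated here.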

## Loading

```python
import torch
# Registry used to look up policy classes when rebuilding the agent (unused below).
from habitat_baselines.common.baseline_registry import baseline_registry

# weights_only=False: the checkpoint holds a config object next to the tensors,
# so only load files from a source you trust.
ckpt = torch.load("blind/ckpt.34.pth", map_location="cpu", weights_only=False)
state_dict = ckpt["state_dict"]
config = ckpt["config"]
num_steps = ckpt["extra_state"]["step"]  # frames trained
```

To reconstruct the policy and run inference, see `scripts/probing/collect.py` in the [companion code](https://github.com/alunxu/foveated-cog-map) for an end-to-end example.

## Citation

If you use these checkpoints, please cite:

```bibtex
@inproceedings{xu2026sensor,
  title={How Sensor Structure Shapes the Format of Spatial Memory in Navigation Agents},
  author={Xu, Weilun and ...},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2026}
}
```

## License

CC-BY-4.0. Free to use, modify, and redistribute with attribution.
blind/ckpt.10.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5de201a6e995f1a2a184b641b4307a17c06fed6e06e8cb325f01b015dd1c5d50
size 22413164
blind/ckpt.20.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d15140494c7219e9b21487f74318f069d9513270708f288884ad5ba71dff9129
size 22413164
blind/ckpt.25.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:403376ce290ea49a0044e869b6bd6cd0e3c0b7690b96c34bc2c611283fba1e46
size 22413164
blind/ckpt.30.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:786fdaa5c84333f9e9b196c15b18e3e39a26a8c1c5b18f5fbeb01ffb5155f422
size 22413164
blind/ckpt.34.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:49e59e06ad69448b43d1d9d9a23a16e5c79b144dd403664c749e2996c0aad014
size 22413164
blind/ckpt.5.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:138c66b9d042d2f0359591243435e3900141250184e81498773ba22c7702796e
size 22412890
coarse/ckpt.10.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4d50a37711571e9f3478afec357aa633838dd0762a566b0fd29c384e478296d7
size 60926432
coarse/ckpt.20.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1656d3df70181c19f6f15e9ee3714ae1e37c97565e12730fe55d26786d774f80
size 60926432
coarse/ckpt.30.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2b8515e8984398c72f27b269829fcc74437e965bcef6353d89d8d57ae9da87da
size 60926432
coarse/ckpt.40.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:36dd163d66771450b393ade3980618f9c79fbbf4610108638fe9ad631ee4b153
size 60926432
coarse/ckpt.49.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:91c312c45f82da0da343422a0a71d46337818f9caf92bbe7825deafc108ce181
size 60926432
foveated/ckpt.10.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e1c0fa6eedb2e881336af6b400b9d220497f8551d44a6f141e9eb54a948e3ecc
size 43349140
foveated/ckpt.20.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:45bb48fa57e2c3bbe5b3715dc1c632e6b3aa58df7fe87787690d49e260e80b10
size 43349140
foveated/ckpt.30.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:496b69bdbbd3f8f6b73898459d86b910328ac2d255a8e288b32d432acb0d3a48
size 43349140
foveated/ckpt.36.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:13fc4a76d7857c4bd281f1e6c7f4050cc6f973ce186ed1887ebb926f16395b0e
size 43349140
uniform/ckpt.10.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:39c9f22f79abf816b101e7e801140dc8622d134c94f84ae959033f7e1cc0d6f7
size 43216352
uniform/ckpt.20.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cf4d315d1e431d076e9ff19da6510829b67d734df0243311232ab9df0558524d
size 43216352
uniform/ckpt.30.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:851ead587f33ac27c661baddda38af5d6945d523f9ed7aa9ba6852a6ee67968a
size 43216352
uniform/ckpt.40.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9bd7329636b7c53eee9165446581797e840b3f0cf2a4cf549b1c3476f606c854
size 43216352
uniform/ckpt.49.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:00428ba698a1360bc362451bd9c1e329ff0bcde4cf01c546d9bea300c3e8778e
size 43216352
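
Each `.pth` entry above is a Git LFS pointer that records the SHA-256 digest and byte size of the real checkpoint. A minimal sketch for checking a downloaded file against its pointer follows; the helper names `parse_lfs_pointer` and `verify_download` are hypothetical, and the sketch assumes the three-line pointer layout shown above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_download(path, pointer: dict) -> bool:
    """Return True if the file at `path` matches the pointer's size and digest."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Hash in 1 MiB chunks so large checkpoints are not read into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:49e59e06ad69448b43d1d9d9a23a16e5c79b144dd403664c749e2996c0aad014\n"
    "size 22413164\n"
)
print(pointer["algo"], pointer["size"])
```

With the repository cloned and the LFS objects fetched, `verify_download("blind/ckpt.34.pth", pointer)` would confirm that the local file is the one this commit refers to.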