---
license: mit
library_name: pytorch
pipeline_tag: other
tags:
- path-planning
- 3d
- voxels
- cnn
- transformer
- robotics
- pytorch
- inference
- Blender
---

### Voxel Path Finder (3D Voxel Path Planning with CNN+Transformer)

This repository hosts the weights and code for a neural network that plans paths in a 3D voxel grid (32×32×32). The model encodes the voxelized environment (obstacles + start + goal) with a 3D CNN, fuses learned position embeddings, and autoregressively generates a sequence of movement actions with a Transformer decoder.

- **Task**: 3D voxel path planning (generate action steps from start to goal)
- **Actions**: 0..5 → [FORWARD, BACK, LEFT, RIGHT, UP, DOWN]
- **Framework**: PyTorch
- **License**: MIT

### Model architecture (high level)
- Voxel encoder: 3D CNN with 3 conv blocks → 512-d environment feature
- Position encoder: learned embeddings over (x, y, z) → 64-d position feature
- Planner: Transformer decoder over action tokens with START/END special tokens
- Output: action token sequence; special tokens are excluded from final path

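To make the high-level description concrete, here is a schematic of how such an encoder/decoder could be wired in PyTorch. This is an illustrative sketch only: layer counts, channel widths, and the fusion step are assumptions, and the actual `PathfindingNetwork` in `pathfinding_nn.py` will differ in detail.

```python
import torch
import torch.nn as nn

class VoxelPlannerSketch(nn.Module):
    """Illustrative CNN+Transformer planner; NOT the repo's actual PathfindingNetwork."""
    def __init__(self, n_actions=6, n_special=2, d_model=128):
        super().__init__()
        # Voxel encoder: 3 conv blocks over the (3, 32, 32, 32) input -> 512-d feature
        self.encoder = nn.Sequential(
            nn.Conv3d(3, 32, 3, stride=2, padding=1), nn.ReLU(),    # 32 -> 16
            nn.Conv3d(32, 64, 3, stride=2, padding=1), nn.ReLU(),   # 16 -> 8
            nn.Conv3d(64, 128, 3, stride=2, padding=1), nn.ReLU(),  # 8 -> 4
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(128, 512),
        )
        # Position encoder: learned embeddings over coordinates in [0, 31] -> 64-d each
        self.coord_emb = nn.Embedding(32, 64)
        # Fuse environment feature with start/goal coordinate features
        self.fuse = nn.Linear(512 + 2 * 3 * 64, d_model)
        # Transformer decoder over action tokens plus START/END specials
        self.tok_emb = nn.Embedding(n_actions + n_special, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_actions + n_special)

    def forward(self, voxel, positions, tokens):
        env = self.encoder(voxel)                                   # (B, 512)
        pos = self.coord_emb(positions).flatten(1)                  # (B, 2*3*64)
        memory = self.fuse(torch.cat([env, pos], dim=1))[:, None]   # (B, 1, d_model)
        tgt = self.tok_emb(tokens)                                  # (B, T, d_model)
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        return self.head(self.decoder(tgt, memory, tgt_mask=mask))  # (B, T, vocab)
```

At inference time a model like this would be unrolled autoregressively, feeding its own predicted tokens back in until END; the repo's forward pass handles that loop internally.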
### Inputs and outputs
- **Input tensors**
  - `voxel_data`: float tensor of shape `[1, 3, 32, 32, 32]`
    Channels: [obstacles, start_mask, goal_mask]
  - `positions`: long tensor of shape `[1, 2, 3]`
    Format: `[[start_xyz, goal_xyz]]` with each coordinate in `[0, 31]`
- **Output**
  - Long tensor `[1, T]` of action IDs (0..5), padded internally with END if needed

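If you want to assemble these tensors by hand rather than via the repo's `create_voxel_input` helper, the layout described above can be built directly. The helper function name `make_inputs` below is hypothetical; the channel order follows the description in this section.

```python
import numpy as np
import torch

def make_inputs(obstacles, start, goal):
    """Assemble the model's input tensors by hand (hypothetical helper).

    obstacles: (32, 32, 32) array of 0/1 occupancy.
    start, goal: (x, y, z) integer tuples with each coordinate in [0, 31].
    """
    voxel = np.zeros((3, 32, 32, 32), dtype=np.float32)
    voxel[0] = obstacles               # channel 0: obstacle occupancy
    voxel[1][start] = 1.0              # channel 1: one-hot start mask
    voxel[2][goal] = 1.0               # channel 2: one-hot goal mask
    voxel_t = torch.from_numpy(voxel).unsqueeze(0)           # [1, 3, 32, 32, 32]
    pos_t = torch.tensor([[start, goal]], dtype=torch.long)  # [1, 2, 3]
    return voxel_t, pos_t

v, p = make_inputs(np.zeros((32, 32, 32), np.float32), (0, 0, 0), (5, 6, 7))
```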
### Quickstart (inference)
Make sure this repo includes both `final_model.pth` (or `model_state_dict`) and `pathfinding_nn.py`.

```python
import torch, numpy as np
from huggingface_hub import hf_hub_download
import importlib.util

REPO_ID = "c1tr0n75/VoxelPathFinder"
# GitHub: https://github.com/c1tr0n75/VoxelPathFinder

# Download files from the Hub
pth_path = hf_hub_download(repo_id=REPO_ID, filename="final_model.pth")
py_path = hf_hub_download(repo_id=REPO_ID, filename="pathfinding_nn.py")

# Dynamically import the model code
spec = importlib.util.spec_from_file_location("pathfinding_nn", py_path)
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
PathfindingNetwork = mod.PathfindingNetwork
create_voxel_input = mod.create_voxel_input

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = PathfindingNetwork().to(device).eval()

# Load weights (supports either a plain state_dict or {'model_state_dict': ...})
ckpt = torch.load(pth_path, map_location=device)
state = ckpt["model_state_dict"] if isinstance(ckpt, dict) and "model_state_dict" in ckpt else ckpt
model.load_state_dict(state)

# Build a random test environment
voxel_dim = model.voxel_dim  # (32, 32, 32)
D, H, W = voxel_dim
obstacle_prob = 0.2
obstacles = (np.random.rand(D, H, W) < obstacle_prob).astype(np.float32)
free = np.argwhere(obstacles == 0)
assert len(free) >= 2, "Not enough free cells; lower obstacle_prob"
s_idx, g_idx = np.random.choice(len(free), size=2, replace=False)
start = tuple(free[s_idx])
goal = tuple(free[g_idx])

voxel_np = create_voxel_input(obstacles, start, goal, voxel_dim=voxel_dim)  # (3,32,32,32)
voxel = torch.from_numpy(voxel_np).float().unsqueeze(0).to(device)          # (1,3,32,32,32)
pos = torch.tensor([[start, goal]], dtype=torch.long, device=device)        # (1,2,3)

with torch.no_grad():
    actions = model(voxel, pos)[0].tolist()

ACTION_NAMES = ['FORWARD', 'BACK', 'LEFT', 'RIGHT', 'UP', 'DOWN']
decoded = [ACTION_NAMES[a] for a in actions if 0 <= a < 6]
print(f"Start: {start} | Goal: {goal}")
print(f"Generated {len(decoded)} steps (first 30): {decoded[:30]}")
```

### Intended uses and limitations
- **Intended**: Research and demo of 3D voxel path planning; educational examples; quick inference in CPU/GPU environments.
- **Not intended**: Safety-critical navigation without additional validation; large scenes beyond 32³ without retraining; Blender-based generation on hosted environments.
- The generated actions may not yield collision-free paths in complex scenes; downstream validation is recommended.

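One way to do that downstream validation is to replay the generated actions on the obstacle grid and check bounds, collisions, and goal arrival. A minimal sketch follows; note that the action-to-offset convention below is an assumption (the card does not fix which axis FORWARD moves along), so match it to the model's actual semantics before relying on it.

```python
import numpy as np

# Assumed (dx, dy, dz) offsets for actions 0..5 = FORWARD, BACK, LEFT, RIGHT, UP, DOWN.
DELTAS = [(1, 0, 0), (-1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, 1), (0, 0, -1)]

def validate_path(obstacles, start, goal, actions):
    """Replay an action sequence; return (ok, reason).

    ok is True only if the walk stays inside the grid, never enters an
    obstacle cell, and ends exactly at the goal.
    """
    pos = np.array(start)
    for i, a in enumerate(actions):
        pos = pos + DELTAS[a]
        if not all(0 <= c < d for c, d in zip(pos, obstacles.shape)):
            return False, f"out of bounds at step {i}"
        if obstacles[tuple(pos)]:
            return False, f"collision at step {i}"
    if tuple(pos) != tuple(goal):
        return False, "did not reach goal"
    return True, "ok"

grid = np.zeros((32, 32, 32), dtype=np.float32)
print(validate_path(grid, (0, 0, 0), (2, 0, 0), [0, 0]))  # (True, 'ok')
```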
### Training data and procedure
- Synthetic voxel environments were generated (in-project tools leverage Blender for dataset creation and visualization).
- The model is trained to predict action sequences from start to goal; the loss combines cross-entropy over actions with auxiliary turn/collision components.

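The cross-entropy part of that objective, computed over padded action-token sequences, could look like the sketch below. The padding-token ID is hypothetical, and the auxiliary turn/collision terms are model-specific and omitted here.

```python
import torch
import torch.nn.functional as F

PAD_ID = 7  # hypothetical padding-token ID; the repo's token vocabulary may differ

def action_ce_loss(logits, targets):
    """Token-level cross-entropy over action sequences, ignoring padded positions.

    logits: (B, T, vocab) decoder outputs; targets: (B, T) token IDs.
    """
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten to (B*T, vocab)
        targets.reshape(-1),                  # flatten to (B*T,)
        ignore_index=PAD_ID,                  # padded positions contribute no loss
    )

logits = torch.randn(2, 10, 8)
targets = torch.randint(0, 6, (2, 10))
loss = action_ce_loss(logits, targets)
```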
### Ethical considerations
- This is a research model for toy 3D grids. It is not validated for real-world navigation where safety, environment dynamics, and constraints apply.

### Citation
If you use this model, please cite this repository: