---
language: en
tags:
- deep-q-network
- reinforcement-learning
- pathfinding
- floorplan
license: apache-2.0
datasets:
- custom
metrics:
- average_reward
- success_rate
---
# Deep Q-Network for Floorplan Navigation
## Model Description
This model is a Deep Q-Network (DQN) designed to find the most efficient path through a floorplan without hitting obstacles. The model combines traditional pathfinding algorithms with reinforcement learning for optimal performance.
## Model Architecture
The model is a fully connected neural network with the following architecture:
- Input Layer: Flattened grid representation of the floorplan
- Hidden Layers: Two hidden layers with 64 units each and ReLU activation
- Output Layer: Four units representing the possible actions (up, down, left, right)
## Training
The model was trained using a hybrid approach:
1. **A\* Algorithm**: Initially, the A\* algorithm was used to find the shortest path in a static environment.
2. **Reinforcement Learning**: The DQN was trained with guidance from the A* path to improve efficiency and adaptability.
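As a point of reference for step 1, the sketch below shows a minimal A\* search on a 2D occupancy grid (0 = free, 1 = obstacle) with 4-connected moves and a Manhattan-distance heuristic. This is illustrative only; the exact guidance scheme used during training is defined in `train.py`.

```python
import heapq

def astar(grid, start, goal):
    """Return the shortest 4-connected path from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])

    def h(p):  # Manhattan-distance heuristic (admissible on a 4-connected grid)
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_set = [(h(start), start)]
    came_from = {}
    g = {start: 0}
    while open_set:
        _, current = heapq.heappop(open_set)
        if current == goal:
            # Walk the parent links back to the start to recover the path.
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = current[0] + dr, current[1] + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                tentative = g[current] + 1
                if tentative < g.get((nr, nc), float("inf")):
                    came_from[(nr, nc)] = current
                    g[(nr, nc)] = tentative
                    heapq.heappush(open_set, (tentative + h((nr, nc)), (nr, nc)))
    return None  # goal unreachable
```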
### Hyperparameters
- Learning Rate: 0.001
- Batch Size: 64
- Gamma (Discount Factor): 0.99
- Target Update Frequency: Every 100 episodes
- Number of Episodes: 50
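The hyperparameters above slot into a standard DQN update loop. The sketch below shows one plausible shape of that loop (the environment API, replay-buffer capacity, and loss choice are assumptions; the authoritative version is `train.py`):

```python
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

GAMMA = 0.99          # discount factor, as listed above
BATCH_SIZE = 64
LR = 0.001
TARGET_UPDATE = 100   # episodes between target-network syncs

def make_net():
    # Same shape as the card's architecture: 100 -> 64 -> 64 -> 4
    return nn.Sequential(nn.Linear(100, 64), nn.ReLU(),
                         nn.Linear(64, 64), nn.ReLU(),
                         nn.Linear(64, 4))

policy_net = make_net()
target_net = make_net()
target_net.load_state_dict(policy_net.state_dict())
optimizer = optim.Adam(policy_net.parameters(), lr=LR)
replay = deque(maxlen=10_000)  # buffer capacity is an assumption

def optimize():
    """One gradient step on a sampled minibatch; returns the loss, or None."""
    if len(replay) < BATCH_SIZE:
        return None
    states, actions, rewards, next_states, dones = zip(*random.sample(replay, BATCH_SIZE))
    states = torch.stack(states)
    actions = torch.tensor(actions)
    rewards = torch.tensor(rewards)
    next_states = torch.stack(next_states)
    dones = torch.tensor(dones, dtype=torch.float32)
    # Q(s, a) for the actions actually taken
    q = policy_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target_net(next_states).max(1).values
    # Bellman target: r + gamma * max_a' Q_target(s', a'), zeroed at terminals
    target = rewards + GAMMA * q_next * (1.0 - dones)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# In the episode loop, the target network would be synced with:
# if episode % TARGET_UPDATE == 0:
#     target_net.load_state_dict(policy_net.state_dict())
```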
## Checkpoints
Checkpoints were saved periodically during training:
- `checkpoint_11.pth.tar`: After 11 episodes
- `checkpoint_21.pth.tar`: After 21 episodes
- `checkpoint_31.pth.tar`: After 31 episodes
- `checkpoint_41.pth.tar`: After 41 episodes
## Usage
To use this model, load the saved state dictionary and initialize the DQN with the same architecture. The model can then be used to navigate a floorplan and find the most efficient path to the target.
### Example Code
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Define the DQN class (same as in the training script)
class DQN(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_size):
        super(DQN, self).__init__()
        self.input_size = input_size
        self.hidden_sizes = hidden_sizes
        self.output_size = output_size
        self.fc_layers = nn.ModuleList()
        prev_size = input_size
        for size in hidden_sizes:
            self.fc_layers.append(nn.Linear(prev_size, size))
            prev_size = size
        self.output_layer = nn.Linear(prev_size, output_size)

    def forward(self, x):
        if len(x.shape) > 2:
            x = x.view(x.size(0), -1)  # flatten grid inputs to (batch, features)
        for layer in self.fc_layers:
            x = F.relu(layer(x))
        return self.output_layer(x)

# Load the trained weights
input_size = 100  # 10x10 grid, flattened
hidden_sizes = [64, 64]
output_size = 4
model = DQN(input_size, hidden_sizes, output_size)
model.load_state_dict(torch.load('dqn_model.pth'))
model.eval()

# Use the model for inference (example state)
state = ...  # define your state here (flattened grid of length 100)
with torch.no_grad():
    action = model(torch.tensor(state, dtype=torch.float32).unsqueeze(0)).argmax().item()
```
## Training Script
The training script `train.py` is included in the repository for those who wish to reproduce the training process or continue training from a specific checkpoint.
### Training Instructions
- Clone the repository.
- Ensure you have the necessary dependencies installed.
- Run the training script:
```bash
python train.py
```
To continue training from a checkpoint, modify the script to load the checkpoint before training.
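One way that modification could look is sketched below. The checkpoint key names (`'state_dict'`, `'episode'`) are assumptions about what `train.py` saves; inspect a checkpoint file to confirm the actual layout. The in-memory buffer here just makes the round trip self-contained; in practice you would pass a path such as `checkpoint_41.pth.tar` to `torch.load`.

```python
import io

import torch
import torch.nn as nn

# Stand-in for the DQN from the usage example (100 -> 64 -> 64 -> 4)
model = nn.Sequential(nn.Linear(100, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 4))

# Simulate a saved checkpoint; train.py would write this to a .pth.tar file.
buf = io.BytesIO()
torch.save({'episode': 41, 'state_dict': model.state_dict()}, buf)
buf.seek(0)

# Resuming: restore the weights, then continue the loop from the saved episode.
checkpoint = torch.load(buf)
model.load_state_dict(checkpoint['state_dict'])
start_episode = checkpoint['episode']
```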
## Evaluation
The model was evaluated based on:
- Average Reward: The mean reward over several episodes
- Success Rate: The proportion of episodes where the agent successfully reached the target
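Both metrics reduce to simple aggregates over per-episode records. A minimal sketch, assuming each evaluation episode yields a `(total_reward, reached_goal)` pair:

```python
def evaluate(episodes):
    """Compute (average reward, success rate) from (total_reward, reached_goal) pairs."""
    rewards = [r for r, _ in episodes]
    avg_reward = sum(rewards) / len(rewards)
    success_rate = sum(1 for _, ok in episodes if ok) / len(episodes)
    return avg_reward, success_rate
```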
### Initial Evaluation Results
- Average Reward: 8.84
- Success Rate: 1.0
## Limitations
- The model's performance can be influenced by the complexity of the floorplan and the density of obstacles.
- It requires a grid-based representation of the environment for accurate navigation.
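To make the second limitation concrete: before inference, a floorplan must be encoded as a grid and flattened to match the 100-unit input layer. The encoding below (obstacles as 1.0, free cells as 0.0, row-major order) is an assumption for illustration; the state format actually used is defined in `train.py`.

```python
def encode_floorplan(obstacles, size=10):
    """Encode obstacle coordinates as a flattened size x size occupancy grid."""
    grid = [[0.0] * size for _ in range(size)]
    for r, c in obstacles:
        grid[r][c] = 1.0
    # Row-major flatten: cell (r, c) lands at index r * size + c
    return [v for row in grid for v in row]
```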
## Acknowledgements
This project leverages the power of reinforcement learning combined with traditional pathfinding algorithms to navigate complex environments efficiently.
## License
This model is licensed under the Apache 2.0 License.
## Citation
If you use this model in your research, please cite it as follows:
```
@misc{jones2024dqnfloorplan,
  author       = {Christopher Jones},
  title        = {Deep Q-Network for Floorplan Navigation},
  year         = {2024},
  howpublished = {\url{https://huggingface.co/cajcodes/dqn-floorplan-navigator}},
  note         = {Accessed: YYYY-MM-DD}
}
```