---
language: en
tags:
- deep-q-network
- reinforcement-learning
- pathfinding
- floorplan
license: apache-2.0
datasets:
- custom
metrics:
- average_reward
- success_rate
---

# Deep Q-Network for Floorplan Navigation

## Model Description

This model is a Deep Q-Network (DQN) that learns to find an efficient, collision-free path through a grid-based floorplan. It combines a traditional pathfinding algorithm (A*) with reinforcement learning: the A* solution guides training, and the learned policy handles navigation.

## Model Architecture

The model is a fully connected neural network with the following architecture:
- Input Layer: Flattened grid representation of the floorplan
- Hidden Layers: Two hidden layers with 64 units each and ReLU activation
- Output Layer: Four units representing the possible actions (up, down, left, right)

## Training

The model was trained using a hybrid approach:
1. **A\* Algorithm**: Initially, the A* algorithm was used to find the shortest path in a static environment.
2. **Reinforcement Learning**: The DQN was trained with guidance from the A* path to improve efficiency and adaptability.
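
The card does not specify exactly how the A* path guides the DQN. One common scheme, sketched below purely as an illustration, is to run A* on the static grid and turn its path into demonstration (state, action) pairs that seed training; the grid encoding, action ids, and helper names here are assumptions, not taken from `train.py`.

```python
import heapq

# Hypothetical sketch: A* on a 4-connected grid (0 = free, 1 = obstacle).
# The grid encoding and action ids (0=up, 1=down, 2=left, 3=right) are
# assumptions matching the architecture section, not taken from train.py.
ACTIONS = {(-1, 0): 0, (1, 0): 1, (0, -1): 2, (0, 1): 3}

def astar(grid, start, goal):
    """Return the shortest obstacle-free path from start to goal, or None."""
    rows, cols = len(grid), len(grid[0])
    h = lambda c: abs(c[0] - goal[0]) + abs(c[1] - goal[1])  # Manhattan heuristic
    frontier = [(h(start), 0, start, [start])]
    visited = set()
    while frontier:
        _, g, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        if cell in visited:
            continue
        visited.add(cell)
        for dr, dc in ACTIONS:
            r, c = cell[0] + dr, cell[1] + dc
            if 0 <= r < rows and 0 <= c < cols and grid[r][c] == 0:
                heapq.heappush(frontier, (g + 1 + h((r, c)), g + 1, (r, c), path + [(r, c)]))
    return None

def path_to_demos(path):
    """Turn consecutive path cells into (cell, action) demonstration pairs."""
    return [(a, ACTIONS[(b[0] - a[0], b[1] - a[1])]) for a, b in zip(path, path[1:])]
```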

### Hyperparameters
- Learning Rate: 0.001
- Batch Size: 64
- Gamma (Discount Factor): 0.99
- Target Update Frequency: Every 100 episodes
- Number of Episodes: 50
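
For context, the sketch below shows how these hyperparameters typically enter a DQN update step. The optimizer choice, replay-buffer capacity, and loss function are assumptions, not read from `train.py`; the networks use `nn.Sequential` here only to keep the sketch self-contained, matching the 100 → 64 → 64 → 4 architecture above.

```python
import random
from collections import deque

import torch
import torch.nn.functional as F

GAMMA, BATCH_SIZE, LR, TARGET_UPDATE = 0.99, 64, 0.001, 100

# Policy and target networks matching the architecture section.
policy_net = torch.nn.Sequential(
    torch.nn.Linear(100, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 4))
target_net = torch.nn.Sequential(
    torch.nn.Linear(100, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 4))
target_net.load_state_dict(policy_net.state_dict())

replay_buffer = deque(maxlen=10_000)  # capacity is an assumed value
optimizer = torch.optim.Adam(policy_net.parameters(), lr=LR)  # optimizer is assumed

def dqn_update():
    """One gradient step on a random minibatch of (s, a, r, s', done) tuples."""
    if len(replay_buffer) < BATCH_SIZE:
        return
    batch = random.sample(replay_buffer, BATCH_SIZE)
    states, actions, rewards, next_states, dones = (
        torch.as_tensor(x, dtype=torch.float32) for x in zip(*batch))
    q = policy_net(states).gather(1, actions.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + GAMMA * target_net(next_states).max(1).values * (1 - dones)
    loss = F.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Per the card, the target network is synced every TARGET_UPDATE episodes:
# target_net.load_state_dict(policy_net.state_dict())
```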

## Checkpoints

Checkpoints are saved periodically during training:
- `checkpoint_11.pth.tar`: After 11 episodes
- `checkpoint_21.pth.tar`: After 21 episodes
- `checkpoint_31.pth.tar`: After 31 episodes
- `checkpoint_41.pth.tar`: After 41 episodes

## Usage

To use this model, initialize the DQN with the same architecture and load the saved state dictionary. The model can then be used to navigate a floorplan and find an efficient path to the target.

### Example Code

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Define the DQN class (same as in the training script)
class DQN(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_size):
        super(DQN, self).__init__()
        self.input_size = input_size
        self.hidden_sizes = hidden_sizes
        self.output_size = output_size

        self.fc_layers = nn.ModuleList()
        prev_size = input_size
        for size in hidden_sizes:
            self.fc_layers.append(nn.Linear(prev_size, size))
            prev_size = size
        self.output_layer = nn.Linear(prev_size, output_size)

    def forward(self, x):
        if len(x.shape) > 2:
            x = x.view(x.size(0), -1)
        for layer in self.fc_layers:
            x = F.relu(layer(x))
        x = self.output_layer(x)
        return x

# Load the model
input_size = 100  # 10x10 grid flattened
hidden_sizes = [64, 64]
output_size = 4
model = DQN(input_size, hidden_sizes, output_size)
model.load_state_dict(torch.load('dqn_model.pth'))
model.eval()

# Use the model for inference (example state)
state = ...  # Define your state here
with torch.no_grad():
    action = model(torch.tensor(state, dtype=torch.float32).unsqueeze(0)).argmax().item()
```
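
The returned index is only an action id. Assuming the ordering given in the architecture section (up, down, left, right), it can be turned into a grid move like so; the delta encoding is illustrative:

```python
# Assumed action ordering from the architecture section: up, down, left, right.
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # (row, col) deltas
dr, dc = MOVES[action]  # `action` comes from the snippet above
```
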
## Training Script

The training script `train.py` is included in the repository for those who wish to reproduce the training process or continue training from a specific checkpoint.

### Training Instructions
- Clone the repository.
- Ensure you have the necessary dependencies installed.
- Run the training script:
```bash
python train.py
```
To continue training from a checkpoint, modify the script to load the checkpoint before training.
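
A minimal resume sketch, assuming each `checkpoint_*.pth.tar` stores at least the model's state dict (the exact keys saved by `train.py` may differ):

```python
import torch

# Hypothetical resume sketch; `model` is the DQN from the Usage example,
# and the checkpoint keys are assumptions that may differ from what
# train.py actually saves.
checkpoint = torch.load('checkpoint_41.pth.tar')
if isinstance(checkpoint, dict) and 'state_dict' in checkpoint:
    model.load_state_dict(checkpoint['state_dict'])
else:
    model.load_state_dict(checkpoint)  # a bare state dict
```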

## Evaluation

The model was evaluated based on:

- Average Reward: The mean reward over several episodes
- Success Rate: The proportion of episodes where the agent successfully reached the target
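
As a rough illustration, a loop like the one below could compute both metrics. The Gym-style `env.reset`/`env.step` interface and the `reached_target` info flag are assumptions, not the repository's actual evaluation code.

```python
import torch

# Hypothetical evaluation loop; the env interface and the
# 'reached_target' info flag are assumptions.
def evaluate(model, env, episodes=100):
    total_reward, successes = 0.0, 0
    for _ in range(episodes):
        state, done, episode_reward, info = env.reset(), False, 0.0, {}
        while not done:
            with torch.no_grad():
                q = model(torch.tensor(state, dtype=torch.float32).unsqueeze(0))
            state, reward, done, info = env.step(q.argmax().item())
            episode_reward += reward
        total_reward += episode_reward
        successes += int(info.get('reached_target', False))
    return total_reward / episodes, successes / episodes
```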

### Initial Evaluation Results

- Average Reward: 8.84
- Success Rate: 1.0

## Limitations

- The model's performance can be influenced by the complexity of the floorplan and the density of obstacles.
- It requires a grid-based representation of the environment for accurate navigation.

## Acknowledgements

This project combines reinforcement learning with traditional pathfinding algorithms to navigate complex environments efficiently.

## License

This model is licensed under the Apache 2.0 License.

## Citation

If you use this model in your research, please cite it as follows:
```
@misc{jones2024dqnfloorplan,
  author       = {Christopher Jones},
  title        = {Deep Q-Network for Floorplan Navigation},
  year         = {2024},
  howpublished = {\url{https://huggingface.co/cajcodes/dqn-floorplan-navigator}},
  note         = {Accessed: YYYY-MM-DD}
}
```