tags:
- vit
- attention
datasets:
- maxim-igenbergs/thesis-data
---
# ViT End-to-End Driving Model

Vision Transformer (ViT) adapted for end-to-end autonomous driving, trained on the Udacity self-driving car simulator for the bachelor's thesis *Dual-Axis Testing of Visual Robustness and Topological Generalization in Vision-based End-to-End Driving Models*.

## Model Description

This model applies the Vision Transformer architecture to the end-to-end driving task. Instead of using convolutional layers, ViT splits the input image into patches and processes them with self-attention, allowing the model to capture global dependencies in the visual input.

### Architecture
```
Input: RGB Image (224 × 224 × 3)
        ↓
       ...
     MLP Head
        ↓
Output: [steering, throttle]
```
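As a quick illustration of the patchification step described above, the sketch below computes the token geometry for the stated 224 × 224 × 3 input. The patch size of 16 is an assumption (it matches the standard ViT-Base configuration); this card does not state the thesis model's actual patch size.

```python
# Token geometry for a ViT input. PATCH_SIZE = 16 is an assumption
# (standard ViT-Base); the image shape comes from the diagram above.
IMAGE_SIZE = 224
PATCH_SIZE = 16
CHANNELS = 3

num_patches = (IMAGE_SIZE // PATCH_SIZE) ** 2   # 14 x 14 grid of patch tokens
patch_dim = PATCH_SIZE * PATCH_SIZE * CHANNELS  # values per flattened patch

print(num_patches, patch_dim)  # 196 768
```

Each flattened patch is then linearly projected to the transformer's embedding dimension before entering the encoder; the MLP head regresses the two control outputs from the encoded sequence.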
## Checkpoints

| Map | Checkpoint |
|-----|------------|
| GenRoads | `genroads_20251202-152358/` |
| Jungle | `jungle_20251201-132938/` |
### Files per Checkpoint

- `best_model.ckpt`: PyTorch model checkpoint
- `meta.json`: Training configuration and hyperparameters
- `history.csv`: Training/validation metrics per epoch
- `loss_curve.png`: Visualization of training progress
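The `history.csv` metrics can be inspected with the Python standard library alone, for example to find the epoch with the lowest validation loss. The sketch below uses an inline stand-in for the file, and the column names (`epoch`, `train_loss`, `val_loss`) are assumptions — check the header row of the real file in each checkpoint directory before relying on them.

```python
import csv
import io

# Stand-in for one checkpoint's history.csv; the column names here are
# assumptions -- inspect the actual file's header row first.
sample = io.StringIO(
    "epoch,train_loss,val_loss\n"
    "1,0.120,0.115\n"
    "2,0.081,0.090\n"
    "3,0.064,0.088\n"
)

rows = list(csv.DictReader(sample))
best = min(rows, key=lambda r: float(r["val_loss"]))
print(f"best epoch by val_loss: {best['epoch']}")  # best epoch by val_loss: 3
```

To run this against a real checkpoint, replace `sample` with `open("genroads_20251202-152358/history.csv")`.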
## Citation

```bibtex
@thesis{igenbergs2026dualaxis,
  title={Dual-Axis Testing of Visual Robustness and Topological Generalization in Vision-based End-to-End Driving Models},
  author={Igenbergs, Maxim},
  school={Technical University of Munich},
  year={2026},
  type={Bachelor's Thesis}
}
```
## Related

- [DAVE-2 Driving Model](https://huggingface.co/maxim-igenbergs/dave2)
- [DAVE-2-GRU Driving Model](https://huggingface.co/maxim-igenbergs/dave2-gru)
- [TCP Driving Model](https://huggingface.co/maxim-igenbergs/tcp-carla-repro)
- [Training Data](https://huggingface.co/datasets/maxim-igenbergs/thesis-data)
- [Evaluation Runs](https://huggingface.co/datasets/maxim-igenbergs/thesis-runs)