---
license: mit
library_name: pytorch
tags:
- autonomous-driving
- end-to-end
- imitation-learning
- self-driving
- udacity
- vision
- transformer
- vit
- attention
datasets:
- maxim-igenbergs/thesis-data
---

# ViT End-to-End Driving Model

Vision Transformer (ViT) adapted for end-to-end autonomous driving, trained on the Udacity self-driving car simulator for the bachelor's thesis: Dual-Axis Testing of Visual Robustness and Topological Generalization in Vision-based End-to-End Driving Models.

## Model Description

This model applies the Vision Transformer architecture to the end-to-end driving task. Instead of using convolutional layers, ViT splits the input image into patches and processes them using self-attention mechanisms, allowing the model to capture global dependencies in the visual input.

### Architecture

```
Input: RGB Image (224 × 224 × 3)
    ↓
Patch Embedding (16 × 16 patches)
    ↓
[CLS] Token + Positional Embedding
    ↓
Transformer Encoder Blocks (×L):
    - Multi-Head Self-Attention
    - Layer Normalization
    - MLP (Feed-Forward)
    - Layer Normalization
    ↓
[CLS] Token Output
    ↓
MLP Head
    ↓
Output: [steering, throttle]
```

## Checkpoints

| Map | Checkpoint |
|-----|------------|
| GenRoads | `genroads_20251202-152358/` |
| Jungle | `jungle_20251201-132938/` |

### Files per Checkpoint

- `best_model.ckpt`: PyTorch model checkpoint
- `meta.json`: Training configuration and hyperparameters
- `history.csv`: Training/validation metrics per epoch
- `loss_curve.png`: Visualization of training progress

## Citation

```bibtex
@thesis{igenbergs2026dualaxis,
  title={Dual-Axis Testing of Visual Robustness and Topological Generalization in Vision-based End-to-End Driving Models},
  author={Igenbergs, Maxim},
  school={Technical University of Munich},
  year={2026},
  type={Bachelor's Thesis}
}
```

## Related

- [DAVE-2 Driving Model](https://huggingface.co/maxim-igenbergs/dave2)
- [DAVE-2-GRU Driving Model](https://huggingface.co/maxim-igenbergs/dave2-gru)
- [TCP Driving Model](https://huggingface.co/maxim-igenbergs/tcp-carla-repro)
- [Training Data](https://huggingface.co/datasets/maxim-igenbergs/thesis-data)
- [Evaluation Runs](https://huggingface.co/datasets/maxim-igenbergs/thesis-runs)
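
## Patch and Token Arithmetic

As a rough illustration of the Architecture diagram, the sketch below works out how many tokens the encoder would see for a 224 × 224 × 3 input. It assumes that "16 × 16 patches" means patches of 16 × 16 pixels (the common ViT convention) and that a single [CLS] token is prepended; neither detail is confirmed by this card, and the function names are hypothetical helpers, not part of the released code.

```python
def vit_token_count(image_size: int = 224, patch_size: int = 16,
                    use_cls: bool = True) -> int:
    """Sequence length for a square image split into square patches.

    Assumption: patch_size is the patch edge length in pixels.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size   # 224 // 16 = 14
    num_patches = patches_per_side ** 2           # 14 * 14 = 196
    return num_patches + (1 if use_cls else 0)    # plus one [CLS] token


def flattened_patch_length(patch_size: int = 16, channels: int = 3) -> int:
    """Length of one flattened RGB patch before the linear embedding."""
    return patch_size * patch_size * channels


print(vit_token_count())         # 197
print(flattened_patch_length())  # 768
```

Under these assumptions each image becomes 196 patch tokens plus one [CLS] token, and only the [CLS] output is passed to the MLP head that predicts `[steering, throttle]`.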