---
license: mit
---
# TempVerseFormer - Pre-trained Models
[Model on Hugging Face](https://huggingface.co/LKyluk/TempVerseFormer)
[Code on GitHub](https://github.com/leo27heady/TempVerseFormer)
[Simple Shape Dataset Toolbox](https://github.com/leo27heady/simple-shape-dataset-toolbox)
[Training logs on Weights & Biases](https://wandb.ai/leo27heady/pipe-transformer/reports/TempVerseFormer-Training-Logs--VmlldzoxMTg3OTQ3NQ)
This repository hosts pre-trained models for **TempVerseFormer: Temporal Modeling with Reversible Transformers**, a novel architecture introduced in the research article **"Temporal Modeling with Reversible Transformers"**.
These models are designed for memory-efficient temporal sequence prediction, particularly for tasks involving continuous and evolving data streams. They are trained on a synthetic dataset of rotating 2D shapes, designed to evaluate temporal modeling capabilities in a controlled environment.
## Models Included
This repository contains pre-trained weights for the following models, as described in the research article:
* **TempFormer (Vanilla-Transformer):** A standard Vanilla Transformer architecture with temporal chaining, serving as a baseline to compare against TempVerseFormer.
* **TempVerseFormer (Rev-Transformer):** The core Reversible Temporal Transformer architecture, leveraging reversible blocks and time-agnostic backpropagation for memory efficiency.
* **Standard Transformer (Pipe-Transformer):** A standard Transformer model that predicts only the next element at each step.
* **LSTM:** A Long Short-Term Memory network, representing a traditional recurrent sequence modeling approach.
* **VAE Models:** Variational Autoencoder (VAE) models used for encoding and decoding images to and from a latent space:
* **Vanilla VAE:** Standard VAE architecture.
Each model checkpoint is provided as a `.pt` file containing the `state_dict` of the trained model.
*For all models, checkpoints are available for different training configurations (e.g., with/without temporal patterns).*
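Since each checkpoint stores a plain `state_dict`, loading follows the standard PyTorch pattern. The sketch below uses a hypothetical `TinyModel` stand-in; the actual model classes and their constructor arguments live in the GitHub repository.

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    """Stand-in for a real model class (e.g., the Rev-Transformer)."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(8, 8)

    def forward(self, x):
        return self.proj(x)

# A checkpoint is a plain state_dict, so saving/loading is a simple round trip.
src = TinyModel()
torch.save(src.state_dict(), "tiny.pt")           # mimics a repo .pt checkpoint

dst = TinyModel()                                  # must match the saved architecture
dst.load_state_dict(torch.load("tiny.pt", map_location="cpu"))
assert torch.equal(src.proj.weight, dst.proj.weight)
```

Note that `load_state_dict` requires the model instance to match the architecture the checkpoint was saved from, which is why the configuration files matter (see "How to Use" below).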
## Intended Use
These pre-trained models are intended for:
* **Research:** Facilitating further research in memory-efficient temporal modeling, reversible architectures, and time-agnostic backpropagation.
* **Benchmarking:** Providing baselines for comparison with new temporal sequence modeling architectures.
* **Fine-tuning:** Serving as a starting point for fine-tuning on new datasets or for related temporal prediction tasks.
* **Demonstration:** Illustrating the capabilities of TempVerseFormer and its memory efficiency advantages.
**Please note:** These models were primarily trained and evaluated on a synthetic dataset of rotating shapes. While they demonstrate promising results in this controlled environment, their performance on real-world datasets may vary and require further evaluation and fine-tuning.
## How to Use
* **Configuration:** Ensure you use the correct model configuration (e.g., `config_rev_transformer`, `config_vae`) that corresponds to the pre-trained checkpoint you are loading. You can find example configurations in the `configs/train` directory of the [GitHub repository](https://github.com/leo27heady/TempVerseFormer).
* **Data Preprocessing:** Input data should be preprocessed in the same way as the training data. Refer to the `ShapeDataset` class in the GitHub repository for details on data loading and preprocessing.
* **Device:** Load models and data onto the appropriate device (`'cpu'` or `'cuda'`).
* **Evaluation Mode:** Remember to set models to `.eval()` mode for inference.
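The device and evaluation-mode steps above follow the usual PyTorch inference pattern, sketched here with a hypothetical `TinyModel` in place of the real classes from the repository:

```python
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    """Stand-in for a real model class from the repository."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(8, 8)

    def forward(self, x):
        return self.proj(x)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyModel().to(device)
model.eval()                       # disable dropout / batch-norm updates
with torch.no_grad():              # no autograd graph needed for inference
    x = torch.randn(1, 8, device=device)   # replace with preprocessed input
    y = model(x)
print(y.shape)  # torch.Size([1, 8])
```

Input tensors must be placed on the same device as the model, and wrapping inference in `torch.no_grad()` avoids building the autograd graph, saving memory.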
For more detailed usage examples and specific code for different models and tasks, please refer to the [GitHub repository](https://github.com/leo27heady/TempVerseFormer) and the `train.py`, `eval.py`, and `memory_test.py` scripts.
## Dataset
The models were trained on a synthetic dataset of rotating 2D shapes generated using the [Simple Shape Dataset Toolbox](https://github.com/leo27heady/simple-shape-dataset-toolbox). This toolbox allows for procedural generation of customizable shape datasets.
## License
These pre-trained models are released under the **MIT** license.