---
license: mit
---

# TempVerseFormer - Pre-trained Models

[Hugging Face Model](https://huggingface.co/LKyluk/TempVerseFormer)

[GitHub Repository](https://github.com/leo27heady/TempVerseFormer)

[Simple Shape Dataset Toolbox](https://github.com/leo27heady/simple-shape-dataset-toolbox)

[W&B Training Logs](https://wandb.ai/leo27heady/pipe-transformer/reports/TempVerseFormer-Training-Logs--VmlldzoxMTg3OTQ3NQ)

This repository hosts pre-trained models for **TempVerseFormer**, a reversible transformer architecture for temporal modeling introduced in the research article **"Temporal Modeling with Reversible Transformers"**.

These models target memory-efficient temporal sequence prediction, particularly for tasks involving continuous, evolving data streams. They are trained on a synthetic dataset of rotating 2D shapes built to evaluate temporal modeling capabilities in a controlled environment.

## Models Included

This repository contains pre-trained weights for the following models, as described in the research article:

* **TempFormer (Vanilla-Transformer):** A standard Vanilla Transformer architecture with temporal chaining, serving as a baseline for comparison with TempVerseFormer.
* **TempVerseFormer (Rev-Transformer):** The core Reversible Temporal Transformer architecture, leveraging reversible blocks and time-agnostic backpropagation for memory efficiency.
* **Standard Transformer (Pipe-Transformer):** A standard Transformer model that predicts only the next element at each step.
* **LSTM:** A Long Short-Term Memory network, representing a traditional recurrent approach to sequence modeling.
* **VAE Models:** Variational Autoencoder (VAE) models used for encoding and decoding images to and from a latent space:
  * **Vanilla VAE:** Standard VAE architecture.

Each model checkpoint is provided as a `.pt` file containing the `state_dict` of the trained model.

*For all of the models, checkpoints are available for different training configurations (e.g., with/without temporal patterns).*
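As a minimal sketch of this checkpoint format (the `TinyModel` class and `checkpoint.pt` filename below are placeholders, not the actual TempVerseFormer classes or files), a `state_dict` checkpoint round-trips like so:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a TempVerseFormer model class; substitute the
# real architecture from the GitHub repository.
class TinyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(8, 8)

model = TinyModel()

# Each checkpoint in this repository is a .pt file holding a state_dict.
torch.save(model.state_dict(), "checkpoint.pt")

# To restore, instantiate the architecture first, then load the weights.
restored = TinyModel()
state_dict = torch.load("checkpoint.pt", map_location="cpu")
restored.load_state_dict(state_dict)
```

Because only the `state_dict` is stored, the model class must be constructed (with matching dimensions) before the weights can be loaded.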

## Intended Use

These pre-trained models are intended for:

* **Research:** Facilitating further research in memory-efficient temporal modeling, reversible architectures, and time-agnostic backpropagation.
* **Benchmarking:** Providing baselines for comparison with new temporal sequence modeling architectures.
* **Fine-tuning:** Serving as a starting point for fine-tuning on new datasets or related temporal prediction tasks.
* **Demonstration:** Illustrating the capabilities of TempVerseFormer and its memory efficiency advantages.

**Please note:** These models were primarily trained and evaluated on a synthetic dataset of rotating shapes. While they demonstrate promising results in this controlled environment, their performance on real-world datasets may vary and require further evaluation and fine-tuning.

## How to Use

* **Configuration:** Ensure you use the model configuration (e.g., `config_rev_transformer`, `config_vae`) that corresponds to the pre-trained checkpoint you are loading. Example configurations are in the `configs/train` directory of the [GitHub repository](https://github.com/leo27heady/TempVerseFormer).
* **Data Preprocessing:** Preprocess input data the same way as the training data. Refer to the `ShapeDataset` class in the GitHub repository for details on data loading and preprocessing.
* **Device:** Load models and data onto the appropriate device (`'cpu'` or `'cuda'`).
* **Evaluation Mode:** Remember to set models to `.eval()` mode for inference.
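The steps above can be put together as follows. This is a self-contained sketch, not the repository's actual loading code: `StubTemporalModel`, the checkpoint filename, and the tensor dimensions are assumptions, and the sketch writes its own dummy checkpoint so it runs offline; in practice you would download a `.pt` file from this repository and build the real model from its matching config.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a TempVerseFormer class; replace it (and its
# dimensions) with the real model built from the matching config.
class StubTemporalModel(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.layer = nn.Linear(dim, dim)

    def forward(self, x):
        return self.layer(x)

# Device: pick 'cuda' when available, otherwise fall back to 'cpu'.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Write a dummy checkpoint so this sketch is self-contained; in practice,
# download the .pt checkpoint from this repository instead.
torch.save(StubTemporalModel().state_dict(), "rev_transformer.pt")

model = StubTemporalModel()
model.load_state_dict(torch.load("rev_transformer.pt", map_location=device))
model.to(device)
model.eval()  # disable training-only behavior such as dropout

with torch.no_grad():
    dummy_latents = torch.randn(1, 4, 16, device=device)  # (batch, time, dim)
    prediction = model(dummy_latents)
```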

For more detailed usage examples and specific code for different models and tasks, please refer to the [GitHub repository](https://github.com/leo27heady/TempVerseFormer) and the `train.py`, `eval.py`, and `memory_test.py` scripts.

## Dataset

The models were trained on a synthetic dataset of rotating 2D shapes generated using the [Simple Shape Dataset Toolbox](https://github.com/leo27heady/simple-shape-dataset-toolbox). This toolbox allows for procedural generation of customizable shape datasets.

## License

These pre-trained models are released under the **MIT** license.