---
license: mit
---

# TempVerseFormer - Pre-trained Models

[![Hugging Face Hub](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-HuggingFace%20Hub-blue?style=flat-square&logo=huggingface)](https://huggingface.co/LKyluk/TempVerseFormer)
[![GitHub Code](https://img.shields.io/github/v/release/leo27heady/TempVerseFormer?label=TempVerseFormer&style=flat-square)](https://github.com/leo27heady/TempVerseFormer)
[![Shape Dataset Toolbox](https://img.shields.io/github/v/release/leo27heady/simple-shape-dataset-toolbox?label=shapekit&style=flat-square)](https://github.com/leo27heady/simple-shape-dataset-toolbox)
[![WandB Logs](https://img.shields.io/badge/WandB-Training%20Logs-blue?style=flat-square&logo=wandb)](https://wandb.ai/leo27heady/pipe-transformer/reports/TempVerseFormer-Training-Logs--VmlldzoxMTg3OTQ3NQ)

This repository hosts pre-trained models for **TempVerseFormer: Temporal Modeling with Reversible Transformers**, a novel architecture introduced in the research article **"Temporal Modeling with Reversible Transformers"**. These models are designed for memory-efficient temporal sequence prediction, particularly for tasks involving continuous and evolving data streams. They are trained on a synthetic dataset of rotating 2D shapes, designed to evaluate temporal modeling capabilities in a controlled environment.

## Models Included

This repository contains pre-trained weights for the following models, as described in the research article:

* **TempFormer (Vanilla-Transformer):** A standard Vanilla Transformer architecture with temporal chaining, serving as a baseline for comparison against TempVerseFormer.
* **TempVerseFormer (Rev-Transformer):** The core Reversible Temporal Transformer architecture, leveraging reversible blocks and time-agnostic backpropagation for memory efficiency.
* **Standard Transformer (Pipe-Transformer):** A standard Transformer model that predicts only a single next element at each step.
* **LSTM:** A Long Short-Term Memory network, representing a traditional recurrent sequence modeling approach.
* **VAE Models:** Variational Autoencoder (VAE) models used for encoding and decoding images to and from a latent space:
  * **Vanilla VAE:** Standard VAE architecture.

Each model checkpoint is provided as a `.pt` file containing the `state_dict` of the trained model. For every model, checkpoints are available for different training configurations (e.g., with/without temporal patterns).

## Intended Use

These pre-trained models are intended for:

* **Research:** Facilitating further research in memory-efficient temporal modeling, reversible architectures, and time-agnostic backpropagation.
* **Benchmarking:** Providing baselines for comparison with new temporal sequence modeling architectures.
* **Fine-tuning:** Serving as a starting point for fine-tuning on new datasets or for related temporal prediction tasks.
* **Demonstration:** Illustrating the capabilities of TempVerseFormer and its memory efficiency advantages.

**Please note:** These models were primarily trained and evaluated on a synthetic dataset of rotating shapes. While they demonstrate promising results in this controlled environment, their performance on real-world datasets may vary and may require further evaluation and fine-tuning.

## How to Use

* **Configuration:** Ensure you use the model configuration (e.g., `config_rev_transformer`, `config_vae`) that corresponds to the pre-trained checkpoint you are loading. Example configurations are in the `configs/train` directory of the [GitHub repository](https://github.com/leo27heady/TempVerseFormer).
* **Data Preprocessing:** Input data should be preprocessed in the same way as the training data. Refer to the `ShapeDataset` class in the GitHub repository for details on data loading and preprocessing.
* **Device:** Load models and data onto the appropriate device (`'cpu'` or `'cuda'`).
* **Evaluation Mode:** Remember to set models to `.eval()` mode for inference.

For more detailed usage examples and model- or task-specific code, please refer to the [GitHub repository](https://github.com/leo27heady/TempVerseFormer) and its `train.py`, `eval.py`, and `memory_test.py` scripts.

## Dataset

The models were trained on a synthetic dataset of rotating 2D shapes generated with the [Simple Shape Dataset Toolbox](https://github.com/leo27heady/simple-shape-dataset-toolbox), which supports procedural generation of customizable shape datasets.

## License

These pre-trained models are released under the **MIT** license.
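## Example: Loading a Checkpoint

The loading steps described in "How to Use" can be sketched as below. This is a minimal, generic PyTorch sketch: `DummyTemporalModel` is a hypothetical stand-in, not an actual class from this repository; substitute the real model class and its matching configuration from the GitHub repository.

```python
import torch
import torch.nn as nn


class DummyTemporalModel(nn.Module):
    """Hypothetical placeholder; use the actual model class from the repo."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


# Pick the device, as described in "How to Use".
device = "cuda" if torch.cuda.is_available() else "cpu"
model = DummyTemporalModel().to(device)

# Each checkpoint is a .pt file holding the model's state_dict, so loading
# looks like this (filename is illustrative):
# state_dict = torch.load("rev_transformer.pt", map_location=device)
# model.load_state_dict(state_dict)

model.eval()  # inference mode: disables dropout, freezes batch-norm stats
with torch.no_grad():
    out = model(torch.randn(1, 4, 8, device=device))  # (batch, time, dim)
print(out.shape)  # torch.Size([1, 4, 8])
```

The same pattern applies to the VAE checkpoints: instantiate the VAE class with its `config_vae` configuration, load the `state_dict`, and call `.eval()` before encoding or decoding images.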