---
license: apache-2.0
datasets:
  - timm/imagenet-1k-wds
language:
  - en
base_model:
  - facebook/DiT-XL-2-512
---

# DiT Model

This repository contains an implementation of the DiT (Diffusion Transformer) model that uses NdLinear layers for efficient multi-dimensional linear transformations. Factorizing the dense projections keeps the model compact while remaining expressive, making it suitable for tasks that involve high-dimensional data.

## Overview

The DiT model is built using several components:

- **NdLinear**: A custom PyTorch layer that projects tensors into multi-space representations, capturing multivariate structure.
- **NdMlp**: A multi-layer perceptron built from NdLinear layers for enhanced feature extraction.
- **NdTimestepEmbedder**: Embeds scalar diffusion timesteps into vector representations using NdLinear transformations.
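To make the NdLinear idea concrete, here is a minimal sketch (not the repository's actual `ndlinear.py`; the class name and constructor arguments are illustrative). Instead of flattening a multi-dimensional feature tensor and applying one large dense layer, a separate small linear map is applied along each mode, which cuts parameter count substantially:

```python
import torch
import torch.nn as nn

class NdLinearSketch(nn.Module):
    """Illustrative NdLinear-style layer: one independent linear map per
    trailing mode of the input, instead of a single flattened projection."""

    def __init__(self, in_dims, out_dims):
        super().__init__()
        assert len(in_dims) == len(out_dims)
        self.layers = nn.ModuleList(
            nn.Linear(i, o) for i, o in zip(in_dims, out_dims)
        )

    def forward(self, x):
        # x: (batch, d1, d2, ..., dn); transform one mode at a time.
        for k, layer in enumerate(self.layers):
            x = x.movedim(1 + k, -1)   # bring mode k to the last axis
            x = layer(x)               # apply that mode's linear map
            x = x.movedim(-1, 1 + k)   # restore the original axis order
        return x

x = torch.randn(2, 8, 16)             # batch of 2, modes (8, 16)
layer = NdLinearSketch((8, 16), (4, 32))
print(layer(x).shape)                 # torch.Size([2, 4, 32])
```

For the (8, 16) → (4, 32) example above, the factorized form needs 8·4 + 16·32 weights (plus biases), versus 128·128 for an equivalent flattened `nn.Linear`.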

## Files

- `mlp.py`: Implementations of various MLP architectures, including `NdMlp` and `GluMlp`.
- `models_hf.py`: The DiT model architecture, including `DiTBlock` and `FinalLayer`.
- `ndlinear.py`: The `NdLinear` layer, which is central to the model's ability to handle multi-dimensional data efficiently.
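The timestep embedder mentioned above follows the usual DiT recipe: a sinusoidal encoding of the scalar timestep followed by a small MLP. The sketch below shows that standard form (the repository's `NdTimestepEmbedder` would swap the `nn.Linear` layers for NdLinear ones; class and argument names here are illustrative):

```python
import math
import torch
import torch.nn as nn

def sinusoidal_embedding(t, dim, max_period=10000):
    """Map integer timesteps t (shape [B]) to sinusoidal features [B, dim]."""
    half = dim // 2
    freqs = torch.exp(
        -math.log(max_period) * torch.arange(half, dtype=torch.float32) / half
    )
    args = t[:, None].float() * freqs[None]          # [B, half]
    return torch.cat([torch.cos(args), torch.sin(args)], dim=-1)

class TimestepEmbedderSketch(nn.Module):
    def __init__(self, hidden_size, freq_dim=256):
        super().__init__()
        self.freq_dim = freq_dim
        self.mlp = nn.Sequential(
            nn.Linear(freq_dim, hidden_size),
            nn.SiLU(),
            nn.Linear(hidden_size, hidden_size),
        )

    def forward(self, t):
        return self.mlp(sinusoidal_embedding(t, self.freq_dim))

t = torch.tensor([0, 250, 999])
emb = TimestepEmbedderSketch(hidden_size=1152)(t)
print(emb.shape)                                     # torch.Size([3, 1152])
```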

## Installation

To use the DiT model, ensure you have the required dependencies installed:

```bash
pip install torch transformers==4.52.4
```

## License

This project is licensed under the Apache License 2.0 (as declared in the repository metadata). See the LICENSE file for details.