Project

This guide provides instructions for setting up the environment, training the model, and running inference.

Quick Start

1. Environment Setup

Follow these steps to set up the required environment.

  1. Create and activate a new Conda environment:

    conda create -n creatidesign python=3.10 -y
    conda activate creatidesign
    
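  2. Install PyTorch 2.4.0 with CUDA 12.0 (PyTorch 2.4.0's conda builds are published for CUDA 11.8, 12.1, and 12.4; if the solver cannot find pytorch-cuda=12.0, substitute one of those versions):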

    conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.0 -c pytorch -c nvidia
    
  3. Install the remaining dependencies:

    pip install -r requirements.txt
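
After installation, it can help to confirm that the environment actually sees a CUDA-enabled PyTorch build before starting a long training run. The following is a minimal sketch, not part of the repository; the helper name `describe_torch_env` is hypothetical, and it only uses standard PyTorch attributes.

```python
import importlib.util


def describe_torch_env():
    """Return a small report on the installed PyTorch build, or note its absence."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch  # imported lazily so the check works even when torch is missing

    return {
        "installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False inside the creatidesign environment, the likely causes are a CPU-only build or a driver that does not match the installed CUDA runtime.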
    

2. Dataset Preparation

  1. Download the COCO dataset.
  2. Update the dataset path in the following file: dataloader/unilayout_coco.py
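
Before editing the dataset path, a quick layout check can catch an incomplete download. The sketch below assumes the standard COCO 2017 directory layout (train2017, val2017, and an annotations folder); that layout is an assumption, since the entries that dataloader/unilayout_coco.py actually reads are not listed here. The function name `missing_coco_entries` is hypothetical.

```python
from pathlib import Path

# Assumed COCO 2017 layout; adjust to whatever dataloader/unilayout_coco.py reads.
EXPECTED_ENTRIES = (
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
)


def missing_coco_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected entries that do not exist under root."""
    root = Path(root)
    return [entry for entry in expected if not (root / entry).exists()]


if __name__ == "__main__":
    # Replace "." with the directory you downloaded COCO into.
    missing = missing_coco_entries(".")
    print("All expected entries found." if not missing else f"Missing: {missing}")
```

An empty list means every assumed entry is present under the given root.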

3. Model Preparation

  1. Download the pre-trained model weights.
  2. Update the model path in the training script: train/train_coco.sh

4. Training

To start training the model, run the following command:

bash train/train_coco.sh

5. Testing / Inference

To run inference using a trained model, execute the test script:

python test_coco.py

Configuration Notes

  1. Model Configuration: The main model configuration can be found and modified in train_coco.py.

  2. RMA (Region Mask Attention) Settings: RMA can be enabled in full, enabled partially, or disabled, depending on your available GPU memory:

| Configuration | Settings in train_coco.py | Requirements & Performance |
| --- | --- | --- |
| With RMA (Full) | mask_cross_attention_double_layers: 1, mask_cross_attention_single_layers: 1 | Slower training speed. Requires > 96 GB of GPU memory. |
| With RMA (Partial) | mask_cross_attention_double_layers: 0, mask_cross_attention_single_layers: 1 | Requires > 64 GB of GPU memory (e.g., ~80 GB). |
| Without RMA | mask_cross_attention_double_layers: 0, mask_cross_attention_single_layers: 0 | Faster training speed. Requires < 64 GB of GPU memory. |
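
The memory thresholds above can be turned into a small selection helper. This is a sketch only: the function name `pick_rma_settings` is hypothetical, the thresholds are taken directly from the settings listed above, and how train_coco.py consumes the two flags is not shown here, so the returned values still need to be copied into the configuration by hand.

```python
def pick_rma_settings(gpu_memory_gb):
    """Map available GPU memory (in GB) to the two RMA flags listed above."""
    if gpu_memory_gb > 96:
        double, single = 1, 1  # full RMA
    elif gpu_memory_gb > 64:
        double, single = 0, 1  # partial RMA
    else:
        double, single = 0, 0  # RMA disabled
    return {
        "mask_cross_attention_double_layers": double,
        "mask_cross_attention_single_layers": single,
    }


if __name__ == "__main__":
    print(pick_rma_settings(80))
```

For an 80 GB GPU this selects the partial configuration (double layers 0, single layers 1). The available memory could be read with `torch.cuda.get_device_properties(0).total_memory / 1024**3` instead of being entered manually.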