Project
This guide provides instructions for setting up the environment, training the model, and running inference.
Quick Start
1. Environment Setup
Follow these steps to set up the required environment.
Create and activate a new Conda environment:
conda create -n creatidesign python=3.10 -y
conda activate creatidesign
Install PyTorch with CUDA 12.0:
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.0 -c pytorch -c nvidia
Install the remaining dependencies:
pip install -r requirements.txt
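After installation, it can help to confirm that PyTorch imports and sees a GPU before moving on. The following is a minimal sketch, not part of the project: the function name `describe_torch_env` is hypothetical, and it only uses `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch_env():
    """Return a small report about the installed PyTorch build, if any.

    Hypothetical helper for illustration; it is not shipped with this project.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    # Avoid an ImportError when PyTorch is not installed in this environment.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version the wheel was built against (None for CPU-only builds).
    report["cuda_build"] = torch.version.cuda
    # True only if a usable GPU and driver are visible at runtime.
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False` inside the `creatidesign` environment, the driver or the installed CUDA build is the first thing to check before training.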
2. Dataset Preparation
- Download the COCO dataset.
- Update the dataset path in the following file:
dataloader/unilayout_coco.py
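Before editing the dataset path, a quick check that the download is complete can save a failed training run. The sketch below assumes the standard COCO 2017 directory layout (`train2017`, `val2017`, and `annotations/instances_*.json`); the entries this project's dataloader actually reads may differ, so treat `EXPECTED_ENTRIES` and the helper name `missing_coco_entries` as illustrative assumptions.

```python
from pathlib import Path

# Assumption: standard COCO 2017 layout. Adjust to match what
# dataloader/unilayout_coco.py actually expects.
EXPECTED_ENTRIES = (
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
)


def missing_coco_entries(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    import sys

    # Usage: python check_coco.py /path/to/coco
    dataset_root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_coco_entries(dataset_root)
    if missing:
        print("Missing entries:", ", ".join(missing))
    else:
        print("All expected entries found under", dataset_root)
```

An empty list means every assumed entry is present, and the same root directory is the one to set in the dataloader file.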
3. Model Preparation
- Download the pre-trained model weights.
- Update the model path in the training script:
train/train_coco.sh
4. Training
To start training the model, run the following command:
bash train/train_coco.sh
5. Testing / Inference
To run inference using a trained model, execute the test script:
python test_coco.py
Configuration Notes
Model Configuration: The main model configuration can be found and modified in train_coco.py.
RMA (Region Mask Attention) Settings: You can enable or disable RMA based on your available GPU memory.
| Configuration | Settings in train_coco.py | Requirements & Performance |
|---|---|---|
| With RMA (Full) | mask_cross_attention_double_layers: 1<br>mask_cross_attention_single_layers: 1 | Slower training speed. Requires > 96 GB of GPU memory. |
| With RMA (Partial) | mask_cross_attention_double_layers: 0<br>mask_cross_attention_single_layers: 1 | Requires > 64 GB of GPU memory (e.g., ~80 GB). |
| Without RMA | mask_cross_attention_double_layers: 0<br>mask_cross_attention_single_layers: 0 | Faster training speed. Requires < 64 GB of GPU memory. |
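The memory thresholds in the table can be read as a simple decision rule. The sketch below encodes that rule; the function name `pick_rma_settings` is hypothetical, and treating exactly 64 GB or 96 GB as the lower tier is an assumption, since the table only states strict inequalities.

```python
def pick_rma_settings(gpu_memory_gb):
    """Map available GPU memory (in GB) to the RMA settings from the table.

    Hypothetical helper: boundary values (exactly 64 or 96 GB) fall into the
    lower tier, which is a conservative assumption rather than a documented rule.
    """
    if gpu_memory_gb > 96:
        # With RMA (Full): slower training, highest memory use.
        double_layers, single_layers = 1, 1
    elif gpu_memory_gb > 64:
        # With RMA (Partial): fits on ~80 GB cards.
        double_layers, single_layers = 0, 1
    else:
        # Without RMA: fastest training, lowest memory use.
        double_layers, single_layers = 0, 0
    return {
        "mask_cross_attention_double_layers": double_layers,
        "mask_cross_attention_single_layers": single_layers,
    }


if __name__ == "__main__":
    for memory in (48, 80, 141):
        print(memory, "GB ->", pick_rma_settings(memory))
```

The returned values still have to be copied into train_coco.py by hand; the sketch only makes the choice explicit.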