# Project

This guide provides instructions for setting up the environment, training the model, and running inference.

## Quick Start

### 1. Environment Setup

Follow these steps to set up the required environment.

1. **Create and activate a new Conda environment:**

   ```bash
   conda create -n creatidesign python=3.10 -y
   conda activate creatidesign
   ```

2. **Install PyTorch with CUDA 12.0:**

   ```bash
   conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.0 -c pytorch -c nvidia
   ```

3. **Install the remaining dependencies:**

   ```bash
   pip install -r requirements.txt
   ```

### 2. Dataset Preparation

1. **Download the COCO dataset.**
2. Update the dataset path in the following file: `dataloader/unilayout_coco.py`

### 3. Model Preparation

1. **Download the pre-trained model weights.**
2. Update the model path in the training script: `train/train_coco.sh`

### 4. Training

To start training the model, run the following command:

```bash
bash train/train_coco.sh
```

### 5. Testing / Inference

To run inference using a trained model, execute the test script:

```bash
python test_coco.py
```

---

## Configuration Notes

1. **Model Configuration:** The main model configuration can be found and modified in `train_coco.py`.
2. **RMA (Region Mask Attention) Settings:** You can enable or disable RMA based on your available GPU memory.

| Configuration | Settings in `train_coco.py` | Requirements & Performance |
|:---|:---|:---|
| **With RMA** (Full) | `mask_cross_attention_double_layers: 1`<br>`mask_cross_attention_single_layers: 1` | **Slower training speed.**<br>Requires > 96G of GPU memory. |
| **With RMA** (Partial) | `mask_cross_attention_double_layers: 0`<br>`mask_cross_attention_single_layers: 1` | Requires > 64G of GPU memory (e.g., ~80G). |
| **Without RMA** | `mask_cross_attention_double_layers: 0`<br>`mask_cross_attention_single_layers: 0` | **Faster training speed.**<br>Requires < 64G of GPU memory. |
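
The memory thresholds in the table can be read as a simple decision rule. The sketch below is illustrative only: `choose_rma_settings` is a hypothetical helper, not part of this repository, and it only returns the two flag values named in the table. How `train_coco.py` actually stores or reads these settings is not shown here, so apply the result by hand. The table does not say what to do at exactly 64G or 96G; the sketch assumes the more conservative option at each boundary.

```python
def choose_rma_settings(gpu_memory_gb: float) -> dict:
    """Map available GPU memory (in GB) to the RMA flags listed in the table.

    Hypothetical helper for illustration; thresholds are taken from the table,
    and the boundary handling (exactly 64 or 96) is an assumption.
    """
    if gpu_memory_gb > 96:
        # With RMA (Full): slower training, highest memory use.
        return {
            "mask_cross_attention_double_layers": 1,
            "mask_cross_attention_single_layers": 1,
        }
    if gpu_memory_gb > 64:
        # With RMA (Partial): fits on e.g. an ~80G card.
        return {
            "mask_cross_attention_double_layers": 0,
            "mask_cross_attention_single_layers": 1,
        }
    # Without RMA: faster training, lowest memory use.
    return {
        "mask_cross_attention_double_layers": 0,
        "mask_cross_attention_single_layers": 0,
    }


if __name__ == "__main__":
    # Print the suggested flags for a few example memory sizes.
    for memory_gb in (48, 80, 128):
        print(memory_gb, choose_rma_settings(memory_gb))
```

If PyTorch is installed, the memory of the first GPU can be obtained with `torch.cuda.get_device_properties(0).total_memory / 1024**3` and passed to the helper.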