# Project

This guide provides instructions for setting up the environment, training the model, and running inference.

## Quick Start

### 1. Environment Setup

Follow these steps to set up the required environment.

1. **Create and activate a new Conda environment:**

   ```bash
   conda create -n creatidesign python=3.10 -y
   conda activate creatidesign
   ```

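2. **Install PyTorch with CUDA 12.1:**

   The `pytorch-cuda` value must be one of the builds published for PyTorch 2.4.0 (11.8, 12.1, or 12.4); pick the one your NVIDIA driver supports.

   ```bash
   conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.1 -c pytorch -c nvidia
   ```
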
3. **Install the remaining dependencies:**

   ```bash
   pip install -r requirements.txt
   ```
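
Before moving on, it can help to confirm that the environment resolved as intended. The following is a minimal sketch, not part of the project: the function name `check_environment` is hypothetical, and it only reports which packages are importable and, if PyTorch is present, which CUDA build it was compiled against.

```python
import importlib.util
import sys


def check_environment(required=("torch", "torchvision", "torchaudio")):
    """Report the Python version and which required packages are importable.

    `check_environment` is an illustrative helper, not part of this project.
    """
    report = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "packages": {},
        "cuda_available": None,  # stays None when torch is not installed
    }
    for name in required:
        # find_spec returns None for a top-level package that is not installed
        report["packages"][name] = importlib.util.find_spec(name) is not None

    if report["packages"].get("torch"):
        import torch

        report["torch_version"] = torch.__version__
        report["cuda_build"] = torch.version.cuda  # None for CPU-only builds
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` prints `False` on a GPU machine, the installed `pytorch-cuda` build and the NVIDIA driver are the first things to compare.
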

### 2. Dataset Preparation

1. **Download the COCO dataset.**
2. **Update the dataset path** in `dataloader/unilayout_coco.py`.
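
A wrong dataset path usually surfaces only once training starts, so a quick check of the download can save time. The sketch below assumes the standard COCO 2017 layout (`train2017/`, `val2017/`, `annotations/`); that layout and the helper name `missing_coco_entries` are assumptions, and the entries should be adjusted to whatever `dataloader/unilayout_coco.py` actually reads.

```python
from pathlib import Path

# Assumed layout: the standard COCO 2017 download. This is a guess at what
# the loader needs; change it to match dataloader/unilayout_coco.py.
ASSUMED_COCO_ENTRIES = ("train2017", "val2017", "annotations")


def missing_coco_entries(root, expected=ASSUMED_COCO_ENTRIES):
    """Return the expected sub-paths that do not exist under `root`.

    Hypothetical helper for illustration; an empty list means every
    expected entry was found.
    """
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]
```

Calling `missing_coco_entries("/path/to/coco")` with your own dataset root returns an empty list when all assumed entries are present.
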

### 3. Model Preparation

1. **Download the pre-trained model weights.**
2. **Update the model path** in the training script, `train/train_coco.sh`.

### 4. Training

To start training the model, run the following command:

```bash
bash train/train_coco.sh
```

### 5. Testing / Inference

To run inference using a trained model, execute the test script:

```bash
python test_coco.py
```

---

## Configuration Notes

1. **Model Configuration:** The main model configuration is defined in `train_coco.py` and can be modified there.
2. **RMA (Region Mask Attention) Settings:** RMA can be enabled fully, enabled partially, or disabled, depending on your available GPU memory.

| Configuration | Settings in `train_coco.py` | Requirements & Performance |
|:---|:---|:---|
| **With RMA** (Full) | `mask_cross_attention_double_layers: 1`<br>`mask_cross_attention_single_layers: 1` | **Slower training speed.**<br>Requires more than 96 GB of GPU memory. |
| **With RMA** (Partial) | `mask_cross_attention_double_layers: 0`<br>`mask_cross_attention_single_layers: 1` | Requires more than 64 GB of GPU memory (e.g., an 80 GB GPU). |
| **Without RMA** | `mask_cross_attention_double_layers: 0`<br>`mask_cross_attention_single_layers: 0` | **Faster training speed.**<br>Requires less than 64 GB of GPU memory. |
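
The memory thresholds in the table can be turned into a small selection rule. The sketch below is illustrative only: `choose_rma_settings` is a hypothetical helper, not part of `train_coco.py`, and it treats the boundary values (exactly 64 GB or 96 GB) as belonging to the lighter configuration, which is an assumption the table does not spell out.

```python
def choose_rma_settings(gpu_memory_gb):
    """Map available GPU memory (in GB) to the RMA settings from the table.

    Hypothetical helper: the returned keys mirror the setting names used in
    train_coco.py, but the values still have to be applied there by hand.
    """
    if gpu_memory_gb > 96:
        # Full RMA: slower training, needs more than 96 GB
        double_layers, single_layers = 1, 1
    elif gpu_memory_gb > 64:
        # Partial RMA: needs more than 64 GB (e.g., an 80 GB GPU)
        double_layers, single_layers = 0, 1
    else:
        # Without RMA: faster training, lowest memory requirement
        double_layers, single_layers = 0, 0
    return {
        "mask_cross_attention_double_layers": double_layers,
        "mask_cross_attention_single_layers": single_layers,
    }


if __name__ == "__main__":
    print(choose_rma_settings(80))
```

With PyTorch installed, the input can be taken from `torch.cuda.get_device_properties(0).total_memory / 1024**3`; note that a card marketed as 80 GB reports slightly less than 80 here, which still selects the partial configuration.
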