# Project

This guide provides instructions for setting up the environment, training the model, and running inference.

## Quick Start

### 1. Environment Setup

Follow these steps to set up the required environment.

1.  **Create and activate a new Conda environment:**
    ```bash
    conda create -n creatidesign python=3.10 -y
    conda activate creatidesign
    ```

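2.  **Install PyTorch with CUDA 12.1:**
    ```bash
    # The pytorch channel publishes PyTorch 2.4.0 builds for pytorch-cuda 11.8, 12.1, and 12.4 (there is no 12.0 build).
    conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.1 -c pytorch -c nvidia
    ```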

3.  **Install the remaining dependencies:**
    ```bash
    pip install -r requirements.txt
    ```
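
Before moving on, it can help to confirm that the environment actually sees PyTorch and a GPU. The snippet below is a minimal sketch, not part of this repository: the helper name `describe_torch_install` is hypothetical, and it relies only on `torch.__version__` and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch_install():
    """Summarise whether PyTorch is importable and whether it can see a GPU."""
    # Hypothetical helper for illustration; it is not shipped with this project.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "version": None, "cuda_available": False}

    import torch  # imported lazily so the script also runs without PyTorch

    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_install())
```

If `cuda_available` comes back `False` on a GPU machine, the installed build is probably CPU-only or does not match the local driver.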

### 2. Dataset Preparation

1.  **Download the COCO dataset.**
2.  Update the dataset path in the following file:
    `dataloader/unilayout_coco.py`
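
A quick sanity check of the download can save a failed training launch. The sketch below assumes the standard COCO 2017 directory layout (`train2017`, `val2017`, and `annotations/instances_*.json`); the loader in this repository may expect a different split or annotation file, so treat `EXPECTED_ENTRIES` and the helper name `missing_coco_entries` as assumptions to adjust.

```python
from pathlib import Path

# Assumed layout of the standard COCO 2017 release; adjust if yours differs.
EXPECTED_ENTRIES = (
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
)


def missing_coco_entries(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

Calling `missing_coco_entries("/path/to/coco")` returns an empty list when everything is in place, and otherwise the relative paths that still need to be downloaded or moved.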

### 3. Model Preparation

1.  **Download the pre-trained model weights.**
2.  Update the model path in the training script:
    `train/train_coco.sh`

### 4. Training

To start training the model, run the following command:

```bash
bash train/train_coco.sh
```

### 5. Testing / Inference

To run inference using a trained model, execute the test script:

```bash
python test_coco.py
```

---

## Configuration Notes

1.  **Model Configuration:**
    The main model configuration can be found and modified in `train_coco.py`.

2.  **RMA (Region Mask Attention) Settings:**
    RMA can be enabled fully, enabled partially, or disabled, depending on how much GPU memory is available. More RMA layers cost more memory and training time.

| Configuration | Settings in `train_coco.py` | Requirements & Performance |
|:---|:---|:---|
| **With RMA** (Full) | `mask_cross_attention_double_layers: 1`<br>`mask_cross_attention_single_layers: 1` | **Slower training speed.**<br>Requires > 96 GB of GPU memory. |
| **With RMA** (Partial) | `mask_cross_attention_double_layers: 0`<br>`mask_cross_attention_single_layers: 1` | Requires > 64 GB of GPU memory (e.g., ~80 GB). |
| **Without RMA** | `mask_cross_attention_double_layers: 0`<br>`mask_cross_attention_single_layers: 0` | **Faster training speed.**<br>Requires < 64 GB of GPU memory. |
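
The table can be read as a simple lookup from GPU memory to the two RMA flags. The following is a minimal sketch of that mapping: the function name `suggest_rma_settings` is hypothetical, and the handling of the exact boundary values (64 and 96) is an assumption, since the table only gives strict inequalities.

```python
def suggest_rma_settings(gpu_memory_gb):
    """Map available GPU memory (in GB) to the RMA flags listed in the table."""
    # Thresholds come from the table; values exactly at 64 or 96 fall into the
    # lower tier here, which is an assumption rather than documented behaviour.
    if gpu_memory_gb > 96:
        double_layers, single_layers = 1, 1  # With RMA (Full)
    elif gpu_memory_gb > 64:
        double_layers, single_layers = 0, 1  # With RMA (Partial)
    else:
        double_layers, single_layers = 0, 0  # Without RMA
    return {
        "mask_cross_attention_double_layers": double_layers,
        "mask_cross_attention_single_layers": single_layers,
    }
```

For example, `suggest_rma_settings(80)` selects the partial configuration; the returned values still need to be copied into `train_coco.py` by hand.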