# Project
This guide provides instructions for setting up the environment, training the model, and running inference.
## Quick Start
### 1. Environment Setup
Follow these steps to set up the required environment.
1. **Create and activate a new Conda environment:**
```bash
conda create -n creatidesign python=3.10 -y
conda activate creatidesign
```
2. **Install PyTorch with CUDA 12.1:**
```bash
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.1 -c pytorch -c nvidia
```
3. **Install the remaining dependencies:**
```bash
pip install -r requirements.txt
```
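After the three steps above, a quick sanity check can confirm that the environment looks as expected. The snippet below is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper name, and it only reports what it finds rather than enforcing any version.

```python
import importlib.util
import sys


def check_environment():
    """Report the Python version and, if PyTorch is installed, its CUDA status."""
    report = {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }
    if report["torch_installed"]:
        # Imported lazily so the check also runs before PyTorch is installed.
        import torch

        report["torch"] = torch.__version__
        report["cuda_available"] = torch.cuda.is_available()
        report["cuda_build"] = torch.version.cuda  # None on CPU-only builds
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False` on a GPU machine, the installed PyTorch build and the NVIDIA driver are the first things to check.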
### 2. Dataset Preparation
1. **Download the COCO dataset.**
2. Update the dataset path in the following file:
`dataloader/unilayout_coco.py`
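Before editing the path, it can help to confirm that the download is complete. The sketch below assumes the standard COCO 2017 directory layout (`train2017/`, `val2017/`, `annotations/`); this guide does not state which split or year `dataloader/unilayout_coco.py` expects, so treat `EXPECTED_ENTRIES` and the helper name `missing_coco_entries` as illustrative assumptions and adjust them to match the dataloader.

```python
import sys
from pathlib import Path

# Assumed COCO 2017 layout; change this to whatever the dataloader actually reads.
EXPECTED_ENTRIES = (
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
)


def missing_coco_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in expected if not (root / entry).exists()]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_coco.py /path/to/coco
    missing = missing_coco_entries(sys.argv[1])
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected entries found.")
```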
### 3. Model Preparation
1. **Download the pre-trained model weights.**
2. Update the model path in the training script:
`train/train_coco.sh`
### 4. Training
To start training the model, run the following command:
```bash
bash train/train_coco.sh
```
### 5. Testing / Inference
To run inference using a trained model, execute the test script:
```bash
python test_coco.py
```
---
## Configuration Notes
1. **Model Configuration:**
The main model configuration can be found and modified in `train_coco.py`.
2. **RMA (Region Mask Attention) Settings:**
RMA can be enabled fully, partially, or not at all, depending on your available GPU memory.
| Configuration | Settings in `train_coco.py` | Requirements & Performance |
|:---|:---|:---|
| **With RMA** (Full) | `mask_cross_attention_double_layers: 1`<br>`mask_cross_attention_single_layers: 1` | **Slower training speed.**<br>Requires > 96 GB of GPU memory. |
| **With RMA** (Partial) | `mask_cross_attention_double_layers: 0`<br>`mask_cross_attention_single_layers: 1` | Requires > 64 GB of GPU memory (e.g., ~80 GB). |
| **Without RMA** | `mask_cross_attention_double_layers: 0`<br>`mask_cross_attention_single_layers: 0` | **Faster training speed.**<br>Requires < 64 GB of GPU memory. |
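
The table can be read as a simple threshold rule on GPU memory. The sketch below encodes it in a hypothetical helper, `suggest_rma_settings`, which is not part of the repository. The thresholds are taken directly from the table, and a card with exactly 64 GB is treated conservatively as "Without RMA". Actual memory use will also depend on factors such as batch size and resolution, so the result is a starting point rather than a guarantee.

```python
def suggest_rma_settings(gpu_memory_gb):
    """Map available GPU memory (in GB) to the RMA flags listed in the table."""
    if gpu_memory_gb > 96:
        # Full RMA: masked cross-attention in both double and single layers.
        double_layers, single_layers = 1, 1
    elif gpu_memory_gb > 64:
        # Partial RMA: masked cross-attention in single layers only.
        double_layers, single_layers = 0, 1
    else:
        # Without RMA: fastest training, lowest memory requirement.
        double_layers, single_layers = 0, 0
    return {
        "mask_cross_attention_double_layers": double_layers,
        "mask_cross_attention_single_layers": single_layers,
    }


if __name__ == "__main__":
    for memory_gb in (48, 80, 141):
        print(memory_gb, suggest_rma_settings(memory_gb))
```

With PyTorch installed, `torch.cuda.get_device_properties(0).total_memory` returns the first GPU's total memory in bytes, which can be converted to GB and passed to this helper.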