	# Project

	This guide provides instructions for setting up the environment, training the model, and running inference.

	## Quick Start

	### 1. Environment Setup

	Follow these steps to set up the required environment.

	1. Create and activate a new Conda environment:
	```bash
	conda create -n creatidesign python=3.10 -y
	conda activate creatidesign
	```

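	2. Install PyTorch 2.4.0 with CUDA support. The `pytorch-cuda` package for this release is published for CUDA 11.8, 12.1, and 12.4 (there is no 12.0 build), so pick the version that matches your driver:
	```bash
	conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.1 -c pytorch -c nvidia
	```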

	3. Install the remaining dependencies:
	```bash
	pip install -r requirements.txt
	```
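After installation, it can help to confirm that PyTorch imports and sees a GPU before starting a long training run. The following is a minimal sketch and is not part of the repository: `describe_torch_install` is a hypothetical helper name, and it relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch_install():
    """Return a small summary of the local PyTorch install (hypothetical helper)."""
    info = {
        "installed": False,
        "torch": None,
        "cuda_build": None,
        "gpu_available": False,
    }
    # find_spec avoids an ImportError when torch is missing from the environment.
    if importlib.util.find_spec("torch") is None:
        return info

    import torch

    info["installed"] = True
    info["torch"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["gpu_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_install().items():
        print(f"{key}: {value}")
```

If `gpu_available` is `False` or `cuda_build` is `None`, the CUDA build probably does not match the installed driver, and training on GPU will not work until that is resolved.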

	### 2. Dataset Preparation

	1. Download the COCO dataset.
	2. Update the dataset path in the following file:
	`dataloader/unilayout_coco.py`
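Before editing the dataset path, a quick check of the download can catch an incomplete extraction. The sketch below assumes the standard COCO 2017 directory layout (`train2017`, `val2017`, and `annotations/instances_*.json`); this README does not state which COCO release or which annotation files the dataloader expects, so treat `EXPECTED_ENTRIES` as an assumption to adjust. `missing_coco_entries` is a hypothetical helper, not a function from the repository.

```python
from pathlib import Path

# Assumed COCO 2017 layout; adjust if your download is organized differently.
EXPECTED_ENTRIES = (
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
)


def missing_coco_entries(root):
    """Return the assumed entries that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    # Replace with the path you set in dataloader/unilayout_coco.py.
    print(missing_coco_entries("/path/to/coco"))
```

An empty list means every assumed entry was found under the given root.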

	### 3. Model Preparation

	1. Download the pre-trained model weights.
	2. Update the model path in the training script:
	`train/train_coco.sh`

	### 4. Training

	To start training the model, run the following command:

	```bash
	bash train/train_coco.sh
	```

	### 5. Testing / Inference

	To run inference using a trained model, execute the test script:

	```bash
	python test_coco.py
	```

	---

	## Configuration Notes

	1. Model Configuration:
	The main model configuration can be found and modified in `train_coco.py`.

	2. RMA (Region Mask Attention) Settings:
	You can enable or disable RMA based on your available GPU memory.

	| Configuration | Settings in `train_coco.py` | Requirements & Performance |
	|:---|:---|:---|
	| With RMA (Full) | `mask_cross_attention_double_layers: 1`<br>`mask_cross_attention_single_layers: 1` | Slower training speed.<br>Requires > 96 GB of GPU memory. |
	| With RMA (Partial) | `mask_cross_attention_double_layers: 0`<br>`mask_cross_attention_single_layers: 1` | Requires > 64 GB of GPU memory (e.g., ~80 GB). |
	| Without RMA | `mask_cross_attention_double_layers: 0`<br>`mask_cross_attention_single_layers: 0` | Faster training speed.<br>Requires < 64 GB of GPU memory. |
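The thresholds in the table can be turned into a small lookup. This is a sketch under stated assumptions: `suggest_rma_settings` is a hypothetical helper, not part of the repository, and it treats exactly 64 GB as "Without RMA" because the table leaves that boundary open. It returns the two flag values to copy into `train_coco.py` by hand.

```python
def suggest_rma_settings(gpu_memory_gb):
    """Map available GPU memory (in GB) to the RMA flags from the table."""
    if gpu_memory_gb > 96:
        double, single = 1, 1  # With RMA (Full)
    elif gpu_memory_gb > 64:
        double, single = 0, 1  # With RMA (Partial)
    else:
        double, single = 0, 0  # Without RMA (assumed for exactly 64 GB too)
    return {
        "mask_cross_attention_double_layers": double,
        "mask_cross_attention_single_layers": single,
    }


if __name__ == "__main__":
    for memory_gb in (48, 80, 141):
        print(memory_gb, suggest_rma_settings(memory_gb))
```

For example, an 80 GB card falls into the partial configuration: double layers off, single layers on.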