11/29/2025 14:21:48 - INFO - __main__ - Distributed environment: DistributedType.NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: fp16
11/29/2025 14:21:48 - INFO - __main__ - Starting script: train_controlnet.py
11/29/2025 14:21:50 - INFO - __main__ - Initializing controlnet weights from unet
11/29/2025 14:21:52 - INFO - __main__ - Training Arguments:
pretrained_model_name_or_path: stable-diffusion-v1-5/stable-diffusion-v1-5
controlnet_model_name_or_path: None
revision: None
variant: None
trust_remote_code: False
dataset_name_or_path: /home/23132798r/workspace/tmp-smoke/data/controlnet
dataset_config_name: None
image_column: image
conditioning_image_column: conditioning_image
caption_column: text
resolution: 512
center_crop: False
random_flip: False
validation_ids: [1500, 5500, 8500]
validation_steps: 10000
output_dir: ./output-controlnet
cache_dir: None
logging_dir: logs
tracker_project_name: controlnet-training
checkpointing_steps: None
checkpoints_total_limit: None
resume_from_checkpoint: None
report_to: tensorboard
seed: 42
train_batch_size: 16
num_train_epochs: 3
max_train_steps: None
gradient_accumulation_steps: 1
gradient_checkpointing: False
dataloader_num_workers: 8
noise_offset: 0.1
prediction_type: None
adam_beta1: 0.9
adam_beta2: 0.999
adam_weight_decay: 0.01
adam_epsilon: 1e-08
max_grad_norm: 1.0
learning_rate: 1e-05
scale_lr: False
lr_scheduler: constant
lr_warmup_steps: 0
mixed_precision: fp16
use_8bit_adam: False
allow_tf32: False
enable_xformers_memory_efficient_attention: False
local_rank: -1
11/29/2025 14:21:52 - INFO - __main__ - ControlNet Model Config:
FrozenDict({'in_channels': 4, 'conditioning_channels': 3, 'flip_sin_to_cos': True, 'freq_shift': 0, 'down_block_types': ['CrossAttnDownBlock2D', 'CrossAttnDownBlock2D', 'CrossAttnDownBlock2D', 'DownBlock2D'], 'mid_block_type': 'UNetMidBlock2DCrossAttn', 'only_cross_attention': False, 'block_out_channels': [320, 640, 1280, 1280], 'layers_per_block': 2, 'downsample_padding': 1, 'mid_block_scale_factor': 1, 'act_fn': 'silu', 'norm_num_groups': 32, 'norm_eps': 1e-05, 'cross_attention_dim': 768, 'transformer_layers_per_block': 1, 'encoder_hid_dim': None, 'encoder_hid_dim_type': None, 'attention_head_dim': 8, 'num_attention_heads': None, 'use_linear_projection': False, 'class_embed_type': None, 'addition_embed_type': None, 'addition_time_embed_dim': None, 'num_class_embeds': None, 'upcast_attention': False, 'resnet_time_scale_shift': 'default', 'projection_class_embeddings_input_dim': None, 'controlnet_conditioning_channel_order': 'rgb', 'conditioning_embedding_out_channels': (16, 32, 96, 256), 'global_pool_conditions': False, 'addition_embed_type_num_heads': 64, '_use_default_values': ['global_pool_conditions', 'addition_embed_type_num_heads']})
11/29/2025 14:21:54 - INFO - __main__ - ============ Training Begins ============
11/29/2025 14:21:54 - INFO - __main__ - Num Epochs = 3
11/29/2025 14:21:54 - INFO - __main__ - Instantaneous batch size per device = 16
11/29/2025 14:21:54 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 16
11/29/2025 14:21:54 - INFO - __main__ - Gradient Accumulation steps = 1
11/29/2025 14:21:54 - INFO - __main__ - Total optimization steps = 45000
11/29/2025 16:56:53 - INFO - __main__ - Running validation...
11/29/2025 18:15:04 - INFO - accelerate.accelerator - Saving current state to output-controlnet/checkpoint-15000
11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Optimizer state saved in output-controlnet/checkpoint-15000/optimizer.bin
11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Scheduler state saved in output-controlnet/checkpoint-15000/scheduler.bin
11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Sampler state for dataloader 0 saved in output-controlnet/checkpoint-15000/sampler.bin
11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Sampler state for dataloader 1 saved in output-controlnet/checkpoint-15000/sampler_1.bin
11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Gradient scaler state saved in output-controlnet/checkpoint-15000/scaler.pt
11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Random states saved in output-controlnet/checkpoint-15000/random_states_0.pkl
11/29/2025 18:15:11 - INFO - __main__ - Saved state to output-controlnet/checkpoint-15000
11/29/2025 18:15:12 - INFO - __main__ - Epoch 0 | Global Step 15000
11/29/2025 19:32:33 - INFO - __main__ - Running validation...
11/29/2025 22:08:22 - INFO - accelerate.accelerator - Saving current state to output-controlnet/checkpoint-30000
11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Optimizer state saved in output-controlnet/checkpoint-30000/optimizer.bin
11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Scheduler state saved in output-controlnet/checkpoint-30000/scheduler.bin
11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Sampler state for dataloader 0 saved in output-controlnet/checkpoint-30000/sampler.bin
11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Sampler state for dataloader 1 saved in output-controlnet/checkpoint-30000/sampler_1.bin
11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Gradient scaler state saved in output-controlnet/checkpoint-30000/scaler.pt
11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Random states saved in output-controlnet/checkpoint-30000/random_states_0.pkl
11/29/2025 22:08:29 - INFO - __main__ - Saved state to output-controlnet/checkpoint-30000
11/29/2025 22:08:29 - INFO - __main__ - Running validation...
11/29/2025 22:08:32 - INFO - __main__ - Epoch 1 | Global Step 30000
11/30/2025 00:43:43 - INFO - __main__ - Running validation...
11/30/2025 02:01:21 - INFO - accelerate.accelerator - Saving current state to output-controlnet/checkpoint-45000
11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Optimizer state saved in output-controlnet/checkpoint-45000/optimizer.bin
11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Scheduler state saved in output-controlnet/checkpoint-45000/scheduler.bin
11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Sampler state for dataloader 0 saved in output-controlnet/checkpoint-45000/sampler.bin
11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Sampler state for dataloader 1 saved in output-controlnet/checkpoint-45000/sampler_1.bin
11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Gradient scaler state saved in output-controlnet/checkpoint-45000/scaler.pt
11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Random states saved in output-controlnet/checkpoint-45000/random_states_0.pkl
11/30/2025 02:01:28 - INFO - __main__ - Saved state to output-controlnet/checkpoint-45000
11/30/2025 02:01:28 - INFO - __main__ - Epoch 2 | Global Step 45000
11/30/2025 02:01:34 - INFO - __main__ - Running validation...
11/30/2025 02:01:37 - INFO - __main__ - Finished!