| 11/29/2025 14:21:48 - INFO - __main__ - Distributed environment: DistributedType.NO | |
| Num processes: 1 | |
| Process index: 0 | |
| Local process index: 0 | |
| Device: cuda | |
| Mixed precision type: fp16 | |
| 11/29/2025 14:21:48 - INFO - __main__ - Starting script: train_controlnet.py | |
| 11/29/2025 14:21:50 - INFO - __main__ - Initializing controlnet weights from unet | |
| 11/29/2025 14:21:52 - INFO - __main__ - Training Arguments: | |
| pretrained_model_name_or_path: stable-diffusion-v1-5/stable-diffusion-v1-5 | |
| controlnet_model_name_or_path: None | |
| revision: None | |
| variant: None | |
| trust_remote_code: False | |
| dataset_name_or_path: /home/23132798r/workspace/tmp-smoke/data/controlnet | |
| dataset_config_name: None | |
| image_column: image | |
| conditioning_image_column: conditioning_image | |
| caption_column: text | |
| resolution: 512 | |
| center_crop: False | |
| random_flip: False | |
| validation_ids: [1500, 5500, 8500] | |
| validation_steps: 10000 | |
| output_dir: ./output-controlnet | |
| cache_dir: None | |
| logging_dir: logs | |
| tracker_project_name: controlnet-training | |
| checkpointing_steps: None | |
| checkpoints_total_limit: None | |
| resume_from_checkpoint: None | |
| report_to: tensorboard | |
| seed: 42 | |
| train_batch_size: 16 | |
| num_train_epochs: 3 | |
| max_train_steps: None | |
| gradient_accumulation_steps: 1 | |
| gradient_checkpointing: False | |
| dataloader_num_workers: 8 | |
| noise_offset: 0.1 | |
| prediction_type: None | |
| adam_beta1: 0.9 | |
| adam_beta2: 0.999 | |
| adam_weight_decay: 0.01 | |
| adam_epsilon: 1e-08 | |
| max_grad_norm: 1.0 | |
| learning_rate: 1e-05 | |
| scale_lr: False | |
| lr_scheduler: constant | |
| lr_warmup_steps: 0 | |
| mixed_precision: fp16 | |
| use_8bit_adam: False | |
| allow_tf32: False | |
| enable_xformers_memory_efficient_attention: False | |
| local_rank: -1 | |
| 11/29/2025 14:21:52 - INFO - __main__ - ControlNet Model Config: | |
| FrozenDict({'in_channels': 4, 'conditioning_channels': 3, 'flip_sin_to_cos': True, 'freq_shift': 0, 'down_block_types': ['CrossAttnDownBlock2D', 'CrossAttnDownBlock2D', 'CrossAttnDownBlock2D', 'DownBlock2D'], 'mid_block_type': 'UNetMidBlock2DCrossAttn', 'only_cross_attention': False, 'block_out_channels': [320, 640, 1280, 1280], 'layers_per_block': 2, 'downsample_padding': 1, 'mid_block_scale_factor': 1, 'act_fn': 'silu', 'norm_num_groups': 32, 'norm_eps': 1e-05, 'cross_attention_dim': 768, 'transformer_layers_per_block': 1, 'encoder_hid_dim': None, 'encoder_hid_dim_type': None, 'attention_head_dim': 8, 'num_attention_heads': None, 'use_linear_projection': False, 'class_embed_type': None, 'addition_embed_type': None, 'addition_time_embed_dim': None, 'num_class_embeds': None, 'upcast_attention': False, 'resnet_time_scale_shift': 'default', 'projection_class_embeddings_input_dim': None, 'controlnet_conditioning_channel_order': 'rgb', 'conditioning_embedding_out_channels': (16, 32, 96, 256), 'global_pool_conditions': False, 'addition_embed_type_num_heads': 64, '_use_default_values': ['global_pool_conditions', 'addition_embed_type_num_heads']}) | |
| 11/29/2025 14:21:54 - INFO - __main__ - ============ Training Begins ============ | |
| 11/29/2025 14:21:54 - INFO - __main__ - Num Epochs = 3 | |
| 11/29/2025 14:21:54 - INFO - __main__ - Instantaneous batch size per device = 16 | |
| 11/29/2025 14:21:54 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 16 | |
| 11/29/2025 14:21:54 - INFO - __main__ - Gradient Accumulation steps = 1 | |
| 11/29/2025 14:21:54 - INFO - __main__ - Total optimization steps = 45000 | |
| 11/29/2025 16:56:53 - INFO - __main__ - Running validation... | |
| 11/29/2025 18:15:04 - INFO - accelerate.accelerator - Saving current state to output-controlnet/checkpoint-15000 | |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Optimizer state saved in output-controlnet/checkpoint-15000/optimizer.bin | |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Scheduler state saved in output-controlnet/checkpoint-15000/scheduler.bin | |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Sampler state for dataloader 0 saved in output-controlnet/checkpoint-15000/sampler.bin | |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Sampler state for dataloader 1 saved in output-controlnet/checkpoint-15000/sampler_1.bin | |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Gradient scaler state saved in output-controlnet/checkpoint-15000/scaler.pt | |
| 11/29/2025 18:15:11 - INFO - accelerate.checkpointing - Random states saved in output-controlnet/checkpoint-15000/random_states_0.pkl | |
| 11/29/2025 18:15:11 - INFO - __main__ - Saved state to output-controlnet/checkpoint-15000 | |
| 11/29/2025 18:15:12 - INFO - __main__ - Epoch 0 | Global Step 15000 | |
| 11/29/2025 19:32:33 - INFO - __main__ - Running validation... | |
| 11/29/2025 22:08:22 - INFO - accelerate.accelerator - Saving current state to output-controlnet/checkpoint-30000 | |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Optimizer state saved in output-controlnet/checkpoint-30000/optimizer.bin | |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Scheduler state saved in output-controlnet/checkpoint-30000/scheduler.bin | |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Sampler state for dataloader 0 saved in output-controlnet/checkpoint-30000/sampler.bin | |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Sampler state for dataloader 1 saved in output-controlnet/checkpoint-30000/sampler_1.bin | |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Gradient scaler state saved in output-controlnet/checkpoint-30000/scaler.pt | |
| 11/29/2025 22:08:29 - INFO - accelerate.checkpointing - Random states saved in output-controlnet/checkpoint-30000/random_states_0.pkl | |
| 11/29/2025 22:08:29 - INFO - __main__ - Saved state to output-controlnet/checkpoint-30000 | |
| 11/29/2025 22:08:29 - INFO - __main__ - Running validation... | |
| 11/29/2025 22:08:32 - INFO - __main__ - Epoch 1 | Global Step 30000 | |
| 11/30/2025 00:43:43 - INFO - __main__ - Running validation... | |
| 11/30/2025 02:01:21 - INFO - accelerate.accelerator - Saving current state to output-controlnet/checkpoint-45000 | |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Optimizer state saved in output-controlnet/checkpoint-45000/optimizer.bin | |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Scheduler state saved in output-controlnet/checkpoint-45000/scheduler.bin | |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Sampler state for dataloader 0 saved in output-controlnet/checkpoint-45000/sampler.bin | |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Sampler state for dataloader 1 saved in output-controlnet/checkpoint-45000/sampler_1.bin | |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Gradient scaler state saved in output-controlnet/checkpoint-45000/scaler.pt | |
| 11/30/2025 02:01:28 - INFO - accelerate.checkpointing - Random states saved in output-controlnet/checkpoint-45000/random_states_0.pkl | |
| 11/30/2025 02:01:28 - INFO - __main__ - Saved state to output-controlnet/checkpoint-45000 | |
| 11/30/2025 02:01:28 - INFO - __main__ - Epoch 2 | Global Step 45000 | |
| 11/30/2025 02:01:34 - INFO - __main__ - Running validation... | |
| 11/30/2025 02:01:37 - INFO - __main__ - Finished! | |
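
The figures logged above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of train_controlnet.py: the helper `seconds_between` and all variable names are hypothetical, and the numbers are copied by hand from the log lines. It assumes that the total step count divides evenly across epochs and that the wall-clock span between "Training Begins" and the start of the final checkpoint save is dominated by training steps (it also includes the intermediate validation runs and checkpoint saves).

```python
from datetime import datetime

# Timestamp format used by the log lines above.
FMT = "%m/%d/%Y %H:%M:%S"


def seconds_between(start: str, end: str) -> float:
    """Wall-clock seconds between two log timestamps (hypothetical helper)."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# Values copied from the log.
total_steps = 45000   # "Total optimization steps = 45000"
num_epochs = 3        # "Num Epochs = 3"
batch_size = 16       # "Total train batch size ... = 16"

# Steps per epoch; matches the checkpoints at steps 15000, 30000, 45000.
steps_per_epoch = total_steps // num_epochs

# Approximate dataset size. The exact count depends on how the
# dataloader handles a final partial batch, which the log does not show.
approx_samples = steps_per_epoch * batch_size

# Span from "Training Begins" to the start of the final checkpoint save.
elapsed = seconds_between("11/29/2025 14:21:54", "11/30/2025 02:01:21")
sec_per_step = elapsed / total_steps

print(f"steps per epoch:  {steps_per_epoch}")
print(f"approx. samples:  {approx_samples}")
print(f"elapsed seconds:  {elapsed:.0f}")
print(f"seconds per step: {sec_per_step:.3f}")
```

Under these assumptions the run covers roughly 240,000 samples per epoch at a little under one second per optimization step on the single cuda device.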