Finetune loss never decreases

#36
by fahadh4ilyas - opened

I tried to finetune this model using the diffusers repo, essentially following the method from this blog post: Diffusers welcomes FLUX-2. But the loss never decreased, even after 5 epochs. Is there a step I'm missing?

Here is the loss graph:

(loss graph image)

Here are the parameters:

#! /bin/bash

MLFLOW_TRACKING_URI=file:///nvme/fahadh/mlruns MLFLOW_EXPERIMENT_NAME=flux2-train accelerate launch \
  --config_file /nvme/fahadh/train-flux/config.yaml \
  /nvme/fahadh/train-flux/diffusers/examples/dreambooth/train_dreambooth_lora_flux2.py \
  --pretrained_model_name_or_path="/nvme/fahadh/models/FLUX.2-dev"  \
  --mixed_precision="bf16" \
  --gradient_checkpointing \
  --cache_latents \
  --offload \
  --remote_text_encoder \
  --caption_column="caption" \
  --dataset_name="/nvme/fahadh/datasets/image-dataset" \
  --output_dir="/nvme/fahadh/train-flux/flux2_LoRA-final" \
  --instance_prompt="" \
  --train_batch_size=2 \
  --guidance_scale=1 \
  --gradient_accumulation_steps=1 \
  --optimizer="prodigy" \
  --learning_rate=1.0 \
  --report_to="mlflow" \
  --lr_scheduler="cosine" \
  --lr_warmup_steps=0 \
  --checkpointing_steps=250 \
  --checkpoints_total_limit=2 \
  --num_train_epochs=5 \
  --rank=32 \
  --lora_alpha=32 \
  --lora_layers="attn.to_k,attn.to_q,attn.to_v,attn.to_out.0,attn.add_k_proj,attn.add_q_proj,attn.add_v_proj,attn.to_add_out,ff.net.0.proj,ff.net.2,ff_context.net.0.proj,ff_context.net.2" \
  --aspect_ratio_buckets="768,1376;1024,1024;1024,1536;1200,896;1376,768;1536,1024" \
  --seed="0" \
  --torch_clear_cache_step=50 \
  --bnb_quantization_config_path="/nvme/fahadh/train-flux/4bit_config.json"
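For what it's worth, flow-matching training losses are very noisy step to step (the timestep and noise are resampled every batch), so a raw loss curve can look flat even when training is progressing; a smoothed view is easier to judge. A minimal sketch of an exponential-moving-average smoother (pure Python; `ema_smooth` and the sample values are illustrative, not part of the training script):

```python
def ema_smooth(losses, beta=0.9):
    """Exponentially smooth a sequence of per-step loss values.

    beta close to 1.0 gives heavier smoothing; the average is
    initialized to the first value to avoid startup bias.
    """
    smoothed = []
    avg = None
    for x in losses:
        avg = x if avg is None else beta * avg + (1.0 - beta) * x
        smoothed.append(avg)
    return smoothed


# Hypothetical noisy losses: the raw values bounce around,
# but the smoothed tail sits below the smoothed head.
raw = [1.0, 0.5, 1.2, 0.4, 1.1, 0.3]
smooth = ema_smooth(raw, beta=0.5)
print(smooth[0], smooth[-1])
```

Plotting the smoothed series (or increasing MLflow's chart smoothing) before concluding the loss is flat can save a lot of debugging time.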
