How to Fine-Tune Vision Layers with LoRA?

#12
by JulioSnchezD - opened

I'm trying to fine-tune only the vision layers of the model using LoRA, but I'm running into an issue where the model doesn't learn (the evaluation loss remains constant). Has anyone successfully implemented this?

What I've Tried:
- LoRA configuration targeting the vision projection layers (`_proj` layers in the vision encoder)
- Various learning rates (1e-5 to 5e-3)
- Verified the vision layers are trainable (`requires_grad=True`)
- Different batch sizes and gradient accumulation steps
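For context, this is roughly the setup I have in mind, as a minimal torch-only sketch of LoRA applied to a single projection layer (base weights frozen, only the low-rank adapter trainable; the module names here are illustrative, not the model's actual layer names):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank adapter."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze base weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_A = nn.Linear(base.in_features, r, bias=False)
        self.lora_B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)  # adapter starts as a no-op
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

# Stand-in for one vision projection layer
proj = LoRALinear(nn.Linear(64, 64))
trainable = [n for n, p in proj.named_parameters() if p.requires_grad]
print(trainable)  # only the lora_A / lora_B weights should be listed
```

In the real model I'm doing the equivalent via PEFT's `target_modules`, but the principle is the same: only the adapter weights should show up as trainable.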

Specific Issues:
- The loss doesn't decrease when only the vision layers are tuned
- The language layers fine-tune normally when targeted
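To narrow this down, I've been checking whether gradients actually reach the vision-side parameters after a backward pass. A sketch of that check (the `"vision"` name filter and the two-branch stand-in model are assumptions about parameter naming, not the real architecture):

```python
import torch
import torch.nn as nn

# Stand-in for a vision-language model: a "vision" branch feeding a language head.
model = nn.Sequential()
model.add_module("vision_proj", nn.Linear(32, 32))
model.add_module("lm_head", nn.Linear(32, 10))

loss = model(torch.randn(4, 32)).sum()
loss.backward()

# Any vision parameter whose grad is None (or has ~0 norm) never receives a
# training signal -- which would explain a flat evaluation loss.
for name, p in model.named_parameters():
    if "vision" in name and p.requires_grad:
        g = 0.0 if p.grad is None else p.grad.norm().item()
        print(f"{name}: grad_norm={g:.4f}")
```

If the vision-layer grad norms come back as `None`/zero while the language-layer ones don't, the problem is gradient flow (e.g. the vision tower being bypassed or detached in the forward pass) rather than the learning rate.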

Any advice or working examples would be greatly appreciated!

JulioSnchezD changed discussion status to closed
