How to Fine-Tune Vision Layers with LoRA?

#12
by JulioSnchezD - opened

I'm trying to fine-tune only the vision layers of the model using LoRA, but I'm running into an issue where the model doesn't learn (the evaluation loss remains constant). Has anyone successfully implemented this?

What I've Tried:
- LoRA configuration targeting the vision projection layers (`_proj` layers in the vision encoder)
- Various learning rates (1e-5 to 5e-3)
- Verified the vision layers are trainable (`requires_grad=True`)
- Different batch sizes and gradient accumulation steps
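For context, this is roughly the setup I have in mind, as a minimal torch-only sketch of LoRA applied to a single projection layer (base weights frozen, only the low-rank adapter trainable; the module names here are illustrative, not the model's actual layer names):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen linear layer with a trainable low-rank adapter."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze base weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_A = nn.Linear(base.in_features, r, bias=False)
        self.lora_B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)  # adapter starts as a no-op
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

# Stand-in for one vision projection layer
proj = LoRALinear(nn.Linear(64, 64))
trainable = [n for n, p in proj.named_parameters() if p.requires_grad]
print(trainable)  # only the lora_A / lora_B weights should be listed
```

In the real model I'm doing the equivalent via PEFT's `target_modules`, but the principle is the same: only the adapter weights should show up as trainable.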

Specific Issues:
- The loss doesn't decrease when only the vision layers are tuned
- The language layers fine-tune normally when targeted
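To narrow this down, I've been checking whether gradients actually reach the vision-side parameters after a backward pass. A sketch of that check (the `"vision"` name filter and the two-branch stand-in model are assumptions about parameter naming, not the real architecture):

```python
import torch
import torch.nn as nn

# Stand-in for a vision-language model: a "vision" branch feeding a language head.
model = nn.Sequential()
model.add_module("vision_proj", nn.Linear(32, 32))
model.add_module("lm_head", nn.Linear(32, 10))

loss = model(torch.randn(4, 32)).sum()
loss.backward()

# Any vision parameter whose grad is None (or has ~0 norm) never receives a
# training signal -- which would explain a flat evaluation loss.
for name, p in model.named_parameters():
    if "vision" in name and p.requires_grad:
        g = 0.0 if p.grad is None else p.grad.norm().item()
        print(f"{name}: grad_norm={g:.4f}")
```

If the vision-layer grad norms come back as `None`/zero while the language-layer ones don't, the problem is gradient flow (e.g. the vision tower being bypassed or detached in the forward pass) rather than the learning rate.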

Any advice or working examples would be greatly appreciated!

JulioSnchezD changed discussion status to closed
