Trainable layers impossible to control in Vision Tower

by edmond - opened Apr 19, 2024

Apr 19, 2024

edited Apr 19, 2024

It seems that no matter what setup I choose, whichever layers in the Vision Tower I set to be trainable, like here:
    for name, param in self.llm.named_parameters():
        lname = name.lower()
        # Train norm layers, the first LLM block, and the first two
        # vision encoder layers, but never any out_proj weights.
        param.requires_grad = (
            ('.ln.' in lname
             or 'norm' in lname
             or 'transformer.h.0' in lname
             or 'vision_model.encoder.layers.0.' in lname
             or 'vision_model.encoder.layers.1.' in lname)
            and 'out_proj' not in lname
        )
it changes nothing: the training behavior is the same. Why is that?

(I printed which layers are trainable and everything looks fine. PyTorch clearly registered my changes in the Vision Tower, because the number of trainable weights changed. I also know my approach is right, because my changes to the trainable layers in the LLM part have an actual impact on the training/validation loss and on the weight values.)
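A quick way to narrow this kind of symptom down is to check, after one backward pass, which parameters marked as trainable actually received a gradient. The sketch below uses a hypothetical toy model (`ToyModel`, not Imp's code) in which the "vision" branch runs under `torch.no_grad()`. That wrapper is an assumption made purely for illustration; the thread does not confirm it is what happens in Imp.

```python
import torch
import torch.nn as nn


class ToyModel(nn.Module):
    """Hypothetical stand-in: a 'vision' branch run without autograd, plus a head."""

    def __init__(self):
        super().__init__()
        self.vision = nn.Linear(4, 4)
        self.head = nn.Linear(4, 1)

    def forward(self, x):
        # Assumption for illustration: the vision branch is wrapped in no_grad,
        # so no gradient can flow back into self.vision.
        with torch.no_grad():
            feats = self.vision(x)
        return self.head(feats)


def params_without_grad(model):
    """Names of parameters that are marked trainable but got no gradient."""
    return [
        name
        for name, param in model.named_parameters()
        if param.requires_grad and param.grad is None
    ]


torch.manual_seed(0)
model = ToyModel()
loss = model(torch.randn(2, 4)).sum()
loss.backward()

missing = params_without_grad(model)
print(missing)  # ['vision.weight', 'vision.bias']
```

If the vision tower parameters show up in such a list on the real model, `requires_grad` is set correctly but the forward pass never connects them to the loss, which would match "the number of trainable weights changed, yet training behaves the same".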

Oyoy1235

Apr 24, 2024

We will look into this. Are you trying to fine-tune Imp on your custom datasets?

edmond

Apr 24, 2024

Yes, that is what I am trying to do. It worked, except that the vision tower is resisting me.

Oyoy1235

Apr 24, 2024

Perhaps you can try fine-tuning Imp on your custom dataset with the script we provide in the Imp GitHub repository. We will also upload the new phi2 version to GitHub.

edmond

Apr 24, 2024

Ah, OK, sure. I was hoping I could just stick to my usual Hugging Face model training, but I can try that too when I find some time.

edmond changed discussion status to closed Apr 24, 2024

edmond

May 14, 2024

I figured it out: the vision tower must be extracted from the model to force its weights to be trainable.
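The post does not say how the extraction was done. One possible reading, sketched on a hypothetical toy model (`ToyVLM`, `encode_images`, and the `image_features` argument are all made up here and are not Imp's API), is to take the vision tower submodule out of whatever wrapper blocks its gradients and run it with autograd enabled. The `torch.no_grad()` wrapper is again only an assumption.

```python
import torch
import torch.nn as nn


class ToyVLM(nn.Module):
    """Hypothetical stand-in for a vision-language model, not Imp's code."""

    def __init__(self):
        super().__init__()
        self.vision_tower = nn.Linear(4, 4)
        self.head = nn.Linear(4, 1)

    def encode_images(self, images):
        # Assumed wrapper that blocks gradients into the vision tower.
        with torch.no_grad():
            return self.vision_tower(images)

    def forward(self, images=None, image_features=None):
        # Use precomputed features when given, else the built-in encoder path.
        if image_features is None:
            image_features = self.encode_images(images)
        return self.head(image_features)


torch.manual_seed(0)
model = ToyVLM()
images = torch.randn(2, 4)

# "Extract" the vision tower and call it directly, outside the no_grad wrapper.
vision_tower = model.vision_tower
features = vision_tower(images)

loss = model(image_features=features).sum()
loss.backward()

vision_grad_ok = all(p.grad is not None for p in vision_tower.parameters())
print(vision_grad_ok)  # True
```

In this toy setup the vision parameters only receive gradients once the tower is called outside the wrapper; whether the real model needs the same treatment depends on how its forward pass is written.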
