The names of the vision_tower weights in the safetensors file and in the model do not match.

#2
by xf2022 - opened

```python
import torch
import torch.nn as nn

class LayerScale(nn.Module):

    def __init__(self, dim, init_values=1e-5, inplace=False, force_fp32=False):
        super().__init__()
        self.inplace = inplace
        # The scale parameter is registered under the name `weight`
        self.weight = nn.Parameter(init_values * torch.ones(dim))
        self.force_fp32 = force_fp32
```

In the model and in the model.safetensors.index.json file, the LayerScale weight is named model.vision_tower.vision_tower.blocks.0.ls1.weight.
However, the model-00003-of-00004.safetensors file only contains model.vision_tower.vision_tower.blocks.0.ls1.gamma.
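The mismatch can be confirmed without loading the model. A minimal sketch (the helper below is hypothetical, not part of transformers; the file names are the ones from this discussion): compare the keys that the index file assigns to a shard against the keys the shard actually contains.

```python
import json

def missing_from_shard(index_path, shard_name, shard_keys):
    """Return the keys the index expects in `shard_name` but that are absent
    from `shard_keys` (the keys actually stored in that shard)."""
    with open(index_path) as f:
        weight_map = json.load(f)["weight_map"]
    expected = {k for k, v in weight_map.items() if v == shard_name}
    return sorted(expected - set(shard_keys))
```

The shard's actual keys can be listed with `safetensors.safe_open(...).keys()`; against model-00003-of-00004.safetensors, the `.ls1.weight` entries come back as missing because the shard stores them under `.ls1.gamma`.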


In versions of transformers prior to 4.49.0, the loading code rewrote timm-style state-dict keys, replacing a `gamma` suffix with `weight` (and `beta` with `bias`).
[screenshot: the key-renaming logic in transformers' model loading code]
This logic is no longer present in newer versions, so with current transformers, model.vision_tower.vision_tower.blocks.0.ls1.weight cannot be initialized from the safetensors file.
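As a workaround, the renaming that older transformers versions applied can be reproduced manually before calling `load_state_dict`. A sketch, assuming the checkpoint has already been loaded into a plain dict (e.g. via `safetensors.torch.load_file`):

```python
def remap_layerscale_keys(state_dict):
    """Rename timm-style `gamma`/`beta` key suffixes to the `weight`/`bias`
    names the model expects, mirroring the automatic renaming that
    transformers < 4.49.0 performed during loading."""
    remapped = {}
    for key, value in state_dict.items():
        if key.endswith(".gamma"):
            key = key[: -len(".gamma")] + ".weight"
        elif key.endswith(".beta"):
            key = key[: -len(".beta")] + ".bias"
        remapped[key] = value
    return remapped
```

With this remapping applied, `model.vision_tower.vision_tower.blocks.0.ls1.gamma` from the shard becomes the `...ls1.weight` key the model declares.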
