Updating the `hidden_states` computation on line 869

Browse files

This update is meant to solve the following error, which occurred when trying to fine-tune the model using PEFT:
RuntimeError Traceback (most recent call last)
in <cell line: 2>()
1 trainer.deprecated=True
----> 2 trainer.train()
21 frames
~/.cache/huggingface/modules/transformers_modules/inception-mbzuai/jais-13b-chat/96080d1c163804428c4792b8618c2d39661e9d7f/modeling_jais.py in forward(self, input_ids, past_key_values, attention_mask, token_type_ids, position_ids, head_mask, inputs_embeds, encoder_hidden_states, encoder_attention_mask, use_cache, output_attentions, output_hidden_states, return_dict)
867 else:
868 hidden_states = inputs_embeds
→ 869 hidden_states *= torch.tensor(
870 float(self.embeddings_scale), dtype=hidden_states.dtype, device=hidden_states.device
871 )
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
The in-place multiplication (`*=`) on `hidden_states` fails under PEFT because, when the embedding layer is frozen, `inputs_embeds` can be a leaf tensor with `requires_grad=True`, and autograd forbids in-place modification of such leaves. The fix replaces the in-place scaling with an out-of-place multiplication that produces a new tensor.

modeling_jais.py  (+5 −3)

```diff
@@ -866,9 +866,11 @@ class JAISModel(JAISPreTrainedModel):
             hidden_states = inputs_embeds + position_embeds
         else:
             hidden_states = inputs_embeds
-        hidden_states *= torch.tensor(
-            float(self.embeddings_scale), dtype=hidden_states.dtype, device=hidden_states.device
-        )
+        # hidden_states *= torch.tensor(
+        #     float(self.embeddings_scale), dtype=hidden_states.dtype, device=hidden_states.device
+        # )
+        scale_factor_hidden = torch.tensor(float(self.embeddings_scale), dtype=hidden_states.dtype, device=hidden_states.device)
+        hidden_states = hidden_states * scale_factor_hidden
 
         if token_type_ids is not None:
             token_type_embeds = self.wte(token_type_ids)
```