
Not working with transformers 5.0

#1
by Aexyno - opened

Hi, this is not working with transformers 5.0; it gives an index-out-of-range error. I tried to solve it but was unable to.

Thank you for mentioning it. Once Transformers 5.X has been out a little longer, I'll make this update. That will effectively make this model work only with 5.0+.

Thank you for your quick reply.

I am pasting minimal code and the error anyway, for easier debugging. It works with transformers 4.57.6 but not with 5.0.0, 5.1.0, or 5.2.0. If you can guide me, I can raise a PR or paste the solution here.

--------------MINIMAL CODE:--------------

from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("neuml/bert-hash-nano", trust_remote_code=True)
model = AutoModel.from_pretrained("neuml/bert-hash-nano", trust_remote_code=True)
model.eval()

inputs = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt", padding=True, truncation=True, max_length=128)
with torch.no_grad():
    outputs = model(**inputs)

print(f"Last hidden state shape: {outputs.last_hidden_state.shape}")

---------ERROR---------

Notes:

  • MISSING: those params were newly initialized because they were missing from the checkpoint. Consider training on a downstream task.
Traceback (most recent call last):
  File "/Users/ai/from_scratch/distil/inference_bert_hash.py", line 10, in <module>
    outputs = model(**inputs)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/.cache/huggingface/modules/transformers_modules/neuml/bert_hyphen_hash_hyphen_nano/606865e08bd85cb37effbd3df2041d712307c2be/modeling_bert_hash.py", line 229, in forward
    embedding_output = self.embeddings(
        input_ids=input_ids,
        ...<3 lines>...
        past_key_values_length=past_key_values_length,
    )
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/.cache/huggingface/modules/transformers_modules/neuml/bert_hyphen_hash_hyphen_nano/606865e08bd85cb37effbd3df2041d712307c2be/modeling_bert_hash.py", line 110, in forward
    token_type_embeddings = self.token_type_embeddings(token_type_ids)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1776, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/torch/nn/modules/module.py", line 1787, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/torch/nn/modules/sparse.py", line 191, in forward
    return F.embedding(
           ~~~~~~~~~~~^
        input,
        ^^^^^^
        ...<5 lines>...
        self.sparse,
        ^^^^^^^^^^^^
    )
    ^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/torch/nn/functional.py", line 2567, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

IndexError: index out of range in self
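For context: "index out of range in self" is the generic error torch.nn.Embedding raises when an index is >= num_embeddings. Since the failing call is token_type_embeddings (a table of size type_vocab_size, usually 2), the token_type_ids reaching that layer under 5.x presumably contain values outside 0/1. A minimal standalone reproduction of just that mechanism (hypothetical sizes, not this model's actual code):

```python
import torch
import torch.nn as nn

# BERT-style token type embedding table: only 2 token types (type_vocab_size = 2)
token_type_embeddings = nn.Embedding(2, 8)

# In-range token_type_ids (all zeros, as the tokenizer normally produces) work fine
ok = token_type_embeddings(torch.zeros((1, 4), dtype=torch.long))
assert tuple(ok.shape) == (1, 4, 8)

# Any id >= 2 triggers the same error seen in the traceback, which suggests
# the 5.x code path feeds something other than 0/1 (e.g. position ids or a
# stale buffer) into token_type_embeddings
try:
    token_type_embeddings(torch.tensor([[0, 1, 2, 3]]))
except IndexError as e:
    print(e)  # index out of range in self
```

If that is the cause, explicitly passing token_type_ids=torch.zeros_like(inputs["input_ids"]) into the model call might work around it until the remote code is updated, though I have not verified this.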

Unfortunately, this isn't a task I have time to debug at the moment. If you compare the code for the BERT model in 5.X with the last 4.X release, you will likely be able to see the changes. The vast majority of this code is copied from the BERT model.

I understand. Thank you :)
