model context width is 2560 not 2048
#73 by amitport · opened
When loading the model, the weights have dimension 2560, not 2048, regardless of the sequence length it was trained on.
If the model was not trained on samples longer than 2048 tokens, this is just a waste of memory; if it was trained on 2560-token samples, the docs need to be updated.
Which one is it?
Thank you
Sorry, I mixed up n_positions with n_embd.
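For anyone hitting the same confusion: in GPT-style model configs, `n_positions` (the maximum context length) and `n_embd` (the embedding/hidden dimension) are two independent numbers, and a weight matrix that is "2560 wide" is usually the embedding dimension, not the context window. A minimal sketch, assuming a GPT-style config with the conventional field names (the concrete values here are illustrative):

```python
# Hypothetical GPT-style config: n_positions and n_embd are unrelated axes.
config = {
    "n_positions": 2048,  # max sequence length (context window) seen in training
    "n_embd": 2560,       # width of each token's embedding vector
}

# A token-embedding weight would have shape (vocab_size, n_embd), so seeing
# 2560 in the weights says nothing about the 2048-token context limit.
print("context window:", config["n_positions"])
print("embedding dim: ", config["n_embd"])
```

With `transformers`, the same check is `AutoConfig.from_pretrained(model_name)` and inspecting `max_position_embeddings` / `hidden_size` (field names vary by architecture).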
amitport changed discussion status to closed