config.json for pooling is incorrect

by HectorL - opened Mar 24, 2023

Mar 24, 2023

The config JSON for pooling includes arguments that the Pooling function does not accept.
The config contains the following keys (not in this order):

"word_embedding_dimension": 768,
"pooling_mode_cls_token": false,
"pooling_mode_mean_tokens": true,
"pooling_mode_max_tokens": false,
"pooling_mode_mean_sqrt_len_tokens": false,

"pooling_mode_weightedmean_tokens": false,
"pooling_mode_lasttoken": false

The Pooling function only accepts the first five arguments.
The model will not instantiate unless the last two keys are removed from the config.

I'm cloning the repo and using:
model = SentenceTransformer("local_path")
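
A minimal sketch of the key-removal workaround mentioned above, for anyone who wants to script it. The helper name `strip_unsupported_keys` and the example path `1_Pooling/config.json` are assumptions for illustration, not something confirmed in this thread. The helper refuses to drop a key that is set to true, since that would change which pooling mode the config describes.

```python
import json
from pathlib import Path

# Keys reported in this thread as rejected by the stock Pooling module.
UNSUPPORTED_KEYS = ("pooling_mode_weightedmean_tokens", "pooling_mode_lasttoken")


def strip_unsupported_keys(config_path, keys=UNSUPPORTED_KEYS):
    """Remove the given keys from a JSON config file and return the removed names.

    Hypothetical helper: raises ValueError if a key to be removed is enabled,
    because dropping an enabled pooling mode would silently change behavior.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    removed = []
    for key in keys:
        if key not in config:
            continue
        if config[key]:
            raise ValueError(f"{key} is enabled; refusing to remove it")
        del config[key]
        removed.append(key)

    # Write the cleaned config back in place.
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return removed


if __name__ == "__main__":
    # Assumed location of the pooling config inside the cloned repo.
    print(strip_unsupported_keys("local_path/1_Pooling/config.json"))
```

Run it once against the cloned repo before loading the model from the local path. Both keys are false in the config quoted above, so the guard should not trigger there.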

multi-train

NLP Group of The University of Hong Kong org Mar 27, 2023

Hi, thanks a lot for your interest in INSTRUCTOR!

Because we have overridden several classes of the sentence-transformers library, you may need to install the InstructorEmbedding package by following the instructions at https://github.com/HKUNLP/instructor-embedding#installation.

After that, you can use our INSTRUCTOR model as follows:

from InstructorEmbedding import INSTRUCTOR
model = INSTRUCTOR('hkunlp/instructor-large')

Feel free to add any further questions or comments!

HectorL

Mar 27, 2023

No issues using your recommended method. I was also able to get the cloning method to work by removing the unaccepted keys. Are there any negative consequences to removing the following keys from the config?

"pooling_mode_weightedmean_tokens": false,
"pooling_mode_lasttoken": false

It's working great for my embedding task. Just curious about this.

multi-train

NLP Group of The University of Hong Kong org Mar 28, 2023

Hi, thanks a lot for your comments!

If you remove those keys and load the model with the SentenceTransformer class, it seems that you will not be able to add instructions for the embedding calculation.

HectorL changed discussion status to closed Mar 28, 2023
