How to load with HF Transformers?

#17

by jhflow - opened Feb 27, 2024

Feb 27, 2024

Hi, thank you for your remarkable work! I'm really impressed by the performance of this model.

For some reason, I want to load this model via Hugging Face Transformers (AutoModel.from_pretrained or something similar) rather than via FlagEmbedding.

Can I do so?

Shitao

Beijing Academy of Artificial Intelligence org Feb 27, 2024

Yes, you can load it the same way as bge-1.5: https://github.com/FlagOpen/FlagEmbedding/tree/master/FlagEmbedding/baai_general_embedding#using-huggingface-transformers
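
For readers who want the linked recipe inline, here is a minimal sketch of the Transformers-only route. It assumes, as the linked bge-1.5 instructions do, that the dense embedding is the first-token ([CLS]) hidden state followed by L2 normalization. The helper names cls_pool and encode_dense and the max_length default are illustrative choices, not part of any library.

```python
import torch
import torch.nn.functional as F


def cls_pool(last_hidden_state: torch.Tensor) -> torch.Tensor:
    """Take the first-token ([CLS]) hidden state and L2-normalize it.

    last_hidden_state has shape (batch, seq_len, hidden_size);
    the result has shape (batch, hidden_size).
    """
    cls = last_hidden_state[:, 0]
    return F.normalize(cls, p=2, dim=1)


def encode_dense(sentences, model_name="BAAI/bge-m3", max_length=512):
    """Hypothetical helper: load the checkpoint with plain Transformers
    and return normalized dense sentence embeddings."""
    # Imported here so cls_pool can be reused without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # disable dropout for inference

    batch = tokenizer(
        sentences,
        padding=True,
        truncation=True,
        max_length=max_length,  # arbitrary cap chosen for this sketch
        return_tensors="pt",
    )
    with torch.no_grad():  # no gradients needed when only embedding
        outputs = model(**batch)
    return cls_pool(outputs.last_hidden_state)
```

Calling encode_dense(["this is a test sentence"]) should return a tensor of shape (1, hidden_size). Because the rows are unit-length, a matrix product between two such outputs gives cosine similarities.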

jhflow changed discussion status to closed Feb 27, 2024

jhflow

Feb 27, 2024

Thank you!

Calvinnncy97

Aug 29, 2024

How can I get dense and ColBERT embeddings with transformers?

Given

from transformers import AutoModel, AutoTokenizer
import torch

model_path = 'BAAI/bge-m3'
model = AutoModel.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model.eval()  # inference mode: disables dropout

test_sentence = ["this is a test sentence"]

# Tokenize into padded PyTorch tensors (input_ids, attention_mask)
batch_dict = tokenizer(test_sentence, return_tensors='pt', max_length=128, padding=True, truncation=True)

# Forward pass without tracking gradients
with torch.no_grad():
    outputs = model(**batch_dict)

I get BaseModelOutputWithPoolingAndCrossAttentions with pooler_output and last_hidden_state keys. Is pooler_output the CLS embedding and last_hidden_state all the token embeddings?

Kindly clarify. Thank you.
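
One way to check the two fields empirically, sketched below under stated assumptions: in Transformers' XLM-RoBERTa implementation (the architecture AutoModel resolves to for this checkpoint), last_hidden_state holds one vector per token, and last_hidden_state[:, 0] is the raw first-token ([CLS]) vector. pooler_output is not that raw vector; it is the same vector passed through an extra dense layer and tanh. The sketch uses a tiny, randomly initialized model so that it runs without downloading anything; the config sizes are arbitrary and the weights are not those of bge-m3.

```python
import torch
from transformers import XLMRobertaConfig, XLMRobertaModel

# Tiny random-weight model: same architecture family, arbitrary small sizes.
config = XLMRobertaConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
    max_position_embeddings=64,
)
model = XLMRobertaModel(config)
model.eval()  # disable dropout so the comparison below is deterministic

# Fake batch: 2 sequences of 6 token ids (avoiding the padding id 1).
input_ids = torch.randint(2, 100, (2, 6))

with torch.no_grad():
    outputs = model(input_ids=input_ids)
    # Raw first-token vector, one per sequence.
    cls_vector = outputs.last_hidden_state[:, 0]
    # What the pooling head computes: dense layer followed by tanh.
    manual_pooled = torch.tanh(model.pooler.dense(cls_vector))

same_as_manual = torch.allclose(outputs.pooler_output, manual_pooled, atol=1e-6)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
print(outputs.pooler_output.shape)      # (batch, hidden_size)
print(same_as_manual)
```

So for a dense embedding in the style of the bge-1.5 instructions linked earlier in this thread, the relevant quantity would be last_hidden_state[:, 0] (then normalized), not pooler_output. For ColBERT-style vectors, my understanding is that FlagEmbedding applies an additional linear projection to the token states, with weights that AutoModel does not load, so last_hidden_state alone is probably not equivalent; treat that as an assumption to confirm against the FlagEmbedding source.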
