Pooler output issue
#1
by ThomasGerald - opened
It seems that the example given in the README does not work as expected. The pooler weights of CamemBERT are randomly initialised, yet they are used during inference (through pooler_output). If the model is expected to work as a DPRContextEncoder without any "trained" pooler (so that the pooled output is simply the output embedding of the first token), the code below should work:
```python
from transformers import AutoTokenizer, AutoModel

query = "Salut, mon chien est-il mignon ?"
tokenizer = AutoTokenizer.from_pretrained("etalab-ia/dpr-ctx_encoder-fr_qa-camembert", do_lower_case=True)
input_ids = tokenizer(query, return_tensors='pt')["input_ids"]
model = AutoModel.from_pretrained("etalab-ia/dpr-ctx_encoder-fr_qa-camembert", return_dict=True)
# Forward pass through the encoder
output = model(input_ids)
# Use the first token's hidden state instead of pooler_output
embeddings = output.last_hidden_state[:,0,:]
print(embeddings)
```
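To illustrate why an untrained pooler matters, here is a dependency-free toy sketch. It assumes the pooler follows the usual BERT/RoBERTa design (a dense layer followed by tanh, applied to the first token's hidden state); the helper names, the hidden size of 4, and the Gaussian initialisation are hypothetical and chosen only for illustration. It does not load the actual model.

```python
import math
import random


def first_token_pooling(last_hidden_state):
    """Return the first-token vector of each sequence (no learned weights)."""
    return [sequence[0] for sequence in last_hidden_state]


def dense_tanh_pooler(last_hidden_state, weight, bias):
    """Apply tanh(W @ h_first + b) to each first-token vector."""
    return [
        [math.tanh(sum(w * x for w, x in zip(row, first)) + b)
         for row, b in zip(weight, bias)]
        for first in first_token_pooling(last_hidden_state)
    ]


def random_pooler_weights(hidden_size, seed):
    """Hypothetical stand-in for a pooler whose weights were never trained."""
    rng = random.Random(seed)
    weight = [[rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
              for _ in range(hidden_size)]
    bias = [0.0] * hidden_size
    return weight, bias


# Toy encoder output: batch of 1 sequence, 3 tokens, hidden size 4.
last_hidden_state = [[[0.5, -1.0, 2.0, 0.1],
                      [0.3, 0.3, 0.3, 0.3],
                      [-0.7, 0.2, 0.0, 1.5]]]

cls_embedding = first_token_pooling(last_hidden_state)
# Two different random initialisations of the same pooler.
pooled_a = dense_tanh_pooler(last_hidden_state, *random_pooler_weights(4, seed=0))
pooled_b = dense_tanh_pooler(last_hidden_state, *random_pooler_weights(4, seed=1))

print("first token:   ", cls_embedding)
print("pooler, init A:", pooled_a)
print("pooler, init B:", pooled_b)
```

In this sketch the first-token embedding depends only on the encoder output, whereas the pooler output changes with every random initialisation, which is the behaviour described above for pooler_output.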