Upload folder using huggingface_hub

a9bd396 verified about 1 month ago

preview code

raw

history blame contribute delete

3.33 kB

This model was released on 2020-10-23 and added to Hugging Face Transformers on 2020-11-27.

BARThez

BARThez is a BART model designed for French language tasks. Unlike existing French BERT models, BARThez includes a pretrained encoder-decoder, allowing it to generate text as well. This model is also available as a multilingual variant, mBARThez, by continuing pretraining multilingual BART on a French corpus.

You can find all of the original BARThez checkpoints under the BARThez collection.

This model was contributed by moussakam. Refer to the BART docs for more usage examples.

The example below demonstrates how to predict the <mask> token with [Pipeline], [AutoModel], and from the command line.

import torch
from transformers import pipeline

pipeline = pipeline(
    task="fill-mask",
    model="moussaKam/barthez",
    dtype=torch.float16,
    device=0
)
pipeline("Les plantes produisent <mask> grâce à un processus appelé photosynthèse.")

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "moussaKam/barthez",
)
model = AutoModelForMaskedLM.from_pretrained(
    "moussaKam/barthez",
    dtype=torch.float16,
    device_map="auto",
)
inputs = tokenizer("Les plantes produisent <mask> grâce à un processus appelé photosynthèse.", return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model(**inputs)
    predictions = outputs.logits

masked_index = torch.where(inputs['input_ids'] == tokenizer.mask_token_id)[1]
predicted_token_id = predictions[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)

print(f"The predicted token is: {predicted_token}")

echo -e "Les plantes produisent <mask> grâce à un processus appelé photosynthèse." | transformers run --task fill-mask --model moussaKam/barthez --device 0

BarthezTokenizer

[[autodoc]] BarthezTokenizer

BarthezTokenizerFast

[[autodoc]] BarthezTokenizerFast