|
|
--- |
|
|
language: |
|
|
- tr |
|
|
- en |
|
|
tags: |
|
|
- mt5 |
|
|
- t5 |
|
|
- text-generation-inference |
|
|
- turkish |
|
|
widget: |
|
|
- text: >-
    Bu hafta hasta olduğum için <extra_id_0> gittim. Midem ağrıyordu ondan
    dolayı şu an <extra_id_1>.
  example_title: Turkish Example 1
- text: Bu gece kar yağacakmış. Yarın yollarda <extra_id_0> olabilir.
  example_title: Turkish Example 2
- text: I bought two tickets for an NBA match. Do you like <extra_id_0>?
  example_title: English Example 1
|
|
--- |
|
|
# Model Card |
|
|
|
|
|
<!-- Provide a quick summary of what the model is/does. --> |
|
|
Please check the [**google/mt5-base**](https://huggingface.co/google/mt5-base) model. This model is a pruned version of mt5-base that works only with Turkish and English. For the pruning methodology, you can check the Russian version of mT5-base, [cointegrated/rut5-base](https://huggingface.co/cointegrated/rut5-base).
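The pruning approach referenced above keeps only the vocabulary entries observed in the target languages and slices the embedding matrix accordingly. A minimal sketch of the idea with a toy embedding matrix; the token ids here are hypothetical, and in the real model the matrix would be `model.shared.weight` (~250k rows):

```python
import numpy as np

# Toy stand-in for mT5's input embedding matrix: 10 tokens, hidden dim 4.
vocab_size, hidden = 10, 4
embeddings = np.arange(vocab_size * hidden, dtype=np.float32).reshape(vocab_size, hidden)

# Hypothetical set of token ids that actually occur in a Turkish/English corpus
# (special tokens plus observed subwords).
kept_ids = sorted({0, 1, 3, 7})

# Slice the embedding matrix down to the kept rows; the old-id -> new-id map
# lets text be re-encoded against the smaller vocabulary.
pruned = embeddings[kept_ids]
old_to_new = {old: new for new, old in enumerate(kept_ids)}

print(pruned.shape)   # (4, 4)
print(old_to_new[7])  # 3
```

The same slicing would be applied to the tokenizer's SentencePiece vocabulary and the tied output projection, which is where most of the size reduction comes from.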
|
|
|
|
|
# Usage |
|
|
|
|
|
First, import the required libraries:
|
|
```python
from transformers import T5ForConditionalGeneration, T5Tokenizer
import torch
```
|
|
|
|
|
To load the model and tokenizer:
|
|
```python
model = T5ForConditionalGeneration.from_pretrained('bonur/t5-base-tr')
tokenizer = T5Tokenizer.from_pretrained('bonur/t5-base-tr')
```
|
|
|
|
|
To run inference on a given text, you can use the following code:
|
|
```python
inputs = tokenizer("Bu hafta hasta olduğum için <extra_id_0> gittim.", return_tensors='pt')
with torch.no_grad():
    hypotheses = model.generate(
        **inputs,
        do_sample=True,
        top_p=0.95,
        num_return_sequences=2,
        repetition_penalty=2.75,
        max_length=32,
    )
for h in hypotheses:
    print(tokenizer.decode(h))
```
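The decoded output interleaves sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, …) with the predicted fill-ins, as is standard for T5-style span corruption. A small helper (not part of the model's API, just an illustrative sketch) to map each sentinel to its predicted text, shown on a made-up decoded string:

```python
import re

def parse_sentinel_spans(decoded: str) -> dict:
    """Map each <extra_id_N> sentinel in a decoded string to the text that follows it."""
    # Split on sentinel tokens, keeping the tokens themselves via the capture group.
    parts = re.split(r'(<extra_id_\d+>)', decoded)
    spans, current = {}, None
    for part in parts:
        m = re.fullmatch(r'<extra_id_(\d+)>', part)
        if m:
            current = int(m.group(1))
            spans[current] = ''
        elif current is not None:
            spans[current] += part.strip()
    return spans

# Hypothetical decoded output for the example sentence above:
print(parse_sentinel_spans("<extra_id_0> doktora <extra_id_1> iyiyim"))
# {0: 'doktora', 1: 'iyiyim'}
```

Note that real decoded strings may also contain `</s>` and padding tokens; pass `skip_special_tokens=False` to `tokenizer.decode` if you want to keep the sentinels for this kind of parsing.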
|
|
|
|
|
You can tune the generation parameters for better results. The model is also ready to be fine-tuned on bilingual downstream tasks in English and Turkish.
|
|
|