---
language:
- de
- en
- es
- fr
- it
- ja
- ru
- uk
- multilingual
license: cc-by-sa-4.0
tags:
- translation
---
# TakoMT

This is a translation model built with Marian-NMT.
For more details, see [my repository](https://github.com/s-taka/fugumt).
In addition to the data listed in the repository, I also used [ParaCrawl](https://paracrawl.eu/).

* source languages: de, en, es, fr, it, ru, uk
* target language: ja
### How to use

This model requires the `transformers` and `sentencepiece` packages. In a notebook, install them as follows (outside a notebook, drop the leading `!`):

```python
!pip install transformers sentencepiece
```
You can use this model directly with a pipeline:

```python
from transformers import pipeline

# Load the model from the Hugging Face Hub and translate one sentence into Japanese.
tako_translator = pipeline('translation', model='staka/takomt')
tako_translator('This is a cat.')
```
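
The pipeline also accepts a list of sentences, which is convenient when translating several inputs at once. The following is a minimal sketch rather than part of the original usage instructions: the helper names `load_translator` and `translate_batch` are hypothetical, and it assumes that source sentences can be passed without an explicit language tag, as in the single-sentence example above.

```python
from typing import Callable, Dict, List


def load_translator(model_name: str = "staka/takomt"):
    """Build the translation pipeline (downloads the model on first use)."""
    # Imported lazily so translate_batch can be used without transformers installed.
    from transformers import pipeline

    return pipeline("translation", model=model_name)


def translate_batch(
    translator: Callable[[List[str]], List[Dict[str, str]]],
    sentences: List[str],
) -> List[str]:
    """Translate a list of sentences and return only the translated strings."""
    if not sentences:
        return []
    # A translation pipeline returns one dict per input sentence,
    # with the output text stored under the "translation_text" key.
    results = translator(sentences)
    return [result["translation_text"] for result in results]


# Example usage (hypothetical; needs network access to fetch the model):
#   translator = load_translator()
#   translate_batch(translator, ["This is a cat.", "Das ist ein Hund."])
```

Keeping the pipeline as an argument means the model is loaded once and reused across calls, which matters because loading is the slow step.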
### Eval results

The evaluation results on [Tatoeba](https://tatoeba.org/ja) (500 randomly selected sentences) are as follows:

|source |target |BLEU(*1)|
|-------|-------|--------|
|de     |ja     |27.8    |
|en     |ja     |28.4    |
|es     |ja     |32.0    |
|fr     |ja     |27.9    |
|it     |ja     |24.3    |
|ru     |ja     |27.3    |
|uk     |ja     |29.8    |

(*1) sacrebleu --tokenize ja-mecab
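
An evaluation like the one above could be reproduced roughly as follows. This is a sketch under stated assumptions, not the author's actual evaluation script: the function names, the fixed seed, and the sampling procedure are hypothetical, and the `ja-mecab` tokenizer needs sacrebleu's Japanese extra (`pip install "sacrebleu[ja]"`).

```python
import random
from typing import List, Sequence, Tuple


def sample_pairs(
    pairs: Sequence[Tuple[str, str]], k: int = 500, seed: int = 0
) -> List[Tuple[str, str]]:
    """Randomly pick up to k (source, reference) pairs, reproducibly."""
    # A dedicated Random instance keeps the sample independent of global state.
    # The seed value is an assumption; the original seed is not documented.
    rng = random.Random(seed)
    return rng.sample(list(pairs), min(k, len(pairs)))


def bleu_ja(hypotheses: List[str], references: List[str]) -> float:
    """Corpus-level BLEU with MeCab tokenization for Japanese output."""
    # Imported lazily so sample_pairs works without sacrebleu installed.
    import sacrebleu

    # corpus_bleu expects a list of reference streams; here there is only one.
    return sacrebleu.corpus_bleu(hypotheses, [references], tokenize="ja-mecab").score
```

Scores obtained this way will differ somewhat from the table, since they depend on which sentences end up in the sample.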