Lynxpda
/

vep_ru

+---
+license: cc-by-sa-4.0
+library_name: pytorch
+language:
+- ru
+- vep
+datasets:
+- Lynxpda/back-translated-veps-russian
+---
+# Model Card for Veps - Russian version 1.0
+A machine translation model for translating from Vepsian into Russian.
+The archive contains the original model weights, trained with OpenNMT-py (Locomotive).
+The model has 457M parameters and was trained from scratch.
+Also included are model weights converted for CTranslate2 and a package for installation and use with Argos Translate / LibreTranslate.
+## Model Architecture and Objective
+```
+dec_layers: 20
+decoder_type: transformer
+enc_layers: 20
+encoder_type: transformer
+heads: 8
+hidden_size: 512
+max_relative_positions: 20
+model_dtype: fp16
+pos_ffn_activation_fn: gated-gelu
+position_encoding: false
+share_decoder_embeddings: true
+share_embeddings: true
+share_vocab: true
+src_vocab_size: 32000
+tgt_vocab_size: 32000
+transformer_ff: 6144
+word_vec_size: 512
+```
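The stated size of 457M parameters can be roughly cross-checked against the configuration above. The following is a minimal sketch, not the trainer's own accounting: the function name `estimate_parameters` is hypothetical, and it assumes that only weight matrices are counted (biases, layer norms, and relative-position embeddings are ignored), that `gated-gelu` implies three feed-forward matrices per layer, and that the shared embedding table is counted once.

```python
# Rough parameter estimate for the configuration listed in the model card.
# Assumptions (not stated in the card): weight matrices only; biases,
# layer norms and relative-position embeddings are ignored.

def estimate_parameters(hidden_size=512, transformer_ff=6144,
                        enc_layers=20, dec_layers=20,
                        vocab_size=32000, gated_ffn=True):
    # Q, K, V and output projections of one attention block.
    attention = 4 * hidden_size * hidden_size
    # A gated feed-forward block uses two input projections and one output
    # projection; a plain one uses a single input projection.
    ffn_matrices = 3 if gated_ffn else 2
    ffn = ffn_matrices * hidden_size * transformer_ff
    # Encoder layer: self-attention + feed-forward.
    encoder = enc_layers * (attention + ffn)
    # Decoder layer: self-attention + cross-attention + feed-forward.
    decoder = dec_layers * (2 * attention + ffn)
    # share_embeddings / share_decoder_embeddings: one table, counted once.
    embeddings = vocab_size * hidden_size
    return encoder + decoder + embeddings

total = estimate_parameters()
print(f"{total / 1e6:.1f}M")  # prints "456.8M"
```

Under these assumptions the estimate comes to about 456.8M, which is consistent with the 457M figure; the small terms left out of the sketch account for the remainder.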
+# Citing & Authors
+Authors: Maksim Migukin, Maksim Kuznetsov, Alexey Kutashov.
+## Credits
+Data compiled by [Opus](https://opus.nlpl.eu/).
+Includes pretrained models from [Stanza](https://github.com/stanfordnlp/stanza/).
+Data from the Vepsian [Wikipedia](https://vep.wikipedia.org/wiki/)
+Data from [Lehme No 2051 // Open corpus of Vepsian and Karelian languages VepKar.](http://dictorpus.krc.karelia.ru/)
+Data from [OMAMEDIA](https://omamedia.ru/)
+[CCMatrix](http://opus.nlpl.eu/CCMatrix-v1.php)
+If you use the dataset or code, please cite the CCMatrix paper and acknowledge OPUS as well for this release.
+This corpus was extracted from web crawls using margin-based bitext mining techniques. The original distribution is available from http://data.statmt.org/cc-matrix/
+[OpenSubtitles](http://opus.nlpl.eu/OpenSubtitles-v2018.php)
+Please cite the following article if you use any part of the corpus in your own work: P. Lison and J. Tiedemann, 2016, OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016).