|
|
|
|
|
|
|
|
The model was trained from scratch on English–Russian parallel text. |
|
|
|
|
|
English source text was derived from approximately 40 cleaned books. |
|
|
The English corpus was programmatically processed and normalized using shell and Python tooling. |
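As an illustration of the kind of normalization described above, here is a minimal Python sketch. The actual cleaning rules used for this model are not documented here, so the specific steps (Unicode NFKC normalization, typographic quote replacement, whitespace collapsing) and the names `normalize_line` and `normalize_corpus` are assumptions chosen for the example.

```python
import re
import unicodedata

# Hypothetical mapping of typographic quotes to plain ASCII equivalents.
_PUNCT_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
})


def normalize_line(text: str) -> str:
    """Normalize one line of English text (illustrative rules only)."""
    # NFKC folds compatibility characters, e.g. non-breaking spaces.
    text = unicodedata.normalize("NFKC", text)
    text = text.translate(_PUNCT_MAP)
    # Collapse any run of whitespace into a single space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


def normalize_corpus(lines):
    """Yield normalized lines, skipping lines that end up empty."""
    for line in lines:
        cleaned = normalize_line(line)
        if cleaned:
            yield cleaned
```

A function like `normalize_corpus` could be applied to each book's lines before the translation step, so that the Russian side is produced from already cleaned text.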
|
|
|
|
|
Russian translations were produced after the English-side cleaning step, and the resulting parallel data was curated and filtered before training. |
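The filtering criteria are not specified, so the following is only a sketch of heuristics commonly applied to parallel data: dropping empty or overlong pairs, pairs with a skewed length ratio, targets with no Cyrillic characters, and exact duplicates. The names `keep_pair` and `filter_pairs` and the thresholds `max_ratio` and `max_len` are hypothetical.

```python
import re

# Matches any character in the basic Cyrillic Unicode block.
_CYRILLIC = re.compile(r"[\u0400-\u04FF]")


def keep_pair(src: str, tgt: str, max_ratio: float = 2.5, max_len: int = 200) -> bool:
    """Return True if an English-Russian pair passes the illustrative checks."""
    src_tokens = src.split()
    tgt_tokens = tgt.split()
    # Reject pairs where either side is empty.
    if not src_tokens or not tgt_tokens:
        return False
    # Reject pairs where either side is overly long (assumed threshold).
    if len(src_tokens) > max_len or len(tgt_tokens) > max_len:
        return False
    # Reject pairs whose token counts differ too much (assumed threshold).
    longer = max(len(src_tokens), len(tgt_tokens))
    shorter = min(len(src_tokens), len(tgt_tokens))
    if longer / shorter > max_ratio:
        return False
    # The Russian side should contain at least one Cyrillic character.
    return bool(_CYRILLIC.search(tgt))


def filter_pairs(pairs):
    """Keep pairs that pass keep_pair, dropping exact duplicates."""
    seen = set()
    kept = []
    for src, tgt in pairs:
        key = (src, tgt)
        if key in seen or not keep_pair(src, tgt):
            continue
        seen.add(key)
        kept.append(key)
    return kept
```

Whitespace token counts are a rough proxy for length; a real pipeline might compare subword counts from the model's tokenizer instead.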
|
|
|
|
|
This model is intended for research and experimental use. |
|
|
|
|
|
Furthermore, this model is an alpha release and may produce translation errors. |
|
|
Updates are expected in the coming weeks.