## Training Data

The model was trained from scratch on English–Russian parallel text. The English source text was derived from approximately 40 cleaned books and was programmatically processed and normalized with shell and Python tooling. Russian translations were produced after the English-side cleaning step, and the resulting parallel data was curated and filtered before training.

This model is intended for research and experimental use. It is an alpha release and may exhibit translation errors; updates are planned in the coming weeks.