First homework for the NLP course at MSU (from VK). A causal Transformer with ALiBi positional bias and a SwiGLU MLP that can generate jokes in Russian. Due to limited compute, only a very small "nano" model was implemented, with 0.51 million parameters and a 1024-token vocabulary.
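For reference, ALiBi replaces learned positional embeddings with a fixed, head-specific linear penalty on attention scores. A minimal pure-Python sketch of the slope schedule and bias matrix (function names are mine, and it assumes a power-of-two head count as in the ALiBi paper; the actual implementation in this repo may differ):

```python
def alibi_slopes(n_heads):
    # Geometric sequence of per-head slopes: 2^(-8/n), 2^(-16/n), ...
    # (valid for power-of-two head counts)
    start = 2 ** (-8.0 / n_heads)
    return [start ** (i + 1) for i in range(n_heads)]

def alibi_bias(n_heads, seq_len):
    # bias[h][i][j] = -slope_h * (i - j): queries attend to keys at
    # distance (i - j) with a linearly growing penalty. In a causal
    # model, positions j > i are masked out separately.
    slopes = alibi_slopes(n_heads)
    return [[[-s * (i - j) for j in range(seq_len)]
             for i in range(seq_len)]
            for s in slopes]
```

This bias is simply added to the query-key attention logits before the softmax, so no positional embedding table is needed.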

This model has been pushed to the Hub using the `PyTorchModelHubMixin` integration.


Dataset used to train ivio05/llm-course-hw1