Model Card for Model ID
The model is an implementation of a self-similar neural network. More precisely, it is a modular self-similar neural network with a regular topology. This class of networks is described in more detail in the book "SELF-SIMILAR NEURAL NETWORKS OF RAPID LEARNING" by A.Y. Dorogov.
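To give a rough feel for what a regular, self-similar topology can look like, here is a minimal sketch of a butterfly-structured layer stack, the connection pattern used by fast transforms such as the fast Walsh-Hadamard transform. This is an illustrative assumption, not code from the fast-nn repository: the function names, the 2x2 kernel layout, and the choice of a butterfly pattern are all hypothetical. Each layer repeats the same small kernel pattern at a different stride, so a vector of length n needs only log2(n) sparse layers instead of one dense n-by-n matrix.

```python
def butterfly_forward(x, kernels):
    """Apply log2(n) sparse layers of 2x2 kernels to a vector of length n = 2**k.

    kernels[layer][pair] is a 2x2 matrix [[a, b], [c, d]] (hypothetical layout).
    """
    n = len(x)
    depth = n.bit_length() - 1
    if n < 2 or 2 ** depth != n:
        raise ValueError("input length must be a power of two, at least 2")
    out = list(x)
    for layer in range(depth):
        stride = 2 ** layer          # the same pattern is repeated at each scale
        nxt = list(out)
        pair = 0
        for block in range(0, n, 2 * stride):
            for j in range(block, block + stride):
                (a, b), (c, d) = kernels[layer][pair]
                u, v = out[j], out[j + stride]
                nxt[j] = a * u + b * v
                nxt[j + stride] = c * u + d * v
                pair += 1
        out = nxt
    return out


def hadamard_kernels(n):
    """Fixed kernels that turn the butterfly into a Walsh-Hadamard transform."""
    depth = n.bit_length() - 1
    return [[[[1, 1], [1, -1]] for _ in range(n // 2)] for _ in range(depth)]


def param_counts(n):
    """Trainable weights in the butterfly stack versus one dense n x n layer."""
    depth = n.bit_length() - 1
    return {"butterfly": depth * (n // 2) * 4, "dense": n * n}


print(butterfly_forward([1, 2, 3, 4], hadamard_kernels(4)))
print(param_counts(1024))
```

In a trainable variant the kernel entries would be learned rather than fixed; the fixed Hadamard kernels are used here only because they make the output easy to check by hand.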
Model Details
Model Description
- Developed by: Slepovichev Ivan
- Model type: Fast Self-Similar Neural Network
- Language(s) (NLP): Russian
- License: MIT
Model Sources
- Repository: https://gitverse.ru/gurgutan/fast-nn
Uses
The model can be applied in areas such as computer vision, large language models, and text classification.
How to Get Started with the Model
See the fast-nn repository for a detailed description.
Training Details
Training Data
The training dataset is created on demand by the core.datalib.TextTensorDataset class from a raw text file. See the TextTensorDataset docstring in the project repository for details.
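As a rough illustration of building training samples on demand from raw text, here is a minimal sketch. It is not the real TextTensorDataset: the class name LazyTextWindows, the character-level vocabulary, the seq_len parameter, and the next-character targets are all assumptions made for the example. The sketch takes the text as a string; with a file, the string would come from reading it first.

```python
class LazyTextWindows:
    """Hypothetical stand-in for a text dataset that builds samples lazily."""

    def __init__(self, text, seq_len):
        self.text = text
        self.seq_len = seq_len
        # Assumed character-level vocabulary; the real class may tokenize differently.
        self.vocab = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(self.vocab)}

    def __len__(self):
        # Each sample needs seq_len inputs plus one extra character for the target.
        return max(0, len(self.text) - self.seq_len)

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # The window is encoded only when the sample is requested.
        window = self.text[idx: idx + self.seq_len + 1]
        ids = [self.stoi[ch] for ch in window]
        return ids[:-1], ids[1:]  # inputs, targets shifted by one position


ds = LazyTextWindows("abcab", seq_len=2)
print(len(ds), ds[0], ds[2])
```

Nothing is precomputed except the vocabulary, so memory use stays close to the size of the raw text regardless of how many windows are drawn.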
Training Procedure
Start training with the command:
uv run src/train.py --config CONFIG_PATH --model MODEL_PATH [--epochs N] [--batch-size N] [--data DATA_PATH]
See the fast-nn repository for a detailed description.
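The flags in the training command can be mirrored with a small argparse parser, which makes it clear which arguments are required and which are optional. This is only a sketch inferred from the command line shown in this section; the actual argument handling in src/train.py may differ, and the help texts, types, and defaults below are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented train.py flags."""
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--config", required=True, help="path to the config file")
    parser.add_argument("--model", required=True, help="path to the model file")
    # Optional overrides; None means "use whatever the config specifies" (assumed).
    parser.add_argument("--epochs", type=int, default=None)
    parser.add_argument("--batch-size", type=int, default=None)
    parser.add_argument("--data", default=None, help="path to the raw text data")
    return parser


args = build_parser().parse_args(
    ["--config", "CONFIG_PATH", "--model", "MODEL_PATH", "--epochs", "3"]
)
print(args.config, args.model, args.epochs, args.batch_size, args.data)
```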
Training Hyperparameters
- Training regime: The model was trained in the fp32 mixed precision regime
Speeds, Sizes, Times
- Total params: 84,236,838
- Trainable params: 84,234,278
- Non-trainable params: 2,560
- Total mult-adds (G): 107.38
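The reported parameter counts can be sanity-checked with a few lines of arithmetic. The sketch below verifies that the trainable and non-trainable counts add up to the total, and estimates the size of the weights alone under the assumption that every parameter is stored as a 4-byte fp32 value. Optimizer state, gradients, and activations are not included.

```python
total_params = 84_236_838
trainable_params = 84_234_278
non_trainable_params = 2_560

# The two reported subtotals should add up to the reported total.
assert trainable_params + non_trainable_params == total_params

BYTES_PER_FP32 = 4  # assumption: all weights stored in fp32
weight_bytes = total_params * BYTES_PER_FP32
weight_mib = weight_bytes / 2**20

print(f"weights: {weight_bytes} bytes (about {weight_mib:.1f} MiB)")
```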
Evaluation
Testing Data, Factors & Metrics
Testing Data
The model was tested on L.N. Tolstoy's novel "War and Peace".
Metrics
- Train loss: 0.008
- Eval loss: 0.12
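If the reported losses are mean cross-entropy values in nats, they can be converted to perplexity with an exponential. That is an assumption: this card does not state which loss function or logarithm base was used. The helper below is a hypothetical illustration that also computes the gap between evaluation and training loss.

```python
import math


def perplexity(mean_cross_entropy_nats):
    """Perplexity for a mean cross-entropy loss measured in nats (assumed)."""
    return math.exp(mean_cross_entropy_nats)


train_loss = 0.008
eval_loss = 0.12

print(f"train perplexity: {perplexity(train_loss):.4f}")
print(f"eval perplexity:  {perplexity(eval_loss):.4f}")
print(f"eval - train loss gap: {eval_loss - train_loss:.3f}")
```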
Model Examination
Compute Infrastructure
Hardware
The model was trained in about 16 hours on a single GeForce RTX™ 3090 Ti.
Software
- Linux Ubuntu 24.04
- CUDA Toolkit 12.9
- Python 3.12