Model Card for gurgutan/FastSSLM-n5-c4-k128

The model is an implementation of a self-similar neural network. More precisely, it is a modular, self-similar neural network with a regular topology. This class of networks is described in detail in the book "SELF-SIMILAR NEURAL NETWORKS OF RAPID LEARNING" by A.Y. Dorogov.

Model Details

Model Description

  • Developed by: Slepovichev Ivan
  • Model type: Fast Self-Similar Neural Network
  • Language(s) (NLP): Russian
  • License: MIT

Model Sources

Uses

The model can be used in areas such as computer vision, large language models, and text classification.

How to Get Started with the Model

See the fast-nn repository for a detailed description.

Training Details

Training Data

The training dataset is created on demand from a raw text file by the core.datalib.TextTensorDataset class. See the TextTensorDataset docstring in the project repository for details.
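To illustrate what building samples on demand from a raw text file can look like, here is a minimal sketch. It is not the project's TextTensorDataset: the class name LazyTextWindows, the byte-level encoding, the fixed window length, and the shift-by-one targets are all assumptions made for illustration.

```python
from pathlib import Path
from typing import List, Tuple


class LazyTextWindows:
    """Hypothetical stand-in, NOT the project's TextTensorDataset.

    Assumptions: the raw text is treated as UTF-8 bytes, samples are
    fixed-length windows, and the target is the input shifted by one byte.
    """

    def __init__(self, path: str, seq_len: int) -> None:
        self.data = Path(path).read_bytes()
        self.seq_len = seq_len

    def __len__(self) -> int:
        # Each window needs seq_len inputs plus one extra byte for the target.
        return max(0, (len(self.data) - 1) // self.seq_len)

    def __getitem__(self, index: int) -> Tuple[List[int], List[int]]:
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.seq_len
        # The window is sliced only when the sample is requested.
        ids = list(self.data[start : start + self.seq_len + 1])
        return ids[:-1], ids[1:]
```

Each item is a pair of integer lists (inputs, targets) that a training loop could convert to tensors. The real class may tokenize and batch quite differently.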

Training Procedure

Start training with the following command:

uv run src/train.py --config CONFIG_PATH --model MODEL_PATH [--epochs N] [--batch-size N] [--data DATA_PATH]

See the fast-nn repository for a detailed description.
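The usage line above implies two required options and three optional ones. The sketch below shows how such an interface could be declared with argparse. It is a hypothetical reconstruction: the real src/train.py may use a different parser, types, or defaults, and the file names config.yaml and model.pt are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction from the usage line; not the actual train.py.
    parser = argparse.ArgumentParser(prog="train.py")
    parser.add_argument("--config", required=True, help="path to the config file")
    parser.add_argument("--model", required=True, help="path to the model file")
    # The bracketed options in the usage line are optional.
    parser.add_argument("--epochs", type=int, default=None, help="number of epochs")
    parser.add_argument("--batch-size", type=int, default=None, help="batch size")
    parser.add_argument("--data", default=None, help="path to the raw training text")
    return parser


# Placeholder arguments, only to show how the flags map to attributes.
args = build_parser().parse_args(
    ["--config", "config.yaml", "--model", "model.pt", "--epochs", "3"]
)
print(args.config, args.model, args.epochs, args.batch_size, args.data)
```

Note that argparse exposes --batch-size as the attribute batch_size, and omitted optional flags come back as None in this sketch.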

Training Hyperparameters

  • Training regime: The model was trained in the fp32 mixed-precision regime

Speeds, Sizes, Times

  • Total params: 84,236,838
  • Trainable params: 84,234,278
  • Non-trainable params: 2,560
  • Total mult-adds (Units.GIGABYTES): 107.38
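As a quick sanity check on the figures above, the following sketch confirms that the trainable and non-trainable counts add up to the total and estimates the size of the weights. The 4 bytes per parameter is an assumption (plain fp32 storage). Optimizer state and activations are not included.

```python
total_params = 84_236_838
trainable_params = 84_234_278
non_trainable_params = 2_560

# The reported total is the sum of the trainable and non-trainable counts.
assert trainable_params + non_trainable_params == total_params

# Assumption: every parameter is stored as a 4-byte fp32 value.
bytes_per_param = 4
weight_bytes = total_params * bytes_per_param
weight_mib = weight_bytes / 2**20
print(f"{weight_bytes} bytes = {weight_mib:.1f} MiB")
```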

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was tested on L.N. Tolstoy's novel "War and Peace".

Metrics

  • Train loss: 0.008
  • Eval loss: 0.12
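If the reported losses are mean cross-entropy values in nats, they can be read as perplexities via exp(loss). The card does not state the loss function, so that is an assumption, and the sketch below is only a way to interpret the numbers under it.

```python
import math

train_loss = 0.008
eval_loss = 0.12


def perplexity(loss_nats: float) -> float:
    # Assumption: the loss is a mean cross-entropy measured in nats.
    return math.exp(loss_nats)


print(f"train perplexity: {perplexity(train_loss):.4f}")
print(f"eval perplexity: {perplexity(eval_loss):.4f}")
```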

Model Examination

Compute Infrastructure

Hardware

The model was trained in about 16 hours on a single GeForce RTX™ 3090 Ti.

Software

  • Linux Ubuntu 24.04
  • CUDA Toolkit 12.9
  • Python 3.12

Collection including gurgutan/FastSSLM-n5-c4-k128