Welcome back! Today we're showing you our new model!

What is Supra Mini v2-0.1M?

It is our new model, trained with love on a single Kaggle T4 GPU!
Llama architecture
100k parameters, optimized training, and overtraining, all with your experience in mind!

How did we create this model?

We built a first version, then told ourselves: "We think we can scale this further!" This model is the result. We are so happy that SupraLabs and its models are becoming a reality!

Config

vocab size = 2048
hidden size = 48
intermediate size = 96
hidden layers = 3
num attention heads = 4
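
As a rough sanity check on the config above, here is a small Python sketch that estimates how many weights a Llama-style decoder of this shape would have. The helper name count_llama_params is hypothetical, and several details are assumptions because the post does not state them: no bias terms, as many key/value heads as attention heads, one weight vector per normalization layer, and (by default) an output head that shares weights with the embedding table.

```python
# Hypothetical helper, not the SupraLabs training code.
# Assumptions: no biases, key/value heads == attention heads,
# RMSNorm with one weight vector each, tied output head by default.

def count_llama_params(vocab_size, hidden_size, intermediate_size,
                       num_layers, tie_embeddings=True):
    embedding = vocab_size * hidden_size
    attention = 4 * hidden_size * hidden_size       # q, k, v, o projections
    mlp = 3 * hidden_size * intermediate_size       # gate, up, down projections
    norms = 2 * hidden_size                         # two norms per layer
    per_layer = attention + mlp + norms
    final_norm = hidden_size
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    total = embedding + num_layers * per_layer + final_norm + lm_head
    return {"embedding": embedding, "per_layer": per_layer, "total": total}


# Values copied from the config in this post.
config = dict(vocab_size=2048, hidden_size=48,
              intermediate_size=96, num_layers=3)
num_attention_heads = 4
head_dim = config["hidden_size"] // num_attention_heads  # size of each head

breakdown = count_llama_params(**config)
print("head_dim:", head_dim)
print(breakdown)
```

The embedding table alone is 2048 × 48 = 98,304 weights, so at this scale the total depends heavily on choices the post does not specify, such as whether the output head is tied to the embeddings. Treat the result as an estimate, not the official count.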

What the model CAN'T do

The model can't think, reason, or chat (yet), and it has no safety filters. It is only a base model that predicts the next word!
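
To make "predicts the next word" concrete, here is a minimal Python sketch of a single prediction step. It does not load the real model: the scores in toy_logits are made up, the 5-entry vocabulary is a stand-in for the real 2048 entries, and the function names are hypothetical. A base model outputs one score per vocabulary entry, the scores are turned into probabilities, and the most likely entry is picked.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)                       # subtract max for stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_next_token(logits):
    """Return the index of the most probable vocabulary entry."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


# Made-up scores for a toy vocabulary of 5 entries (assumption, not model output).
toy_logits = [0.1, 2.3, -1.0, 0.7, 1.9]
next_id = greedy_next_token(toy_logits)
print("predicted token id:", next_id)
```

Repeating this step, appending each predicted entry to the input, is all a base model does when it generates text. Nothing in the loop follows instructions or filters content.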

Plans for the future

  • Supra-10M - Base, Chat, Reasoning. To be trained on an RTX 5060 Ti 16GB, leveraging NVIDIA technologies and CUDA.
  • Supra-1M - Base, Chat, Reasoning. To be trained on a GTX 750 Ti 4GB, pushing the limits of optimization.

Final thought

SupraLabs is working hard to create the best open-source models for you!

#new #model #open-source