
Model Card: Nutral v2.1 Tiny

📌 Model Overview

Nutral v2.1 Tiny is a lightweight decoder-only Transformer language model built on the custom Nutral v2.1 architecture. It is designed for educational research, experimentation, chatbot development, and low-resource deployment.

The architecture uses RMSNorm, multi-head self-attention, and GELU feed-forward networks, and is trained with a causal language modeling objective.


🏗️ Architecture Specifications

| Parameter | Value |
|---|---|
| Model Name | Nutral v2.1 Tiny |
| Total Parameters | ~15.2 Million |
| Vocabulary Size | 50,257 |
| Tokenizer | GPT-2 |
| Hidden Size | 256 |
| Transformer Layers | 4 |
| Attention Heads | 4 |
| Context Length | 256 Tokens |
| Activation Function | GELU |
| Normalization | RMSNorm |
| Attention Type | Causal Self-Attention |
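As a rough sanity check on the specifications above, the parameter count can be estimated from the layer shapes. This is only a sketch: the feed-forward width (4x the hidden size), the absence of bias terms, and the `tie_lm_head` switch are assumptions, since the card does not state them.

```python
# Rough parameter estimate for the listed configuration.
# ASSUMPTIONS (not stated in the card): FFN inner width = 4 * hidden size,
# no bias terms in linear layers, one weight vector per RMSNorm.

def estimate_params(vocab=50257, hidden=256, layers=4, context=256,
                    ffn_mult=4, tie_lm_head=True):
    token_emb = vocab * hidden               # token embedding table
    pos_emb = context * hidden               # learned positional embeddings
    attention = 4 * hidden * hidden          # Q, K, V and output projections
    ffn = 2 * hidden * (ffn_mult * hidden)   # up- and down-projection
    norms = 2 * hidden                       # two RMSNorm weights per block
    block = attention + ffn + norms
    final_norm = hidden                      # RMSNorm before the LM head
    lm_head = 0 if tie_lm_head else vocab * hidden
    return token_emb + pos_emb + layers * block + final_norm + lm_head

tied = estimate_params(tie_lm_head=True)
untied = estimate_params(tie_lm_head=False)
print(f"tied LM head:   {tied / 1e6:.1f}M")
print(f"untied LM head: {untied / 1e6:.1f}M")
```

Under these assumptions the estimate is about 16.1M parameters with a tied output head and about 28.9M with an untied one. In both cases the 50,257 x 256 embedding table dominates. Neither figure reproduces ~15.2M exactly, so the real layer shapes may differ from the ones assumed here.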

⚙️ Training Details

Dataset

The model was trained on:

  • HuggingFaceH4/ultrachat_200k
  • Split: train_sft

Training Configuration

| Setting | Value |
|---|---|
| Target Training Tokens | ~100 Million |
| Learning Rate | 8e-4 |
| Weight Decay | 0.01 |
| Optimizer | AdamW Torch Fused |
| Batch Size | 8 |
| Gradient Accumulation | 4 |
| Effective Batch Size | 32 |
| Precision | FP16 |
| Max Training Steps | 6100 |
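The relationship between batch size, gradient accumulation, step count, and token budget can be worked through as below. This is a sketch that assumes every sequence is packed to the full 256-token context. `num_devices` is a hypothetical parameter, since the card does not state how many devices were used.

```python
# Token budget implied by the training configuration.
# ASSUMPTIONS: each sequence fills the whole 256-token context, and
# `num_devices` (hypothetical; not stated in the card) multiplies the
# per-device batch.

def tokens_seen(steps=6100, batch_size=8, grad_accum=4,
                context=256, num_devices=1):
    sequences_per_step = batch_size * grad_accum * num_devices
    return steps * sequences_per_step * context

effective_batch = 8 * 4   # batch size x gradient accumulation = 32
single = tokens_seen(num_devices=1)
double = tokens_seen(num_devices=2)
print(f"effective batch: {effective_batch}")
print(f"1 device:  {single / 1e6:.1f}M tokens")
print(f"2 devices: {double / 1e6:.1f}M tokens")
```

With these assumptions, 6100 steps cover roughly 50M tokens on one device and roughly 100M on two.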

🧠 Architecture Components

Embeddings

  • Token Embeddings
  • Learned Positional Embeddings

Transformer Block

Each Nutral Block contains:

  • RMSNorm
  • Multi-Head Self Attention
  • Scaled Dot Product Attention
  • Residual Connections
  • GELU Feed Forward Network
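The causal scaled dot-product attention step in the block can be illustrated for a single head in plain Python. This is a minimal sketch rather than the model's actual implementation. In the real model each of the 4 heads would presumably operate on 64 dimensions (256 / 4), which is an assumption, and the work would be done with batched PyTorch tensors.

```python
import math

def causal_attention(q, k, v):
    """Single-head causal scaled dot-product attention.

    q, k, v: lists of T vectors (lists of floats); q and k share dimension d.
    Position t attends only to positions 0..t.
    """
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Scaled scores against the current and earlier positions only;
        # skipping later positions is the causal mask.
        scores = [sum(a * b for a, b in zip(q[t], k[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        # Numerically stable softmax over the visible positions.
        m = max(scores)
        exps = [math.exp(x - m) for x in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the visible value vectors.
        out.append([sum(w * v[s][j] for s, w in enumerate(weights))
                    for j in range(len(v[0]))])
    return out

q = k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
y = causal_attention(q, k, v)
```

Because position 0 can only see itself, `y[0]` equals `v[0]`. Changing a later value vector leaves earlier outputs unchanged, which is the property that makes next-token training valid.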

Output Layer

  • RMSNorm
  • Linear Language Modeling Head
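The output path (RMSNorm followed by a linear head) can be sketched in plain Python as follows. The `eps` value, the toy weight matrix `toy_W`, and the greedy argmax at the end are assumptions made for illustration. The card specifies none of them.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root-mean-square, then scale by a learned weight.

    `eps` is an assumed stabiliser; the card does not give its value.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]

def lm_head(h, W):
    """Linear head: one logit per row of W (shape: vocab x hidden)."""
    return [sum(w * v for w, v in zip(row, h)) for row in W]

hidden = [3.0, 4.0]                                 # toy final hidden state
normed = rms_norm(hidden, weight=[1.0, 1.0])
toy_W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]        # toy 3-token vocabulary
logits = lm_head(normed, toy_W)
next_token = max(range(len(logits)), key=logits.__getitem__)  # greedy pick
```

Unlike LayerNorm, RMSNorm does not subtract the mean. It only rescales the vector so that its mean square is close to 1 before the learned weight is applied.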

🚀 Intended Use

Nutral v2.1 Tiny is suitable for:

  • Educational AI projects
  • Research experiments
  • Lightweight chatbot development
  • Fine-tuning demonstrations
  • Open-source AI learning

⚠️ Limitations

  • Small model size compared to modern LLMs
  • Limited reasoning capabilities
  • Limited factual knowledge
  • Short context window (256 tokens)
  • May generate inaccurate responses
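Because the positional embeddings are learned for 256 positions, the prompt plus generated tokens has to fit inside that window. One simple way to handle longer chat histories is to keep only the most recent tokens. The sketch below works on plain lists of token IDs. The helper name `fit_to_context` and the default `max_new_tokens=32` are hypothetical choices, not part of the model's code.

```python
CONTEXT_LENGTH = 256  # context length stated in the card

def fit_to_context(token_ids, max_new_tokens=32, context_length=CONTEXT_LENGTH):
    """Keep the most recent tokens so prompt + generation fits the window."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    budget = context_length - max_new_tokens   # tokens left for the prompt
    return token_ids[-budget:]

prompt = list(range(300))        # stand-in for GPT-2 token IDs
trimmed = fit_to_context(prompt)
print(len(trimmed), trimmed[0], trimmed[-1])
```

Dropping the oldest tokens loses early conversation turns, so an application may prefer to summarise or otherwise compress the history instead.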

🛠️ Framework

Built using:

  • PyTorch
  • Hugging Face Transformers
  • Hugging Face Datasets
  • Accelerate

📜 License

Apache 2.0


🀝 Open Source

Nutral v2.1 Tiny is released as an open-source project for research, education, and community-driven development.

Safetensors: model size 28.9M params, tensor type F32