Model Card: Nutral v2.1 Tiny

📌 Model Overview

Nutral v2.1 Tiny is a lightweight decoder-only Transformer language model built using the custom Nutral v2.1 Architecture. The model is designed for educational research, experimentation, chatbot development, and low-resource deployment.

The architecture uses RMSNorm, Multi-Head Self Attention, GELU feed-forward networks, and causal language modeling objectives.

🏗️ Architecture Specifications

Parameter	Value
Model Name	Nutral v2.1 Tiny
Total Parameters	~15.2 Million
Vocabulary Size	50,257
Tokenizer	GPT-2
Hidden Size	256
Transformer Layers	4
Attention Heads	4
Context Length	256 Tokens
Activation Function	GELU
Normalization	RMSNorm
Attention Type	Causal Self-Attention

⚙️ Training Details

Dataset

The model was trained on:

HuggingFaceH4/ultrachat_200k
Split: train_sft

Training Configuration

Setting	Value
Target Training Tokens	~100 Million
Learning Rate	8e-4
Weight Decay	0.01
Optimizer	AdamW Torch Fused
Batch Size	8
Gradient Accumulation	4
Effective Batch Size	32
Precision	FP16
Max Training Steps	6100

🧠 Architecture Components

Embeddings

Token Embeddings
Learned Positional Embeddings

Transformer Block

Each Nutral Block contains:

RMSNorm
Multi-Head Self Attention
Scaled Dot Product Attention
Residual Connections
GELU Feed Forward Network

Output Layer

RMSNorm
Linear Language Modeling Head

🚀 Intended Use

Nutral v2.1 Tiny is suitable for:

Educational AI projects
Research experiments
Lightweight chatbot development
Fine-tuning demonstrations
Open-source AI learning

⚠️ Limitations

Small model size compared to modern LLMs
Limited reasoning capabilities
Limited factual knowledge
Short context window (256 tokens)
May generate inaccurate responses

🛠️ Framework

Built using:

PyTorch
Hugging Face Transformers
Hugging Face Datasets
Accelerate

📜 License

Apache 2.0

🤝 Open Source

Nutral v2.1 Tiny is released as an open-source project for research, education, and community-driven development.

Downloads last month: 65

Safetensors

Model size

28.9M params

Tensor type

F32

Model tree for Nebulixlabs/Nutral-v2.1-Tiny

Base model

Nebulixlabs/Nutral-v1-Tiny

Quantized

Nebulixlabs/Nutral-v2-Tiny

Finetuned

(1)

this model

Nebulixlabs
/

Nutral-v2.1-Tiny