Operations Manager-SLM: Role-Based Small Language Model

A LLaMA-style transformer (~989.0M parameters, ~0.99B) trained from scratch for the Operations Manager role. The maximum context length is configured at 5M tokens with RoPE positional encoding; gradient checkpointing was used during training to reduce memory use.

Architecture

Component	Value
Architecture	LLaMA-style (RoPE + RMSNorm + SwiGLU)
Parameters	~989.0M (~0.99B)
Layers	32
Heads	20
Embedding	1600
Max Context	5,000,000 tokens
Max Output	5,000,000 tokens
Vocab	1,625 BPE
Model Size	~4 GB (fp32)
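
The figures in the table can be cross-checked with a little arithmetic. The sketch below is only a sanity check: it assumes fp32 means 4 bytes per parameter and that the embedding width is split evenly across the attention heads. The variable names are illustrative and do not come from the model's code.

```python
# Values copied from the architecture table.
n_params = 989.0e6      # ~989.0M parameters
embed_dim = 1600
n_heads = 20
vocab_size = 1625
bytes_per_param = 4     # assumption: fp32 stores 4 bytes per parameter

# Per-head width, assuming the embedding is split evenly across heads.
head_dim = embed_dim // n_heads

# Raw weight size in gigabytes (1 GB = 1e9 bytes).
size_gb = n_params * bytes_per_param / 1e9

# Size of a single token-embedding matrix.
embedding_params = vocab_size * embed_dim

print(head_dim, round(size_gb, 2), embedding_params)
```

Under these assumptions the head width is 80, the weights come to about 3.96 GB (consistent with the ~4 GB listed), and one embedding matrix accounts for 2.6M of the parameters.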

Training

Best eval loss: 3.082059216499329
Trained with gradient checkpointing on Apple M4 (MPS)
3 epochs, batch_size=1, grad_accum=16
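
The training settings above imply an effective batch size, and the eval loss can be turned into a perplexity. This is a minimal sketch that assumes the reported loss is the mean per-token cross-entropy in nats; the card does not state this explicitly.

```python
import math

eval_loss = 3.082059216499329  # best eval loss reported above
batch_size = 1
grad_accum = 16

# Number of sequences contributing to each optimizer step.
effective_batch = batch_size * grad_accum

# Perplexity, assuming the loss is mean per-token cross-entropy in nats.
perplexity = math.exp(eval_loss)

print(effective_batch, round(perplexity, 1))
```

With these numbers each optimizer step sees 16 sequences, and the perplexity comes out at roughly 21.8.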

Usage

from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer

# Download the weights and tokenizer from the Hub (cached locally after the first call).
repo_id = "sathishphdai/operations-manager-slm-5m"
model_path = hf_hub_download(repo_id, "model.safetensors")
tokenizer_path = hf_hub_download(repo_id, "operations_manager_tokenizer.json")

# Load the BPE tokenizer; model_path points at the raw safetensors weights.
tokenizer = Tokenizer.from_file(tokenizer_path)
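
This card does not include the model class, so the downloaded weights cannot be instantiated from the snippet above alone. As a hedged sketch, the checkpoint can still be inspected: the helper below loads the tensors with `safetensors.torch.load_file` (which requires PyTorch) and counts parameters. The function names are hypothetical helpers, not part of the repository.

```python
import math


def summarize_shapes(shapes):
    """Return (number of tensors, total parameter count) for a name -> shape mapping."""
    total = sum(math.prod(shape) for shape in shapes.values())
    return len(shapes), total


def load_shapes(model_path):
    """Read every tensor from a safetensors file and return its shape by name."""
    # Imported lazily so summarize_shapes works without safetensors/torch installed.
    from safetensors.torch import load_file

    state_dict = load_file(model_path)  # loads all weights into memory
    return {name: tuple(tensor.shape) for name, tensor in state_dict.items()}
```

Calling `summarize_shapes(load_shapes(model_path))` with the path downloaded above should report a total close to the parameter count listed in the architecture table. Text can be tokenized with `tokenizer.encode("...").ids` and mapped back with `tokenizer.decode(ids)`.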
