# SDLC-SLM: Software Development Life Cycle Small Language Model
A LLaMA-style transformer (~33.9M parameters) trained from scratch on a comprehensive Software Development Life Cycle (SDLC) knowledge base. Supports up to 1M token context via RoPE.
## Model Description
SDLC-SLM is a domain-specific SLM designed to answer questions about all phases of software development:
- Planning & Requirements → SDLC models, user stories, SRS
- System Design → Architecture patterns, microservices, API design
- Implementation → Coding best practices, SOLID, Git workflows
- Testing → Unit/integration/E2E, TDD, test automation
- Deployment → CI/CD, containers, Kubernetes, IaC
- Maintenance → Monitoring, incident management, SRE
- Security → DevSecOps, OWASP, secure coding
- Project Management → Agile, Scrum, Kanban
## Architecture
| Component | Value |
|---|---|
| Architecture | LLaMA-style (RoPE + RMSNorm + SwiGLU) |
| Parameters | ~33.9M |
| Layers | 8 |
| Attention Heads | 8 |
| Embedding Dim | 512 |
| Training Context | 512 tokens |
| Max Context (RoPE) | 1,000,000 tokens |
| Vocabulary | 16,000 BPE tokens |
| FFN Activation | SwiGLU |
| Normalization | RMSNorm |
| Position Encoding | RoPE (theta=500000.0) |
| Best Eval Loss | 0.1410 |
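The RMSNorm and SwiGLU components listed in the table can be sketched in a few lines of PyTorch. This is an illustrative re-implementation, not the repo's code; the FFN hidden size (1376) and the `eps` value are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square normalization: no mean-centering, no bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        # Scale each vector by the reciprocal of its RMS, then apply a learned gain.
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    """Gated FFN: silu(x @ W_gate) * (x @ W_up), projected back down."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x):
        return self.down(F.silu(self.gate(x)) * self.up(x))

x = torch.randn(2, 16, 512)              # (batch, seq, embed_dim)
y = SwiGLU(512, 1376)(RMSNorm(512)(x))   # pre-norm then FFN, as in LLaMA blocks
print(y.shape)                           # torch.Size([2, 16, 512])
```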
## Training
- Data: ~764K tokens from 116 Wikipedia SDLC articles + synthetic Q&A
- Tokenizer: 16K BPE trained on domain corpus
- Optimizer: AdamW (cosine LR, linear warmup)
- Hardware: Apple Silicon MPS
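The schedule above (linear warmup into cosine decay) can be written as a single function of the step count. The peak LR, warmup length, and floor below are illustrative assumptions, not the values used to train this model:

```python
import math

def lr_at(step, max_steps, peak_lr=3e-4, warmup=200, min_lr=3e-5):
    """Linear warmup to peak_lr over `warmup` steps, then cosine decay to min_lr."""
    if step < warmup:
        return peak_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, max_steps - warmup)  # 0 → 1 after warmup
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

In PyTorch this pairs naturally with `torch.optim.AdamW` via `torch.optim.lr_scheduler.LambdaLR`.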
## Usage

```python
import torch
from config import cfg
from model import SDLCSLM
from tokenizers import Tokenizer

# Load the checkpoint and tokenizer
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
model = SDLCSLM()
model.load_state_dict(checkpoint)
model.eval()
tokenizer = Tokenizer.from_file("tokenizer.json")

# Generate a response to a chat-formatted prompt
prompt = "<bos><|system|>You are SDLC-SLM.<|user|>What is CI/CD?<|assistant|>"
ids = torch.tensor([tokenizer.encode(prompt).ids])
out = model.generate(ids, max_new_tokens=256, temperature=0.8, top_k=50)
print(tokenizer.decode(out[0].tolist()))
```
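The `generate` method above is project-specific, but the sampling step its `temperature` and `top_k` arguments imply can be sketched generically. `sample_next` is a hypothetical helper, not part of this repo:

```python
import torch

@torch.no_grad()
def sample_next(logits, temperature=0.8, top_k=50):
    """Draw one token id from the final-position logits.

    Temperature scales the logits, then sampling is restricted to the
    top_k highest-scoring tokens.
    """
    logits = logits / temperature
    topk = torch.topk(logits, top_k, dim=-1)          # keep only the k best logits
    probs = torch.softmax(topk.values, dim=-1)        # renormalize over those k
    idx = torch.multinomial(probs, num_samples=1)     # sample within the top-k set
    return topk.indices.gather(-1, idx)               # map back to vocab ids

logits = torch.randn(1, 16000)   # fake logits over the 16K-token vocabulary
next_id = sample_next(logits)
print(next_id.shape)             # torch.Size([1, 1])
```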
## Limitations
- Nano-scale model for educational/demonstration purposes
- Best suited for SDLC domain questions only
- Responses may be repetitive due to small model size
- Not suitable for production without further fine-tuning
## License

MIT. Built entirely from scratch (architecture, tokenizer, data, training loop).