SDLC-SLM: Software Development Life Cycle Small Language Model

A LLaMA-style transformer (~33.9M parameters) trained from scratch on a comprehensive Software Development Life Cycle (SDLC) knowledge base. Supports up to 1M token context via RoPE.

Model Description

SDLC-SLM is a domain-specific SLM designed to answer questions about all phases of software development:

  • Planning & Requirements – SDLC models, user stories, SRS
  • System Design – Architecture patterns, microservices, API design
  • Implementation – Coding best practices, SOLID, Git workflows
  • Testing – Unit/integration/E2E, TDD, test automation
  • Deployment – CI/CD, containers, Kubernetes, IaC
  • Maintenance – Monitoring, incident management, SRE
  • Security – DevSecOps, OWASP, secure coding
  • Project Management – Agile, Scrum, Kanban

Architecture

Component            Value
Architecture         LLaMA-style (RoPE + RMSNorm + SwiGLU)
Parameters           ~33.9M
Layers               8
Attention Heads      8
Embedding Dim        512
Training Context     512 tokens
Max Context (RoPE)   1,000,000 tokens
Vocabulary           16,000 BPE tokens
FFN Activation       SwiGLU
Normalization        RMSNorm
Position Encoding    RoPE (theta = 500,000)
Best Eval Loss       ~0.1410
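
The ~33.9M figure can be sanity-checked from the configuration above. The SwiGLU hidden size is not listed; assuming tied input/output embeddings and a hidden size of 1408 (roughly 8/3 × 512 rounded to a multiple of 128, a common LLaMA convention, but an assumption here), the count works out:

```python
# Back-of-envelope parameter count for the configuration in the table.
# Assumed (not stated in the card): tied embeddings, SwiGLU hidden size 1408.
d, layers, vocab, hidden = 512, 8, 16_000, 1408

emb = vocab * d                       # token embedding (tied with output head)
attn = 4 * d * d                      # Wq, Wk, Wv, Wo per layer
ffn = 3 * d * hidden                  # SwiGLU: gate, up, and down projections
norms = 2 * d                         # two RMSNorm scales per layer
per_layer = attn + ffn + norms

total = emb + layers * per_layer + d  # + final RMSNorm
print(f"{total / 1e6:.1f}M")          # -> 33.9M
```

Untied embeddings would add another ~8.2M parameters, so the stated total is consistent with weight tying.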

Training

  • Data: ~764K tokens from 116 Wikipedia SDLC articles + synthetic Q&A
  • Tokenizer: 16K BPE trained on domain corpus
  • Optimizer: AdamW (cosine LR, linear warmup)
  • Hardware: Apple Silicon MPS
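
The warmup-plus-cosine schedule can be sketched as a plain function of the step index. The peak rate, warmup length, and total steps below are illustrative values, not the ones used in training:

```python
import math

def lr_at(step, peak=3e-4, min_lr=3e-5, warmup=200, total=10_000):
    """Linear warmup from 0 to `peak`, then cosine decay to `min_lr`."""
    if step < warmup:
        return peak * step / warmup
    progress = (step - warmup) / (total - warmup)  # 0 at end of warmup, 1 at end
    return min_lr + 0.5 * (peak - min_lr) * (1 + math.cos(math.pi * progress))

# The rate ramps up to `peak` at the end of warmup and bottoms out at `min_lr`.
```

In practice this is wired into AdamW via a per-step `param_group["lr"]` update or a `LambdaLR` scheduler.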

Usage

import torch
from config import cfg
from model import SDLCSLM
from tokenizers import Tokenizer

# Load
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
model = SDLCSLM()
model.load_state_dict(checkpoint)
model.eval()

tokenizer = Tokenizer.from_file("tokenizer.json")

# Generate (inference only, so disable gradient tracking)
prompt = "<bos><|system|>You are SDLC-SLM.<|user|>What is CI/CD?<|assistant|>"
ids = torch.tensor([tokenizer.encode(prompt).ids])
with torch.no_grad():
    out = model.generate(ids, max_new_tokens=256, temperature=0.8, top_k=50)
print(tokenizer.decode(out[0].tolist()))
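
The `generate` call above samples with temperature scaling and top-k filtering. A minimal, dependency-free sketch of one such sampling step (the actual implementation lives in `model.py`; this is illustrative only):

```python
import math
import random

def top_k_sample(logits, k=50, temperature=0.8, rng=random):
    """Sample one token id: scale by temperature, keep the k largest, softmax."""
    scaled = [x / temperature for x in logits]
    top = sorted(range(len(scaled)), key=scaled.__getitem__, reverse=True)[:k]
    m = max(scaled[i] for i in top)                    # for numerical stability
    weights = [math.exp(scaled[i] - m) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

# With k=2, only the two highest-logit tokens (ids 1 and 3) can ever be drawn.
print(top_k_sample([0.1, 5.0, -2.0, 3.0], k=2))
```

Lower temperature sharpens the distribution toward the argmax; smaller k prunes the long tail that small models are prone to sampling from.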

Limitations

  • Nano-scale model for educational/demonstration purposes
  • Best suited for SDLC domain questions only
  • Responses may be repetitive due to the small model size
  • Not suitable for production without further fine-tuning

License

MIT. Built entirely from scratch (architecture, tokenizer, data, training loop).
