# SDLC-SLM: Software Development Life Cycle Small Language Model
A LLaMA-style transformer (~33.9M parameters) trained from scratch on a comprehensive Software Development Life Cycle (SDLC) knowledge base. Supports up to 1M token context via RoPE.
## Model Description
SDLC-SLM is a domain-specific SLM designed to answer questions about all phases of software development:
- Planning & Requirements → SDLC models, user stories, SRS
- System Design → Architecture patterns, microservices, API design
- Implementation → Coding best practices, SOLID, Git workflows
- Testing → Unit/integration/E2E, TDD, test automation
- Deployment → CI/CD, containers, Kubernetes, IaC
- Maintenance → Monitoring, incident management, SRE
- Security → DevSecOps, OWASP, secure coding
- Project Management → Agile, Scrum, Kanban
## Architecture
| Component | Value |
|---|---|
| Architecture | LLaMA-style (RoPE + RMSNorm + SwiGLU) |
| Parameters | ~33.9M |
| Layers | 8 |
| Attention Heads | 8 |
| Embedding Dim | 512 |
| Training Context | 512 tokens |
| Max Context (RoPE) | 1,000,000 tokens |
| Vocabulary | 16,000 BPE tokens |
| FFN Activation | SwiGLU |
| Normalization | RMSNorm |
| Position Encoding | RoPE (theta=500000.0) |
| Best Eval Loss | 0.1410 |
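The RMSNorm and SwiGLU components listed in the table can be sketched in a few lines of PyTorch. This is an illustrative re-implementation, not the repo's code; the FFN hidden size (1376) and the `eps` value are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square normalization: no mean-centering, no bias."""
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x):
        # Scale each vector by the reciprocal of its RMS, then apply a learned gain.
        rms = torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return self.weight * (x * rms)

class SwiGLU(nn.Module):
    """Gated FFN: silu(x @ W_gate) * (x @ W_up), projected back down."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.gate = nn.Linear(dim, hidden, bias=False)
        self.up = nn.Linear(dim, hidden, bias=False)
        self.down = nn.Linear(hidden, dim, bias=False)

    def forward(self, x):
        return self.down(F.silu(self.gate(x)) * self.up(x))

x = torch.randn(2, 16, 512)              # (batch, seq, embed_dim)
y = SwiGLU(512, 1376)(RMSNorm(512)(x))   # pre-norm then FFN, as in LLaMA blocks
print(y.shape)                           # torch.Size([2, 16, 512])
```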
## Training
- Data: ~764K tokens from 116 Wikipedia SDLC articles + synthetic Q&A
- Tokenizer: 16K BPE trained on domain corpus
- Optimizer: AdamW (cosine LR, linear warmup)
- Hardware: Apple Silicon MPS
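The schedule above (linear warmup into cosine decay) can be written as a single function of the step count. The peak LR, warmup length, and floor below are illustrative assumptions, not the values used to train this model:

```python
import math

def lr_at(step, max_steps, peak_lr=3e-4, warmup=200, min_lr=3e-5):
    """Linear warmup to peak_lr over `warmup` steps, then cosine decay to min_lr."""
    if step < warmup:
        return peak_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, max_steps - warmup)  # 0 → 1 after warmup
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

In PyTorch this pairs naturally with `torch.optim.AdamW` via `torch.optim.lr_scheduler.LambdaLR`.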
## Usage

```python
import torch
from config import cfg
from model import SDLCSLM
from tokenizers import Tokenizer

# Load the checkpoint and tokenizer
checkpoint = torch.load("pytorch_model.bin", map_location="cpu")
model = SDLCSLM()
model.load_state_dict(checkpoint)
model.eval()
tokenizer = Tokenizer.from_file("tokenizer.json")

# Generate a response to a chat-formatted prompt
prompt = "<bos><|system|>You are SDLC-SLM.<|user|>What is CI/CD?<|assistant|>"
ids = torch.tensor([tokenizer.encode(prompt).ids])
out = model.generate(ids, max_new_tokens=256, temperature=0.8, top_k=50)
print(tokenizer.decode(out[0].tolist()))
```
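The `generate` method above is project-specific, but the sampling step its `temperature` and `top_k` arguments imply can be sketched generically. `sample_next` is a hypothetical helper, not part of this repo:

```python
import torch

@torch.no_grad()
def sample_next(logits, temperature=0.8, top_k=50):
    """Draw one token id from the final-position logits.

    Temperature scales the logits, then sampling is restricted to the
    top_k highest-scoring tokens.
    """
    logits = logits / temperature
    topk = torch.topk(logits, top_k, dim=-1)          # keep only the k best logits
    probs = torch.softmax(topk.values, dim=-1)        # renormalize over those k
    idx = torch.multinomial(probs, num_samples=1)     # sample within the top-k set
    return topk.indices.gather(-1, idx)               # map back to vocab ids

logits = torch.randn(1, 16000)   # fake logits over the 16K-token vocabulary
next_id = sample_next(logits)
print(next_id.shape)             # torch.Size([1, 1])
```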
## Limitations
- Nano-scale model for educational/demonstration purposes
- Best suited for SDLC domain questions only
- Responses may be repetitive due to small model size
- Not suitable for production without further fine-tuning
## License

MIT. Built entirely from scratch (architecture, tokenizer, data, training loop).