---
language:
- en
license: mit
tags:
- cybersecurity
- llm
- from-scratch
- pytorch
pipeline_tag: text-generation
---
# CyberLLM-350M
A 350M-parameter cybersecurity language model built entirely from scratch.
## Model Details
- **Architecture**: LLaMA-3 style decoder-only transformer
- **Parameters**: 303.4M
- **Training Data**: 5B tokens total (3.2B security-focused, plus general-domain text)
- **Final Loss**: 3.80 (pretrain) → 1.28 (SFT)
- **Vocab**: 32,000 tokens (custom SentencePiece)
- **Context**: 2,048 tokens
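To put the loss and data figures above in more familiar terms, the following sketch converts the two reported losses to perplexity and computes the tokens-per-parameter ratio. It assumes the losses are mean per-token cross-entropy in nats, which the card does not state, and it uses the 303.4M parameter count from the list. The function names are illustrative only. Because the pretrain and SFT losses are presumably measured on different data, the two perplexities are not directly comparable.

```python
import math

# Figures copied from the Model Details list.
PRETRAIN_LOSS = 3.80   # assumed: mean cross-entropy per token, in nats
SFT_LOSS = 1.28        # assumed: same unit, measured on the SFT data
TRAIN_TOKENS = 5e9
PARAMS = 303.4e6


def perplexity(loss_nats: float) -> float:
    """Perplexity corresponding to a mean per-token cross-entropy in nats."""
    return math.exp(loss_nats)


def tokens_per_param(tokens: float, params: float) -> float:
    """Number of training tokens seen per model parameter."""
    return tokens / params


print(f"pretrain perplexity: {perplexity(PRETRAIN_LOSS):.1f}")
print(f"SFT perplexity:      {perplexity(SFT_LOSS):.2f}")
print(f"tokens per param:    {tokens_per_param(TRAIN_TOKENS, PARAMS):.1f}")
```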
## Training
Pretrained from random initialization on cybersecurity-weighted data including
Trend Micro Primus-FineWeb, Stack Exchange security sites, ArXiv cs.CR,
MITRE ATT&CK, NIST SP 800 series, and OWASP documentation.
Fine-tuned with 3,750 cybersecurity instruction-response pairs.
## Usage
```bash
# Download and chat
git clone https://github.com/Omkarth/CyberLLM.git
cd CyberLLM
pip install huggingface_hub torch sentencepiece pyyaml
python -c "
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id='Omk07/CyberLLM-350M', filename='model.pt', local_dir='checkpoints')
hf_hub_download(repo_id='Omk07/CyberLLM-350M', filename='config.yaml', local_dir='checkpoints')
hf_hub_download(repo_id='Omk07/CyberLLM-350M', filename='cybersec_tokenizer.model', local_dir='tokenizer')
"
python training/chat.py --model checkpoints/model.pt --question "What is SQL injection?"
```
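The download step can also live in a small Python script instead of an inline `python -c` string. This is a minimal sketch that reuses the repo ID, filenames, and target directories from the commands above. The names `FILES`, `expected_paths`, and `download_all` are hypothetical helpers, not part of the CyberLLM repository.

```python
from pathlib import Path

REPO_ID = "Omk07/CyberLLM-350M"

# (filename in the Hub repo, local directory) pairs, as in the commands above.
FILES = [
    ("model.pt", "checkpoints"),
    ("config.yaml", "checkpoints"),
    ("cybersec_tokenizer.model", "tokenizer"),
]


def expected_paths() -> list[Path]:
    """Where each file should end up relative to the current directory."""
    return [Path(local_dir) / filename for filename, local_dir in FILES]


def download_all(repo_id: str = REPO_ID) -> list[Path]:
    """Fetch every file in FILES from the Hub and return the local paths."""
    # Imported here so the helpers above work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    paths = []
    for filename, local_dir in FILES:
        local_path = hf_hub_download(
            repo_id=repo_id, filename=filename, local_dir=local_dir
        )
        paths.append(Path(local_path))
    return paths
```

Calling `download_all()` from the root of the cloned repository should leave the files where `training/chat.py` is pointed in the command above.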
## Limitations
At 350M parameters the model is small: it handles common security topics but
struggles with niche technical details. It is not a production security tool.
## Author
Omkar Thombre — Master of Computer Science, University of Adelaide