---
language:
- en
license: mit
tags:
- cybersecurity
- llm
- from-scratch
- pytorch
pipeline_tag: text-generation
---

# CyberLLM-350M

A 350M-class cybersecurity language model (303.4M parameters) built entirely from scratch.

## Model Details

- **Architecture**: LLaMA-3-style decoder-only transformer
- **Parameters**: 303.4M
- **Training Data**: 5B tokens (3.2B security plus general-domain text)
- **Final Loss**: 3.80 (pretrain) → 1.28 (SFT)
- **Vocab**: 32,000 tokens (custom SentencePiece)
- **Context**: 2,048 tokens

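
The loss figures above are easier to interpret as perplexities. The sketch below converts them with `exp(loss)` and compares them against the loss of a uniform guess over the 32,000-token vocabulary. It assumes the reported values are mean per-token cross-entropy in nats (the usual PyTorch convention); the card does not state this, so the results are illustrative only. The pretraining and SFT losses are also measured on different data, so they are not directly comparable.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(loss_nats)


# Values copied from the Model Details list.
# Assumption: both losses are mean per-token cross-entropy in nats.
VOCAB_SIZE = 32_000
PRETRAIN_LOSS = 3.80
SFT_LOSS = 1.28

# Loss of a model that assigns equal probability to every token.
uniform_loss = math.log(VOCAB_SIZE)

pretrain_ppl = perplexity(PRETRAIN_LOSS)
sft_ppl = perplexity(SFT_LOSS)

print(f"uniform baseline loss: {uniform_loss:.2f} nats")
print(f"pretrain perplexity:   {pretrain_ppl:.1f}")
print(f"SFT perplexity:        {sft_ppl:.2f}")
```
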
## Training

The model was pretrained from random initialization on a cybersecurity-weighted
corpus that includes Trend Micro Primus-FineWeb, Stack Exchange security sites,
ArXiv cs.CR, MITRE ATT&CK, the NIST SP 800 series, and OWASP documentation.

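
One common way to build a domain-weighted mix is to sample each batch's source in proportion to its token budget. The following is a minimal sketch of that idea, not the author's data pipeline. It assumes the 5B-token budget splits into 3.2B security tokens and a general-domain remainder, and the names `MIX`, `mixture_weights`, and `sample_source` are hypothetical.

```python
import random

# Token budgets in billions.
# Assumption: 5B total = 3.2B security tokens + a general-domain remainder.
TOTAL_B = 5.0
SECURITY_B = 3.2
MIX = {"security": SECURITY_B, "general": TOTAL_B - SECURITY_B}


def mixture_weights(budgets: dict[str, float]) -> dict[str, float]:
    """Normalise per-source token budgets into sampling probabilities."""
    total = sum(budgets.values())
    return {name: tokens / total for name, tokens in budgets.items()}


def sample_source(weights: dict[str, float], rng: random.Random) -> str:
    """Pick the source of the next batch according to the weights."""
    names = list(weights)
    probs = [weights[name] for name in names]
    return rng.choices(names, weights=probs, k=1)[0]


weights = mixture_weights(MIX)

# Draw many batches with a fixed seed and check the empirical share.
rng = random.Random(0)
draws = [sample_source(weights, rng) for _ in range(10_000)]
security_share = draws.count("security") / len(draws)

print(f"target security weight: {weights['security']:.2f}")
print(f"empirical share:        {security_share:.3f}")
```
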
It was then fine-tuned (SFT) on 3,750 cybersecurity instruction-response pairs.

## Usage

```bash
# Clone the repository, which provides training/chat.py
git clone https://github.com/Omkarth/CyberLLM.git
cd CyberLLM

# Install dependencies
pip install huggingface_hub torch sentencepiece pyyaml

# Download the checkpoint, config, and tokenizer from the Hugging Face Hub
python -c "
from huggingface_hub import hf_hub_download
hf_hub_download(repo_id='Omk07/CyberLLM-350M', filename='model.pt', local_dir='checkpoints')
hf_hub_download(repo_id='Omk07/CyberLLM-350M', filename='config.yaml', local_dir='checkpoints')
hf_hub_download(repo_id='Omk07/CyberLLM-350M', filename='cybersec_tokenizer.model', local_dir='tokenizer')
"

# Ask the model a question
python training/chat.py --model checkpoints/model.pt --question "What is SQL injection?"
```
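
The download step can also be written as a small Python module instead of an inline `python -c` string. This is a sketch under stated assumptions: the repository ID, filenames, and target directories are copied from the snippet above, while `REPO_ID`, `FILES`, and `fetch_all` are hypothetical names that do not come from the CyberLLM repository. The `download` parameter exists only so the function can be exercised without network access.

```python
from pathlib import Path
from typing import Callable, Optional

REPO_ID = "Omk07/CyberLLM-350M"

# (filename on the Hub, local directory) pairs from the shell snippet.
FILES = [
    ("model.pt", "checkpoints"),
    ("config.yaml", "checkpoints"),
    ("cybersec_tokenizer.model", "tokenizer"),
]


def fetch_all(download: Optional[Callable[..., str]] = None) -> list[Path]:
    """Download every entry in FILES and return the local paths.

    `download` defaults to huggingface_hub.hf_hub_download.
    """
    if download is None:
        # Imported lazily so this module loads without huggingface_hub.
        from huggingface_hub import hf_hub_download as download
    return [
        Path(download(repo_id=REPO_ID, filename=name, local_dir=local_dir))
        for name, local_dir in FILES
    ]
```

Calling `fetch_all()` with no arguments from the `CyberLLM` directory should leave the files where `training/chat.py` is pointed in the command above.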

## Limitations

With 303.4M parameters, the model is small: it handles common security topics
but struggles with niche technical details. It is not a production security tool.

## Author

Omkar Thombre, Master of Computer Science, University of Adelaide