Omk07 committed on
Commit 45698b3 · verified · 1 Parent(s): 74a698b

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +59 -0
README.md ADDED
@@ -0,0 +1,59 @@
+ ---
+ language:
+ - en
+ license: mit
+ tags:
+ - cybersecurity
+ - llm
+ - from-scratch
+ - pytorch
+ pipeline_tag: text-generation
+ ---
+
+ # CyberLLM-350M
+
+ A 350M-class cybersecurity language model (303.4M parameters) built entirely from scratch.
+
+ ## Model Details
+
+ - **Architecture**: LLaMA-3 style decoder-only transformer
+ - **Parameters**: 303.4M
+ - **Training Data**: 5B tokens (3.2B security, remainder general)
+ - **Final Loss**: 3.80 (pretrain) → 1.28 (SFT)
+ - **Vocabulary**: 32,000 tokens (custom SentencePiece)
+ - **Context Length**: 2,048 tokens
+
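The figures in the list above can be put in more familiar terms with a little arithmetic. The sketch below assumes the reported losses are mean per-token cross-entropy values in nats (the model card does not say so explicitly), in which case perplexity is simply `exp(loss)`. It also computes the ratio of training tokens to parameters. The constant and function names are illustrative and not taken from the CyberLLM code.

```python
import math

# Figures quoted in the Model Details list.
PRETRAIN_LOSS = 3.80       # final pretraining loss
SFT_LOSS = 1.28            # final supervised fine-tuning loss
TRAINING_TOKENS = 5e9      # 5B tokens
PARAMETERS = 303.4e6       # 303.4M parameters


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy to perplexity.

    Assumption: the loss is measured in nats (natural log), which is the
    PyTorch default for cross-entropy but is not stated in the model card.
    """
    return math.exp(loss_nats)


def tokens_per_parameter(tokens: float, parameters: float) -> float:
    """Ratio of training tokens to model parameters."""
    return tokens / parameters


if __name__ == "__main__":
    print(f"pretrain perplexity: {perplexity(PRETRAIN_LOSS):.1f}")
    print(f"SFT perplexity:      {perplexity(SFT_LOSS):.1f}")
    print(f"tokens per parameter: {tokens_per_parameter(TRAINING_TOKENS, PARAMETERS):.1f}")
```

Under that assumption the pretraining loss corresponds to a perplexity of about 44.7 and the SFT loss to about 3.6, with roughly 16.5 training tokens per parameter. The two losses are presumably measured on different data (pretraining corpus versus instruction pairs), so the drop should not be read as a like-for-like improvement.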
+ ## Training
+
+ Pretrained from random initialization on cybersecurity-weighted data, including
+ Trend Micro Primus-FineWeb, Stack Exchange security sites, ArXiv cs.CR,
+ MITRE ATT&CK, NIST SP 800 series, and OWASP documentation.
+
+ Then fine-tuned (SFT) on 3,750 cybersecurity instruction-response pairs.
+
+ ## Usage
+
+ ```bash
+ # Clone the repository, which provides training/chat.py
+ git clone https://github.com/Omkarth/CyberLLM.git
+ cd CyberLLM
+
+ # Install dependencies
+ pip install huggingface_hub torch sentencepiece pyyaml
+
+ # Download the checkpoint, config, and tokenizer from the Hugging Face Hub
+ python -c "
+ from huggingface_hub import hf_hub_download
+ hf_hub_download(repo_id='Omk07/CyberLLM-350M', filename='model.pt', local_dir='checkpoints')
+ hf_hub_download(repo_id='Omk07/CyberLLM-350M', filename='config.yaml', local_dir='checkpoints')
+ hf_hub_download(repo_id='Omk07/CyberLLM-350M', filename='cybersec_tokenizer.model', local_dir='tokenizer')
+ "
+
+ # Ask the model a single question
+ python training/chat.py --model checkpoints/model.pt --question "What is SQL injection?"
+ ```
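
The inline `python -c` download step can also be kept as a small script, which is easier to rerun and extend. The sketch below is one possible way to do that, not part of the CyberLLM repository: `REPO_ID`, `FILES`, `expected_paths`, and `download_all` are hypothetical names, while the repository id, filenames, and target directories are copied from the command above. It assumes `huggingface_hub` is installed and that it is run from the `CyberLLM` checkout.

```python
from pathlib import Path

REPO_ID = "Omk07/CyberLLM-350M"

# (filename on the Hub, local directory) pairs, copied from the shell command.
FILES = [
    ("model.pt", "checkpoints"),
    ("config.yaml", "checkpoints"),
    ("cybersec_tokenizer.model", "tokenizer"),
]


def expected_paths(files=FILES):
    """Return where each file should end up, without touching the network."""
    return [Path(local_dir) / filename for filename, local_dir in files]


def download_all(repo_id=REPO_ID, files=FILES):
    """Download every listed file and return the local paths."""
    # Imported lazily so the helpers above work without the dependency installed.
    from huggingface_hub import hf_hub_download

    return [
        Path(hf_hub_download(repo_id=repo_id, filename=filename, local_dir=local_dir))
        for filename, local_dir in files
    ]
```

Calling `download_all()` performs the same three downloads as the shell snippet. `expected_paths()` can be used beforehand to check which files are already present and skip the download if none are missing.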
+
+ ## Limitations
+
+ The model is small (303.4M parameters): it handles common security topics but
+ struggles with niche technical details. It is not a production security tool.
+
+ ## Author
+
+ Omkar Thombre, Master of Computer Science, University of Adelaide