Younis2003 committed on
Commit f6b5b49 · verified · 1 Parent(s): 4745881

Update README.md

  1. README.md +132 -3
README.md CHANGED
@@ -1,3 +1,132 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - Younis2003/secure_dataset_cvefixes
5
+ language:
6
+ - en
7
+ library_name: transformers
8
+ base_model: meta-llama/CodeLlama-13b-hf
9
+ tags:
10
+ - cybersecurity
11
+ - code-security
12
+ - vulnerability-detection
13
+ - secure-code
14
+ - codellama
15
+ - transformers
16
+ ---
17
+
18
+ # CodeLlama_for_code_security
19
+
20
+ ## Overview
21
+
22
+ CodeLlama_for_code_security is a fine-tuned large language model designed for **vulnerability detection and secure code remediation**.
23
+
24
+ The model analyzes vulnerable source code and generates structured outputs describing detected vulnerabilities and proposing secure fixes.
25
+
26
+ This model is built on top of **CodeLlama-13B** and fine-tuned using vulnerability datasets to specialize in secure code analysis tasks.
27
+
28
+ ---
29
+
30
+ ## Intended Use
31
+
32
+ This model is intended for:
33
+
34
+ - Secure code analysis
35
+ - Vulnerability identification
36
+ - Automatic code remediation suggestions
37
+ - Security-focused code review assistance
38
+ - Educational purposes in secure software development
39
+
40
+ ### Example Use Cases
41
+
42
+ - Detecting vulnerabilities in open-source projects
43
+ - Assisting developers in secure coding practices
44
+ - Research in AI-driven cybersecurity tools
45
+
46
+ ---
47
+
48
+ ## Training Data
49
+
50
+ The model was fine-tuned using curated vulnerability datasets including:
51
+
52
+ - CVE vulnerability descriptions
53
+ - CWE vulnerability classifications
54
+ - Code vulnerability datasets
55
+ - Security patch examples
56
+
57
+ Dataset used for fine-tuning:
58
+
59
+ **secure_dataset_cvefixes**
60
+
61
+ The dataset focuses on real-world software vulnerabilities and their corresponding secure fixes.
62
+
63
+ ---
64
+
65
+ ## Model Details
66
+
67
+ - Base Model: CodeLlama-13B
68
+ - Architecture: Transformer-based causal language model
69
+ - Fine-tuning Method: Supervised Fine-Tuning (SFT)
70
+
71
+ The model processes vulnerable code snippets and produces structured outputs that include:
72
+
73
+ - vulnerability identification
74
+ - vulnerability classification
75
+ - explanation of the vulnerability
76
+ - secure code remediation
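
The four output fields listed above could be held in a small record type like the one sketched below. This is only an illustration: the README does not specify the model's actual output format, so the class name `SecurityFinding`, its field names, and the sample values are all hypothetical and are not real model output.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SecurityFinding:
    """Hypothetical container mirroring the four output fields listed above."""

    vulnerability: str   # vulnerability identification
    classification: str  # vulnerability classification, e.g. a CWE identifier
    explanation: str     # explanation of the vulnerability
    remediation: str     # secure code remediation


# Hand-written sample values for illustration only (not produced by the model).
finding = SecurityFinding(
    vulnerability="SQL injection",
    classification="CWE-89",
    explanation="User input is concatenated directly into the SQL query.",
    remediation='cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))',
)

# Serialize the record so it can be logged or compared against a reference fix.
print(json.dumps(asdict(finding), indent=2))
```

Keeping the fields in a typed record makes it easier to validate that a generated response contains all four parts before it is shown to a reviewer.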
77
+
78
+ ---
79
+
80
+ ## Evaluation Results
81
+
82
+ The model was evaluated using **semantic similarity between generated outputs and ground truth secure fixes**.
83
+
84
+ Evaluation metric used:
85
+
86
+ **Embedding Similarity**
87
+
88
+ | Metric | Score |
89
+ |------|------|
90
+ | Embedding Similarity | **0.9643** |
91
+
92
+ This corresponds to approximately **96% semantic similarity** between generated remediation outputs and the expected secure fixes.
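
The README does not say which embedding model or similarity function produced this score. As a minimal sketch, assuming the metric is the mean cosine similarity between embeddings of each generated fix and its reference fix, the computation could look like the following. The function names are hypothetical, and the embedding vectors are assumed to be supplied by whatever embedding model is used.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def mean_embedding_similarity(
    generated: Sequence[Sequence[float]],
    reference: Sequence[Sequence[float]],
) -> float:
    """Average the pairwise cosine similarity over aligned embedding pairs."""
    if len(generated) != len(reference) or not generated:
        raise ValueError("need two non-empty, equally sized lists of embeddings")
    scores = [cosine_similarity(g, r) for g, r in zip(generated, reference)]
    return sum(scores) / len(scores)
```

Under this assumption, a score near 1.0 means the generated and reference fixes are close in embedding space; it does not by itself show that a generated fix compiles or removes the vulnerability.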
93
+
94
+ ---
95
+
96
+ ## Example Usage
97
+
98
+ ```python
99
+ from transformers import AutoModelForCausalLM
100
+ model_name = "Younis2003/CodeLlama_for_code_security"
101
+
102
+ model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")  # device_map="auto" requires the accelerate package
103
+ ```
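
The snippet above only loads the weights. A minimal sketch of running an analysis might look like the following. It assumes the repository also ships a tokenizer, and the prompt template in `build_prompt` is hypothetical, since the README does not document the prompt format used during fine-tuning. Both helper names are illustrative.

```python
def build_prompt(code: str) -> str:
    """Wrap a code snippet in a plain instruction prompt (hypothetical template)."""
    return (
        "Analyze the following code for security vulnerabilities "
        "and propose a secure fix.\n\n"
        f"### Code:\n{code.strip()}\n\n### Analysis:\n"
    )


def analyze(
    code: str,
    model_name: str = "Younis2003/CodeLlama_for_code_security",
    max_new_tokens: int = 256,
) -> str:
    """Generate an analysis for one snippet; downloads the model on first use."""
    # Imported lazily so build_prompt can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the model repository includes tokenizer files.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

    inputs = tokenizer(build_prompt(code), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `analyze("query = 'SELECT * FROM users WHERE id = ' + user_id")` would then return the model's generated text for that snippet; a 13B model needs substantial GPU memory, so the call is left out of the sketch itself.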
104
+
105
+ ## Limitations
106
+
107
+ The model may not detect all vulnerabilities.
108
+
109
+ Results should always be reviewed by a security expert.
110
+
111
+ The model may generate incorrect fixes in complex systems.
112
+
113
+ This model is intended as a security assistant, not a replacement for professional security auditing.
114
+
115
+ ## Ethical Considerations
116
+
117
+ The model is designed for defensive cybersecurity applications.
118
+ It should not be used for malicious activities.
119
+
120
+ ## License
121
+
122
+ This model is released under the Apache 2.0 license. Use of the model is also subject to the licensing terms of the base model, CodeLlama.
123
+
124
+ ## Author
125
+
126
+ Developed by Younis Alshibli as part of an AI research project focusing on:
127
+
128
+ - AI-driven vulnerability detection
129
+
130
+ - Automated secure code remediation
131
+
132
+ - Intelligent security analysis systems