---
license: mit
language:
- en
auto_detected: true
datasets:
- Canstralian/pentesting_dataset
- Canstralian/Wordlists
- Canstralian/ShellCommands
- Canstralian/CyberExploitDB
- Chemically-motivated/CyberSecurityDataset
- Chemically-motivated/AI-Agent-Generating-Tool-Debugging-Prompt-Library
metrics:
- accuracy
- precision
- f1
- code_eval
base_model:
- WhiteRabbitNeo/WhiteRabbitNeo-33B-v1.5
---

# CyberAttackDetection

## Overview

The **CyberAttackDetection** model is a fine-tuned BERT-based sequence classification model designed to identify cyberattacks in textual descriptions. It classifies input data into two categories:

- **Attack (1)**: The text describes a cybersecurity threat or attack.
- **Non-Attack (0)**: The text does not describe a cybersecurity threat.

---

## Model Details

- **License**: [MIT License](LICENSE)
- **Datasets** (custom cybersecurity corpora):
  - `Canstralian/pentesting_dataset`
  - `Canstralian/Wordlists`
  - `Canstralian/ShellCommands`
  - `Canstralian/CyberExploitDB`
  - `Chemically-motivated/CyberSecurityDataset`
  - `Chemically-motivated/AI-Agent-Generating-Tool-Debugging-Prompt-Library`
- **Language**: English
- **Metrics**:
  - **Accuracy**: 85%
  - **F1 Score**: 0.83
  - **Precision**: 0.80
  - **Recall**: 0.87
- **Base Model**: `WhiteRabbitNeo/WhiteRabbitNeo-33B-v1.5`
- **Pipeline Tag**: `text-classification`
- **Library Name**: `transformers`
- **Tags**: `cybersecurity`, `text-classification`, `attack-detection`, `BERT`
- **Version**: `v1.0.0`
- **Auto-Detected Features**: True

---

## Model Usage

### Installation

Before using the model, ensure the necessary dependencies are installed:

```bash
pip install transformers torch
```

### Example Code

Use the following Python code to load the model and classify a sample text:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned model and tokenizer
model_name = "Canstralian/CyberAttackDetection"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Tokenize the input text and run inference
text = "Multiple failed SSH logins followed by a successful root login."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# Predict the label (1 = attack, 0 = non-attack)
prediction = outputs.logits.argmax(dim=-1)
print(f"Prediction: {'Attack' if prediction.item() == 1 else 'Non-Attack'}")
```
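
The example prints only the hard label. If a confidence score is also wanted, the two logits can be converted to class probabilities with a softmax; the sketch below shows the computation in plain Python (in practice `torch.softmax(outputs.logits, dim=-1)` does the same, and the logit values here are illustrative):

```python
import math

# Hypothetical logits from a binary classification head:
# index 0 = non-attack score, index 1 = attack score.
logits = [-1.2, 2.3]

# Numerically stable softmax over the two classes.
m = max(logits)
exps = [math.exp(x - m) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

prediction = probs.index(max(probs))
label = "Attack" if prediction == 1 else "Non-Attack"
print(f"{label} (confidence {probs[prediction]:.2f})")
```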

---

## Model Training Details

### Training Objective

The model was fine-tuned to classify descriptive text as either an attack or a non-attack event, using a **binary classification** objective.

### Training Data

The training data includes cybersecurity-related attack descriptions and non-attack examples drawn from the curated datasets listed above.
|
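
Concretely, a training example in this binary setup pairs a text with an integer label. A minimal sketch of such records follows (the field names `text` and `label` are illustrative assumptions, not the datasets' actual schema):

```python
# Illustrative training records for the binary attack/non-attack task.
# The "text"/"label" field names are assumptions for illustration only.
train_records = [
    {"text": "SQL injection attempt detected in login form parameters.", "label": 1},
    {"text": "Routine system backup completed successfully.", "label": 0},
]

# A preprocessing step can sanity-check the label space before training.
labels = {record["label"] for record in train_records}
assert labels <= {0, 1}, "binary task expects labels 0 (non-attack) or 1 (attack)"
print(len(train_records))  # 2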

---

## Evaluation

The model was evaluated on a balanced test set using the following metrics:

- **Accuracy**: 85%
- **F1 Score**: 0.83
- **Precision**: 0.80
- **Recall**: 0.87

These results indicate strong performance in detecting cyberattacks from text.
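
As a quick sanity check, the reported F1 score is consistent with the harmonic mean of the reported precision and recall:

```python
# Reported precision and recall from the evaluation above.
precision, recall = 0.80, 0.87

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.83
```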

---

## License

This project is licensed under the **MIT License**. Refer to the [LICENSE](LICENSE) file for details.

---

## How to Contribute

We welcome contributions!

- **Submit Issues**: If you encounter problems, open an issue on the repository.
- **Pull Requests**: Feel free to contribute code improvements or documentation updates.

---

## Contact

For further information or inquiries, contact: **canstralian@cybersecurity.com**