|
|
--- |
|
|
license: mit |
|
|
language: |
|
|
- en |
|
|
auto_detected: true |
|
|
datasets: |
|
|
- Canstralian/pentesting_dataset |
|
|
- Canstralian/Wordlists |
|
|
- Canstralian/ShellCommands |
|
|
- Canstralian/CyberExploitDB |
|
|
- Chemically-motivated/CyberSecurityDataset |
|
|
- Chemically-motivated/AI-Agent-Generating-Tool-Debugging-Prompt-Library |
|
|
metrics: |
|
|
- accuracy |
|
|
- precision |
|
|
- f1 |
|
|
- code_eval |
|
|
base_model: |
|
|
- WhiteRabbitNeo/WhiteRabbitNeo-33B-v1.5 |
|
|
library_name: transformers |
|
|
tags: |
|
|
- code |
|
|
--- |
|
|
|
|
|
# CyberAttackDetection |
|
|
|
|
|
## Overview |
|
|
|
|
|
The **CyberAttackDetection** model is a fine-tuned BERT-based sequence classification model designed to identify cyberattacks in textual descriptions. It classifies input data into two categories: |
|
|
- **Attack (1)**: The text describes a cybersecurity threat or attack. |
|
|
- **Non-Attack (0)**: The text does not describe a cybersecurity threat. |
|
|
|
|
|
--- |
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **License**: [MIT License](LICENSE) |
|
|
- **Datasets**: |
|
|
- Custom cybersecurity datasets: |
|
|
- `Canstralian/pentesting_dataset` |
|
|
- `Canstralian/Wordlists` |
|
|
- `Canstralian/ShellCommands` |
|
|
- `Canstralian/CyberExploitDB` |
|
|
- `Chemically-motivated/CyberSecurityDataset` |
|
|
- `Chemically-motivated/AI-Agent-Generating-Tool-Debugging-Prompt-Library` |
|
|
- **Language**: English |
|
|
- **Metrics**: |
|
|
- **Accuracy**: 85% |
|
|
- **F1 Score**: 0.83 |
|
|
- **Precision**: 0.80 |
|
|
- **Recall**: 0.87 |
|
|
- **Base Model**: `WhiteRabbitNeo/WhiteRabbitNeo-33B-v1.5` |
|
|
- **Pipeline Tag**: `text-classification` |
|
|
- **Library Name**: `transformers` |
|
|
- **Tags**: `cybersecurity`, `text-classification`, `attack-detection`, `BERT` |
|
|
- **New Version**: `v1.0.0` |
|
|
- **Auto-Detected Features**: True |
|
|
|
|
|
--- |
|
|
|
|
|
## Model Usage |
|
|
|
|
|
### Installation |
|
|
Before using the model, ensure the necessary dependencies are installed: |
|
|
```bash |
|
|
pip install transformers torch |
|
|
``` |
|
|
|
|
|
### Example Code |
|
|
Use the following Python code to load the model and classify a sample text: |
|
|
|
|
|
```python |
|
|
from transformers import AutoModelForSequenceClassification, AutoTokenizer |
|
|
|
|
|
# Load the fine-tuned model and tokenizer |
|
|
model = AutoModelForSequenceClassification.from_pretrained("Canstralian/CyberAttackDetection") |
|
|
tokenizer = AutoTokenizer.from_pretrained("Canstralian/CyberAttackDetection") |
|
|
|
|
|
# Example input: Cyberattack description |
|
|
text = "A vulnerability was discovered in the server software." |
|
|
|
|
|
# Tokenize the input |
|
|
inputs = tokenizer(text, return_tensors="pt") |
|
|
|
|
|
# Get model predictions |
|
|
outputs = model(**inputs) |
|
|
|
|
|
# Predict the label (1 = attack, 0 = non-attack) |
|
|
prediction = outputs.logits.argmax(dim=-1) |
|
|
print(f"Prediction: {'Attack' if prediction.item() == 1 else 'Non-Attack'}") |
|
|
``` |
|
|
|
|
|
## Prompts: |
|
|
- Open Ports: "Analyze the following network scan report and identify open ports and their associated vulnerabilities. Suggest best practices to secure these ports: [Insert network scan report]." |
|
|
- Outdated Software or Services: "Given this list of installed software and services, identify outdated versions and known vulnerabilities. Provide recommendations for updates or patches to mitigate risks: [Insert software and service list]." |
|
|
- Default Credentials: "Scan the following system configurations for any use of default credentials. Provide a list of affected services and recommendations for securing these credentials: [Insert system configuration details]." |
|
|
- Misconfigurations: "Evaluate the provided system configuration for potential misconfigurations. Highlight risks and provide recommendations for secure setup: [Insert system configuration details]." |
|
|
- Injection Flaws: "Review the given web application code or request logs and identify potential injection vulnerabilities such as SQL injection, command injection, or XSS. Provide remediation steps: [Insert code or logs]." |
|
|
- Unencrypted Services: "Analyze the following network configuration and identify services that are transmitting data without encryption. Suggest strategies to enforce secure transmission: [Insert network configuration details]." |
|
|
- Known Software Vulnerabilities: "Review the provided software inventory and cross-reference it with known vulnerabilities in the National Vulnerability Database (NVD). Recommend patches or workarounds: [Insert software inventory]." |
|
|
- Cross-Site Request Forgery (CSRF): "Examine the provided web application code for potential CSRF vulnerabilities. Suggest specific coding or configuration techniques to prevent these attacks: [Insert code]." |
|
|
- Insecure Direct Object References (IDOR): "Analyze the provided API endpoints and their associated access controls. Identify any IDOR vulnerabilities and suggest secure implementation strategies: [Insert API endpoint details]." |
|
|
- Security Misconfigurations in Web Servers/Applications: "Assess the given web server configuration for security misconfigurations, such as improper HTTP headers or verbose error messages. Recommend changes to harden the server: [Insert server configuration]." |
|
|
- Broken Authentication and Session Management: "Review the provided authentication and session management implementation. Identify weaknesses and recommend strategies to prevent compromise: [Insert authentication/session management details]." |
|
|
- Sensitive Data Exposure: "Analyze the system's data handling processes and storage practices to identify potential sensitive data exposure. Recommend measures to protect sensitive information: [Insert system details]." |
|
|
- API Vulnerabilities: "Examine the following API documentation and implementation for vulnerabilities, including insecure endpoints and data leakage. Provide recommendations for securing the API: [Insert API documentation]." |
|
|
- Denial of Service (DoS) Vulnerabilities: "Review the system's architecture and configuration for potential vulnerabilities to DoS attacks. Suggest mitigation strategies such as rate limiting and load balancing: [Insert system architecture]." |
|
|
- Buffer Overflows: "Analyze the provided code or application for buffer overflow vulnerabilities. Highlight potential weak points and recommend secure coding practices to prevent exploitation: [Insert code]." |
|
|
|
|
|
|
|
|
## Model Training Details |
|
|
|
|
|
### Training Objective |
|
|
The model was fine-tuned to classify descriptive text as either an attack or non-attack event. It uses a **binary classification** approach. |
|
|
|
|
|
### Training Data |
|
|
- The training data includes cybersecurity-related attack descriptions and non-attack examples from curated datasets. |
|
|
|
|
|
--- |
|
|
|
|
|
## Evaluation |
|
|
|
|
|
The model was evaluated on a balanced test set using the following metrics: |
|
|
- **Accuracy**: 85% |
|
|
- **F1 Score**: 0.83 |
|
|
- **Precision**: 0.80 |
|
|
- **Recall**: 0.87 |
|
|
|
|
|
These results indicate strong performance in detecting cyberattacks from text. |
|
|
|
|
|
--- |
|
|
|
|
|
## License |
|
|
|
|
|
This project is licensed under the **MIT License**. Refer to the [LICENSE](LICENSE) file for details. |
|
|
|
|
|
--- |
|
|
|
|
|
## How to Contribute |
|
|
|
|
|
We welcome contributions! |
|
|
- **Submit Issues**: If you encounter problems, open an issue on the repository. |
|
|
- **Pull Requests**: Feel free to contribute code improvements or documentation updates. |
|
|
|
|
|
--- |
|
|
|
|
|
## Contact |
|
|
|
|
|
For further information or inquiries, contact: **canstralian@cybersecurity.com** |