---
license: mit
language:
- en
auto_detected: true
datasets:
- Canstralian/pentesting_dataset
- Canstralian/Wordlists
- Canstralian/ShellCommands
- Canstralian/CyberExploitDB
- Chemically-motivated/CyberSecurityDataset
- Chemically-motivated/AI-Agent-Generating-Tool-Debugging-Prompt-Library
metrics:
- accuracy
- precision
- f1
- code_eval
base_model:
- WhiteRabbitNeo/WhiteRabbitNeo-33B-v1.5
library_name: transformers
tags:
- code
---

# CyberAttackDetection

## Overview

The **CyberAttackDetection** model is a fine-tuned BERT-based sequence classification model designed to identify cyberattacks in textual descriptions. It classifies each input text into one of two categories:
- **Attack (1)**: The text describes a cybersecurity threat or attack.
- **Non-Attack (0)**: The text does not describe a cybersecurity threat.

---

## Model Details

- **License**: [MIT License](LICENSE)
- **Datasets**:
  - Custom cybersecurity datasets:
    - `Canstralian/pentesting_dataset`
    - `Canstralian/Wordlists`
    - `Canstralian/ShellCommands`
    - `Canstralian/CyberExploitDB`
    - `Chemically-motivated/CyberSecurityDataset`
    - `Chemically-motivated/AI-Agent-Generating-Tool-Debugging-Prompt-Library`
- **Language**: English
- **Metrics**:
  - **Accuracy**: 85%
  - **F1 Score**: 0.83
  - **Precision**: 0.80
  - **Recall**: 0.87
- **Base Model**: `WhiteRabbitNeo/WhiteRabbitNeo-33B-v1.5`
- **Pipeline Tag**: `text-classification`
- **Library Name**: `transformers`
- **Tags**: `cybersecurity`, `text-classification`, `attack-detection`, `BERT`
- **New Version**: `v1.0.0`
- **Auto-Detected Features**: True

---

## Model Usage

### Installation
Before using the model, install the required dependencies:
```bash
pip install transformers torch
```

### Example Code
Use the following Python code to load the model and classify a sample text:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("Canstralian/CyberAttackDetection")
tokenizer = AutoTokenizer.from_pretrained("Canstralian/CyberAttackDetection")
model.eval()  # switch to inference mode (disables dropout)

# Example input: text to classify
text = "A vulnerability was discovered in the server software."

# Tokenize the input, truncating text longer than the model's maximum length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Get model predictions without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Predict the label (1 = attack, 0 = non-attack)
prediction = outputs.logits.argmax(dim=-1)
print(f"Prediction: {'Attack' if prediction.item() == 1 else 'Non-Attack'}")
```
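
### Example: Confidence Scores
The example code reports only the top label. When a confidence score or a stricter decision rule is useful, the two logits can be converted to probabilities with a softmax. The following is a minimal, dependency-free sketch: `softmax`, `classify`, and the `threshold` parameter are hypothetical names that are not part of the model or of `transformers`, and the logit values are made up for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, threshold=0.5):
    """Return (label, attack_probability) for one [non_attack, attack] logit pair.

    `threshold` is a hypothetical tuning knob; with the default of 0.5 the
    result matches taking the argmax over the two logits.
    """
    attack_prob = softmax(logits)[1]
    label = "Attack" if attack_prob > threshold else "Non-Attack"
    return label, attack_prob

# Made-up logits for illustration only
label, prob = classify([-0.4, 1.2])
print(f"{label} (p={prob:.2f})")
```

In practice, `outputs.logits[0].tolist()` from the example code would be passed to `classify`. Raising `threshold` above 0.5 typically trades recall for precision.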

## Prompts
- **Open Ports**: "Analyze the following network scan report and identify open ports and their associated vulnerabilities. Suggest best practices to secure these ports: [Insert network scan report]."
- **Outdated Software or Services**: "Given this list of installed software and services, identify outdated versions and known vulnerabilities. Provide recommendations for updates or patches to mitigate risks: [Insert software and service list]."
- **Default Credentials**: "Scan the following system configurations for any use of default credentials. Provide a list of affected services and recommendations for securing these credentials: [Insert system configuration details]."
- **Misconfigurations**: "Evaluate the provided system configuration for potential misconfigurations. Highlight risks and provide recommendations for secure setup: [Insert system configuration details]."
- **Injection Flaws**: "Review the given web application code or request logs and identify potential injection vulnerabilities such as SQL injection, command injection, or XSS. Provide remediation steps: [Insert code or logs]."
- **Unencrypted Services**: "Analyze the following network configuration and identify services that are transmitting data without encryption. Suggest strategies to enforce secure transmission: [Insert network configuration details]."
- **Known Software Vulnerabilities**: "Review the provided software inventory and cross-reference it with known vulnerabilities in the National Vulnerability Database (NVD). Recommend patches or workarounds: [Insert software inventory]."
- **Cross-Site Request Forgery (CSRF)**: "Examine the provided web application code for potential CSRF vulnerabilities. Suggest specific coding or configuration techniques to prevent these attacks: [Insert code]."
- **Insecure Direct Object References (IDOR)**: "Analyze the provided API endpoints and their associated access controls. Identify any IDOR vulnerabilities and suggest secure implementation strategies: [Insert API endpoint details]."
- **Security Misconfigurations in Web Servers/Applications**: "Assess the given web server configuration for security misconfigurations, such as improper HTTP headers or verbose error messages. Recommend changes to harden the server: [Insert server configuration]."
- **Broken Authentication and Session Management**: "Review the provided authentication and session management implementation. Identify weaknesses and recommend strategies to prevent compromise: [Insert authentication/session management details]."
- **Sensitive Data Exposure**: "Analyze the system's data handling processes and storage practices to identify potential sensitive data exposure. Recommend measures to protect sensitive information: [Insert system details]."
- **API Vulnerabilities**: "Examine the following API documentation and implementation for vulnerabilities, including insecure endpoints and data leakage. Provide recommendations for securing the API: [Insert API documentation]."
- **Denial of Service (DoS) Vulnerabilities**: "Review the system's architecture and configuration for potential vulnerabilities to DoS attacks. Suggest mitigation strategies such as rate limiting and load balancing: [Insert system architecture]."
- **Buffer Overflows**: "Analyze the provided code or application for buffer overflow vulnerabilities. Highlight potential weak points and recommend secure coding practices to prevent exploitation: [Insert code]."
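
### Example: Filling a Prompt Template
Each prompt above ends with an `[Insert ...]` placeholder. One way to substitute real content is sketched below. The `PROMPTS` dictionary, the `fill_prompt` helper, and the sample scan line are hypothetical and serve only as an illustration; only one prompt is copied in to keep the sketch short.

```python
import re

# One prompt copied from the list above; the others follow the same pattern
PROMPTS = {
    "Open Ports": (
        "Analyze the following network scan report and identify open ports "
        "and their associated vulnerabilities. Suggest best practices to "
        "secure these ports: [Insert network scan report]."
    ),
}

# Matches a bracketed placeholder such as "[Insert network scan report]"
PLACEHOLDER = re.compile(r"\[Insert [^\]]+\]")

def fill_prompt(template, content):
    """Replace the first [Insert ...] placeholder in `template` with `content`."""
    if not PLACEHOLDER.search(template):
        raise ValueError("template has no [Insert ...] placeholder")
    # A function replacement keeps backslashes in `content` literal
    return PLACEHOLDER.sub(lambda _match: content, template, count=1)

# Hypothetical scan output used only to show the substitution
print(fill_prompt(PROMPTS["Open Ports"], "22/tcp open ssh"))
```

Raising an error on a missing placeholder avoids silently sending a prompt that contains no user data.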

## Model Training Details

### Training Objective
The model was fine-tuned as a **binary classifier** that labels descriptive text as either an attack or a non-attack event.

### Training Data
- The training data includes cybersecurity-related attack descriptions and non-attack examples from curated datasets.

---

## Evaluation

The model was evaluated on a balanced test set using the following metrics:
- **Accuracy**: 85%
- **F1 Score**: 0.83
- **Precision**: 0.80
- **Recall**: 0.87

Recall (0.87) is higher than precision (0.80), so the model misses relatively few attack descriptions but flags some non-attack text as attacks.
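
### Example: Checking the F1 Score
The F1 score is the harmonic mean of precision and recall, so the reported value can be checked from the other two numbers. The sketch below uses hypothetical helper names (`f1_score`, `metrics_from_counts`), and the confusion-matrix counts are made up; they are not the model's actual evaluation counts.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def metrics_from_counts(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }

# Reported precision and recall from this section
print(round(f1_score(0.80, 0.87), 2))  # 0.83

# Made-up counts, only to show how the four metrics relate
print(metrics_from_counts(tp=8, fp=2, fn=2, tn=8))
```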

---

## License

This project is licensed under the **MIT License**. Refer to the [LICENSE](LICENSE) file for details.

---

## How to Contribute

We welcome contributions!
- **Submit Issues**: If you encounter problems, open an issue on the repository.
- **Pull Requests**: Feel free to contribute code improvements or documentation updates.

---

## Contact

For further information or inquiries, contact: **canstralian@cybersecurity.com**