| | --- |
| | license: apache-2.0 |
| | language: |
| | - en |
| | library_name: transformers |
| | tags: |
| | - cybersecurity |
| | - soc-automation |
| | - siem |
| | - suricata |
| | - security-operations |
| | - threat-detection |
| | - alert-triage |
| | - lora |
| | - qwen2.5 |
| | pipeline_tag: text-generation |
| | metrics: |
| | - accuracy |
| | model-index: |
| | - name: socpilot-0.5b |
| | results: |
| | - task: |
| | type: text-generation |
| | name: Security Alert Triage |
| | metrics: |
| | - type: accuracy |
| | value: 99.99 |
| | name: Priority Classification |
| | base_model: |
| | - Qwen/Qwen2.5-0.5B-Instruct |
| | --- |
| | # SOCPilot-0.5B: Your AI Copilot for Security Operations |
| |
|
| | **Open-source SIEM-specialized AI model** | **Production ready** |
| |
|
| | [](https://huggingface.co/radherackbank/socpilot-0.5b) |
| | [](LICENSE) |
| | []() |
| |
|
| | ## Overview |
| |
|
| | SOCPilot is the **open-source AI model specialized for SIEM alert triage**. Built on Qwen2.5-0.5B and fine-tuned on 100,000 real security alerts, it automates Security Operations Center workflows with production-grade accuracy. |
| |
|
| | ## Use Cases |
| |
|
| | ### Current (v1.0) |
| | - Automated Suricata IDS/IPS alert triage |
| | - Reduce alert fatigue by 70-80% |
| | - Speed up incident response |
| | - Identify critical threats vs noise |
| | - SOC analyst training and validation |
| |
|
| | ## Basic Usage |
| |
|
| | The model can be loaded using Hugging Face Transformers: |
| | ``` |
| | AutoModelForCausalLM.from_pretrained("radherackbank/socpilot-0.5b") |
| | AutoTokenizer.from_pretrained("radherackbank/socpilot-0.5b") |
| | ``` |
| |
|
| | ## Technical Details |
| |
|
| | ### Model Architecture |
| | - **Base Model:** Qwen2.5-0.5B-Instruct (494M parameters) |
| | - **Fine-tuning Method:** LoRA (Low-Rank Adaptation) |
| | - Rank: 16 |
| | - Alpha: 32 |
| | - Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| | - Trainable parameters: 8.8M (1.78% of total) |
| | - **Quantization:** 4-bit NF4 |
| | - **Precision:** BFloat16 mixed precision |
| | |
| | ### Training Details |
| | - **Dataset:** 100,000 real Suricata eve.json alerts |
| | - **Split:** 90,000 training, 10,000 validation |
| | - **Framework:** HuggingFace Transformers + PEFT |
| | - **Optimizer:** PagedAdamW (8-bit) |
| | |
| | ## Limitations |
| | |
| | - Trained specifically on Suricata eve.json format |
| | - May not generalize to other SIEM formats |
| | - Small model size (0.5B parameters) - larger versions planned |
| | - Requires GPU for optimal performance |
| | - May hallucinate on very rare or novel attack patterns |
| | - Accuracy is reported on an internal validation dataset and should not be interpreted as a guarantee of performance in all environments. |
| | |
| | ## License |
| | |
| | Apache License 2.0 - Free for commercial and research use |
| | |
| | This license includes an explicit grant of patent rights from contributors. |
| | |
| | ## Acknowledgments |
| | |
| | - Built on **Qwen2.5-0.5B-Instruct** (Apache 2.0) |
| | - Fine-tuned on 100K real security alerts |
| | |
| | ## Contributing |
| | |
| | Interested in contributing? We welcome: |
| | - Additional SIEM format support |
| | - Evaluation datasets and benchmarks |
| | - Bug reports and fixes |
| | - Documentation improvements |
| | - Integration examples |
| | |
| | --- |
| | ## Intended Use |
| | |
| | This model is intended to assist SOC analysts with alert triage and prioritization. |
| | It is designed as a decision-support tool. |
| | |
| | ## Not Intended Use |
| | This model is not intended to replace human analysts or perform autonomous incident response. |
| | |
| | **Making SOC operations intelligent, one alert at a time** |
| | |