---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- cybersecurity
- soc-automation
- siem
- suricata
- security-operations
- threat-detection
- alert-triage
- lora
- qwen2.5
pipeline_tag: text-generation
metrics:
- accuracy
model-index:
- name: socpilot-0.5b
  results:
  - task:
      type: text-generation
      name: Security Alert Triage
    metrics:
    - type: accuracy
      value: 99.99
      name: Priority Classification
base_model:
- Qwen/Qwen2.5-0.5B-Instruct
---

# SOCPilot-0.5B: Your AI Copilot for Security Operations

**Open-source SIEM-specialized AI model** | **Production ready**

[![Model](https://img.shields.io/badge/🤗-Model-yellow)](https://huggingface.co/radherackbank/socpilot-0.5b)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
[![Accuracy](https://img.shields.io/badge/accuracy-99.99%25-green)]()

## Overview

SOCPilot is an **open-source AI model specialized for SIEM alert triage**. Built on Qwen2.5-0.5B-Instruct and fine-tuned on 100,000 real security alerts, it automates Security Operations Center (SOC) alert triage, with a reported priority-classification accuracy of 99.99% on an internal validation set.
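To make the input concrete, here is a minimal sketch of reading one Suricata eve.json alert event and condensing it into a single line of text. The sample event is hand-written for illustration, `summarize_alert` is a hypothetical helper, and the output layout is an assumption: this card does not document the prompt format used during fine-tuning, so treat this only as an illustration of the fields an alert carries.

```python
import json
from typing import Optional

# Hand-written sample in the shape of a Suricata eve.json alert event.
# eve.json is line-delimited JSON, so each event is a single line.
SAMPLE_EVENT = json.dumps({
    "timestamp": "2024-01-01T00:00:00.000000+0000",
    "event_type": "alert",
    "src_ip": "192.0.2.10",
    "src_port": 51234,
    "dest_ip": "198.51.100.5",
    "dest_port": 80,
    "proto": "TCP",
    "alert": {
        "signature_id": 1000001,
        "signature": "EXAMPLE Suspicious HTTP request",
        "category": "Attempted Information Leak",
        "severity": 2,
    },
})


def summarize_alert(line: str) -> Optional[str]:
    """Return a one-line summary of an alert event, or None for other event types."""
    event = json.loads(line)
    if event.get("event_type") != "alert":
        return None  # eve.json also carries flow, dns, http, ... events
    alert = event.get("alert", {})
    return (
        f"[severity {alert.get('severity')}] {alert.get('signature')} "
        f"({alert.get('category')}): "
        f"{event.get('src_ip')}:{event.get('src_port')} -> "
        f"{event.get('dest_ip')}:{event.get('dest_port')} {event.get('proto')}"
    )


print(summarize_alert(SAMPLE_EVENT))
```

In Suricata, a lower `severity` number indicates a higher-priority alert, which is worth keeping in mind when comparing model output against the raw event.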
## Use Cases

### Current (v1.0)

- Automated Suricata IDS/IPS alert triage
- Reduce alert fatigue by 70-80%
- Speed up incident response
- Identify critical threats vs noise
- SOC analyst training and validation

## Basic Usage

The model can be loaded using Hugging Face Transformers:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("radherackbank/socpilot-0.5b")
tokenizer = AutoTokenizer.from_pretrained("radherackbank/socpilot-0.5b")
```

## Technical Details

### Model Architecture

- **Base Model:** Qwen2.5-0.5B-Instruct (494M parameters)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
  - Rank: 16
  - Alpha: 32
  - Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  - Trainable parameters: 8.8M (1.78% of total)
- **Quantization:** 4-bit NF4
- **Precision:** BFloat16 mixed precision

### Training Details

- **Dataset:** 100,000 real Suricata eve.json alerts
- **Split:** 90,000 training, 10,000 validation
- **Framework:** Hugging Face Transformers + PEFT
- **Optimizer:** PagedAdamW (8-bit)

## Limitations

- Trained specifically on the Suricata eve.json format
- May not generalize to other SIEM formats
- Small model size (0.5B parameters); larger versions are planned
- Requires a GPU for optimal performance
- May hallucinate on very rare or novel attack patterns
- Accuracy is reported on an internal validation dataset and should not be interpreted as a guarantee of performance in all environments.

## License

Apache License 2.0. Free for commercial and research use.

This license includes an explicit grant of patent rights from contributors.

## Acknowledgments

- Built on **Qwen2.5-0.5B-Instruct** (Apache 2.0)
- Fine-tuned on 100K real security alerts

## Contributing

Interested in contributing? We welcome:

- Additional SIEM format support
- Evaluation datasets and benchmarks
- Bug reports and fixes
- Documentation improvements
- Integration examples

---

## Intended Use

This model is intended to assist SOC analysts with alert triage and prioritization.
It is designed as a decision-support tool.

## Not Intended Use

This model is not intended to replace human analysts or perform autonomous incident response.

**Making SOC operations intelligent, one alert at a time**
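As a rough sanity check on the LoRA figures listed under Technical Details, the trainable-parameter count can be reproduced from the adapter rank and the shapes of the targeted projections. The sketch below assumes the published Qwen2.5-0.5B dimensions (hidden size 896, MLP size 4864, 24 layers, 14 attention heads with 2 key/value heads), which are not stated in this card. It also assumes adapters without bias terms, and the helper names are illustrative.

```python
# Assumed Qwen2.5-0.5B dimensions (not stated in this model card).
HIDDEN = 896
INTERMEDIATE = 4864
LAYERS = 24
HEAD_DIM = HIDDEN // 14   # 14 attention heads -> head_dim 64
KV_DIM = 2 * HEAD_DIM     # 2 key/value heads -> 128

RANK = 16                 # LoRA rank from Technical Details
BASE_PARAMS = 494e6       # base model size from Technical Details

# (in_features, out_features) of each targeted projection in one layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """One adapter adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


def total_trainable(shapes: dict, rank: int, layers: int) -> int:
    """Sum adapter parameters over all targeted modules and all layers."""
    per_layer = sum(lora_params(i, o, rank) for i, o in shapes.values())
    return per_layer * layers


total = total_trainable(TARGET_SHAPES, RANK, LAYERS)
print(f"{total:,} trainable parameters")
print(f"{100 * total / BASE_PARAMS:.2f}% of the 494M base parameters")
```

Under these assumptions the count comes to roughly 8.8M, or about 1.78% of the 494M base parameters, consistent with the figures reported above.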