---
language: en
license: mit
tags:
- cybersecurity
- binary-classification
- pytorch
datasets:
- custom
metrics:
- accuracy
- auc
- precision
- recall
---

# natSecLabse

## Model Description

Binary classification model for cybersecurity threat detection. The model uses a deep neural network to classify text embeddings as cyber-related or non-cyber content.

## Model Architecture

- **Input**: 768-dimensional embeddings (e.g., from Gemma)
- **Hidden Layers**: 512 → 256 → 128 neurons
- **Output**: 1 (binary classification with sigmoid activation)
- **Normalization**: LayerNorm + BatchNorm
- **Activation**: ReLU
- **Total Parameters**: ~557,184

## Performance Metrics

- **Accuracy**: 0.8835
- **Precision**: 0.5713
- **Recall**: 0.8645
- **AUC**: 0.9482
- **F1 Score**: 0.6880

## Usage

```python
import torch
from huggingface_hub import hf_hub_download

# Download model
model_path = hf_hub_download(
    repo_id="kristiangnordby/natSecLabse",
    filename="model.pt"
)

# Load model
checkpoint = torch.load(model_path, map_location='cpu')

# For inference, you'll need the model class definition
# See model_architecture.py in this repo
```

## Training Data

- Training set: ~166K samples
- Validation set: ~25K samples
- Test set: ~41K samples
- Class distribution: ~18% cyber-related, ~82% non-cyber

## Intended Use

This model is designed for:

- Cybersecurity content detection
- Filtering cyber-related articles/documents
- Security threat classification

## Limitations

- Requires pre-computed embeddings as input
- Trained on a specific corpus; may need fine-tuning for other domains
- Performance depends on the quality of the input embeddings

## Training Details

- **Optimizer**: Adam (lr=0.001, β₁=0.9, β₂=0.999)
- **Loss Function**: Binary Cross-Entropy
- **Batch Size**: 512
- **Early Stopping**: Patience of 15 epochs
- **Learning Rate Scheduling**: ReduceLROnPlateau (factor=0.5, patience=5)

## Citation

If you use this model, please cite:

```bibtex
@misc{cybersecurity_classifier,
  author = {Kristian Nordby},
  title = {Cybersecurity Binary Classifier},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/kristiangnordby/natSecLabse}}
}
```
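
## Architecture Sketch

The layer sizes listed under Model Architecture can be sketched in PyTorch as follows. This is only an illustration, not the class shipped in `model_architecture.py`: the card does not say where LayerNorm and BatchNorm sit, so the placement here (LayerNorm on the input, BatchNorm after each hidden linear layer) is an assumption. The class name `CyberClassifierSketch` and the 0.5 decision threshold are also hypothetical. Because the layer layout is assumed, the checkpoint in `model.pt` will probably not load into this class; use the repo's own definition for that.

```python
import torch
import torch.nn as nn


class CyberClassifierSketch(nn.Module):
    """Hypothetical stand-in for the class defined in model_architecture.py."""

    def __init__(self, input_dim=768, hidden_dims=(512, 256, 128)):
        super().__init__()
        # Assumption: LayerNorm on the raw embedding, BatchNorm after each hidden Linear.
        layers = [nn.LayerNorm(input_dim)]
        prev = input_dim
        for width in hidden_dims:
            layers += [nn.Linear(prev, width), nn.BatchNorm1d(width), nn.ReLU()]
            prev = width
        layers.append(nn.Linear(prev, 1))  # single logit for the binary decision
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # Sigmoid turns the logit into a probability; output shape is (batch,).
        return torch.sigmoid(self.net(x)).squeeze(-1)


model = CyberClassifierSketch()
model.eval()  # BatchNorm uses its running statistics in eval mode

# Count only the Linear weight matrices (no biases, no normalization parameters).
linear_weights = sum(
    m.weight.numel() for m in model.modules() if isinstance(m, nn.Linear)
)

# Random tensors stand in for real 768-dimensional embeddings.
embeddings = torch.randn(4, 768)
with torch.no_grad():
    probs = model(embeddings)
labels = probs >= 0.5  # assumed threshold; tune it for your precision/recall needs
```

Counting only the weight matrices of the four linear layers (768×512 + 512×256 + 256×128 + 128×1) gives 557,184, which matches the parameter figure quoted above; biases and normalization parameters would add to that total.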