---
language: en
license: mit
library_name: transformers
pipeline_tag: token-classification
tags:
- deberta
- ner
- token-classification
- cybersecurity
- logs
base_model: microsoft/deberta-v3-base
---

# DeBERTa-v3 Log Entity Recognition Model

DeBERTa-v3-base fine-tuned with LoRA for named entity recognition (NER) on system and cloud logs.

## Model Details
- **Base Model**: microsoft/deberta-v3-base
- **Training Data**: 7,003 logs (synthetic and real)
- **Validation F1**: see `evaluation_results.txt`

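## Entities
The model tags four entity types (`SERVICE`, `ERROR`, `HOST`, `PROCESS`) using the BIO scheme, plus `O` for tokens outside any entity:

`['O', 'B-SERVICE', 'I-SERVICE', 'B-ERROR', 'I-ERROR', 'B-HOST', 'I-HOST', 'B-PROCESS', 'I-PROCESS']`
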
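To turn per-token BIO tags into whole entities, consecutive `B-`/`I-` tags of the same type can be merged into spans. The sketch below is a minimal illustration, not part of the released code: `group_bio` is a hypothetical helper, the tags are hand-written rather than model output, and it assumes word-level tokens (subword pieces would need to be merged first).

```python
def group_bio(tokens, tags):
    """Merge per-token BIO tags into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []

    def flush():
        # Close the open entity, if any, and reset the state
        nonlocal current_type, current_tokens
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity
            flush()
    flush()
    return spans


tokens = ["nginx", "timeout", "on", "server1"]
tags = ["B-SERVICE", "B-ERROR", "O", "B-HOST"]  # hand-written tags, not model output
print(group_bio(tokens, tags))
```

An `I-` tag with no matching open entity is dropped here; treating it as the start of a new entity is an equally reasonable choice.
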
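## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification
from peft import PeftModel

model_id = "YOUR_USERNAME/log-ner-deberta-lora"
# Label order assumed to match the list under "Entities"
labels = ["O", "B-SERVICE", "I-SERVICE", "B-ERROR", "I-ERROR",
          "B-HOST", "I-HOST", "B-PROCESS", "I-PROCESS"]

tokenizer = AutoTokenizer.from_pretrained(model_id)
# The base model needs a 9-label head before the adapter is attached
base_model = AutoModelForTokenClassification.from_pretrained(
    "microsoft/deberta-v3-base", num_labels=len(labels)
)
model = PeftModel.from_pretrained(base_model, model_id)
model.eval()

# Extract entities
text = "nginx timeout on server1"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Print the highest-scoring label for each token
predictions = outputs.logits.argmax(dim=-1)[0].tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred in zip(tokens, predictions):
    print(token, labels[pred])
```
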
## Training Configuration
- LoRA rank: 32
- Training epochs: 15
- Learning rate: 0.0003

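For a sense of what rank 32 means in practice, LoRA replaces the update to a weight matrix of shape `d_out x d_in` with two low-rank factors, adding `rank * (d_in + d_out)` trainable parameters per adapted matrix. The sketch below works this out for a single square projection at the hidden size of `microsoft/deberta-v3-base` (768). Which matrices were actually adapted is not stated here, so this is a per-matrix figure only, and `lora_extra_params` is a hypothetical helper.

```python
def lora_extra_params(d_in, d_out, rank):
    """Parameters LoRA adds to one d_out x d_in weight: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


hidden = 768  # hidden size of microsoft/deberta-v3-base
rank = 32     # LoRA rank from the list above

added = lora_extra_params(hidden, hidden, rank)
full = hidden * hidden
print(added, full, added / full)
```

At this rank, each adapted 768 x 768 matrix gains 49,152 trainable parameters, about one twelfth of the 589,824 in the frozen matrix itself.
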