ModernLogBERT (WCE)
A ModernBERT encoder
fine-tuned to classify the severity level of a single log line into one of
six levels: TRACE, DEBUG, INFO, WARN, ERROR, FATAL.
This checkpoint is trained with a Weighted Cross-Entropy (WCE) objective. A sibling checkpoint trained with Weighted Generalized Cross-Entropy (WGCE, a noise-tolerant loss) is at hazemkhaled-94/modernlogbert-gce.
Built with the log-lens project, which also provides the Drain3 preprocessing pipeline these inputs require (see "How to use").
Intended use
- Intended: triage and observability research, i.e. predicting or sanity-checking log severity, and flagging entries whose predicted severity disagrees with the emitted level as candidate anomalies.
- Out of scope: a sole source of truth for alerting or incident severity. Aggregate accuracy hides brittle behavior on unfamiliar log formats, so keep a human in the loop.
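The disagreement check mentioned under intended use could be sketched roughly as follows. This is a minimal illustration, not part of the log-lens project: `predict` is a hypothetical callable that wraps the model and returns a level name, and `min_gap` is an assumed tuning knob. The level order is taken from the six levels listed in this card.

```python
from typing import Callable, Iterable

# Severity order taken from the six levels listed in this card.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]
RANK = {name: i for i, name in enumerate(LEVELS)}


def flag_disagreements(
    records: Iterable[tuple[str, str]],
    predict: Callable[[str], str],
    min_gap: int = 1,
) -> list[dict]:
    """Return records whose predicted level differs from the emitted level.

    records: (emitted_level, masked_message) pairs.
    predict: hypothetical callable mapping a masked message to a level name.
    min_gap: minimum rank distance that counts as a disagreement (assumption).
    """
    flagged = []
    for emitted, message in records:
        predicted = predict(message)
        gap = abs(RANK[predicted] - RANK[emitted])
        if gap >= min_gap:
            flagged.append(
                {
                    "message": message,
                    "emitted": emitted,
                    "predicted": predicted,
                    "gap": gap,
                }
            )
    # Largest disagreements first, so reviewers see the strongest candidates.
    return sorted(flagged, key=lambda r: r["gap"], reverse=True)
```

Flagged entries are candidates for human review only, in line with the out-of-scope note above.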
How to use
Inputs must be Drain3-masked the same way as in training (variables
replaced by placeholders such as <NUM>, <IP>, <UUID>); raw text degrades
predictions. The log-lens
repo ships a ready-to-use Drain3 preprocessing pipeline that produces exactly
this masked form.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo = "hazemkhaled-94/modernlogbert-wce"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo).eval()  # inference mode

text = "Connection refused after <NUM> retries"  # Drain3-masked input
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():  # no gradients needed for prediction
    pred = model(**inputs).logits.argmax(-1).item()

print(model.config.id2label[pred])  # predicted severity label
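To make the masked input form described in this section concrete, here is a small regex-based stand-in. It is an assumption-laden illustration only: it is not Drain3 and not the log-lens pipeline, the patterns are hypothetical, and its output may differ from the masking used in training. For real inputs, use the log-lens preprocessing.

```python
import re

# Illustrative patterns only (assumptions); the log-lens Drain3 configuration
# is the reference for what the model actually saw during training.
# Order matters: mask UUIDs and IPs before bare numbers.
MASKS = [
    (
        re.compile(
            r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
            r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
        ),
        "<UUID>",
    ),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def mask_line(line: str) -> str:
    """Replace variable-looking tokens with placeholders, one pattern at a time."""
    for pattern, placeholder in MASKS:
        line = pattern.sub(placeholder, line)
    return line
```

For example, `mask_line("Connection refused after 3 retries")` yields the masked string used in the snippet above.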
Training data
- In-distribution (train + held-out eval): a publicly available collection of system log corpora (loghub), preprocessed into a level-balanced, stratified sample.
- Out-of-distribution (evaluation only): a single private industrial Kubernetes log deployment. Not released; used purely as an OOD generalization probe.
Training procedure
| Hyperparameter | Value |
|---|---|
| Backbone | ModernBERT-base |
| Loss | Weighted Cross-Entropy |
| Epochs | 8 |
| Per-device batch size | 32 |
| Gradient accumulation | 4 (effective batch 128) |
| Learning rate | 1e-5 (separate LRs for head vs backbone) |
| Weight decay | 0.01 |
| Warmup ratio | 0.1 |
| Max sequence length | 512 |
| Best-model metric | macro F1 |
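For readers unfamiliar with the loss named in the table, the following dependency-free sketch shows one common form of weighted cross-entropy. The card does not say how the class weights were chosen, so the inverse-frequency scheme here is an assumption, and the function names are illustrative. The normalization by the sum of target weights mirrors the convention of PyTorch's `CrossEntropyLoss` with `weight` set and `reduction="mean"`.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Weight class c by N / (K * count_c); an assumed balancing heuristic."""
    counts = Counter(labels)
    total = len(labels)
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def weighted_cross_entropy(batch_logits, targets, weights):
    """Weighted mean of per-example negative log-likelihoods."""
    num = 0.0
    den = 0.0
    for logits, y in zip(batch_logits, targets):
        num += -weights[y] * log_softmax(logits)[y]
        den += weights[y]
    return num / den
```

With all weights equal to 1, this reduces to ordinary mean cross-entropy.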
Evaluation
In-distribution (held-out stratified slice)
| Metric | Value |
|---|---|
| Accuracy | 88.18% |
| Macro precision | 0.7647 |
| Macro recall | 0.8271 |
| Macro F1 | 0.7813 |
| Weighted F1 | 0.8947 |
On the curated in-distribution slice, this WCE checkpoint is stronger and better calibrated than the WGCE sibling.
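The table reports both macro and weighted F1. As a reminder of how the two differ, here is a small sketch using the standard definitions; the function names are illustrative and the toy labels in the usage note are made up, not taken from the evaluation set.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, labels):
    """F1 per class, computed as 2*TP / (2*TP + FP + FN)."""
    scores = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_and_weighted_f1(y_true, y_pred, labels):
    """Macro F1 averages classes equally; weighted F1 averages by class support."""
    f1 = per_class_f1(y_true, y_pred, labels)
    support = Counter(y_true)
    macro = sum(f1[c] for c in labels) / len(labels)
    weighted = sum(f1[c] * support[c] for c in labels) / len(y_true)
    return macro, weighted
```

On a toy set with eight correctly predicted `INFO` lines and two `FATAL` lines of which one is predicted as `INFO`, macro F1 comes out lower than weighted F1, because the weaker, rarer class counts equally in the macro average.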
Out-of-distribution
Evaluated on a private industrial Kubernetes domain, a different log distribution than training. Performance degraded modestly but stayed usable, the expected cost of moving to unfamiliar formats. As always for OOD use, validate on your own log distribution before relying on it.
Limitations and biases
- OOD generalization: only modest degradation was observed on a single private industrial domain; other distributions are unverified, so validate on your own logs.
- Confidence ≠ correctness: treat scores as signals, not guarantees.
- Preprocessing coupling: inputs must be Drain3-masked exactly as in training (use the log-lens preprocessing pipeline).
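One way to act on the "scores as signals" point is to defer low-confidence predictions to a human. The sketch below is a hedged illustration: `predict_or_defer` and the `0.8` threshold are hypothetical, and the `id2label` mapping passed in should come from `model.config.id2label` rather than being assumed. A high softmax score still does not guarantee a correct label, so any threshold should be chosen on your own validation logs.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_defer(logits, id2label, threshold=0.8):
    """Return (label, score); label is None when the top score is below threshold.

    threshold is an assumed value, not one recommended by the model authors.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = id2label[best] if probs[best] >= threshold else None
    return label, probs[best]
```

A `None` label signals that the line should be routed to manual triage instead of being auto-labeled.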
Model tree for hazemkhaled-94/modernlogbert-wce
Base model: answerdotai/ModernBERT-base
Evaluation results (self-reported)
- Accuracy (in-distribution): 0.882
- Macro F1 (in-distribution): 0.781