🚀 HDFS Failure Prediction Model

Developed by: Shashank Choudhary (@Sha09090)
License: MIT

This model detects failures in HDFS logs with 99.95% accuracy.
It was trained on a balanced dataset of ~575k log entries.

📊 Benchmarks

Metric	Score
Accuracy	99.95%
Precision	99.99%
Recall	99.96%
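
For reference, the three metrics above follow the standard confusion-matrix definitions. The sketch below shows how they are computed. The function name and the counts are hypothetical and purely illustrative; they are not this model's actual evaluation results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision and recall from confusion-matrix counts.

    tp/fp/tn/fn are the true/false positive/negative counts, where
    "positive" means a log entry labelled as a failure.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On a balanced dataset, accuracy is a reasonable headline number. On real HDFS logs, where failures are typically rare, precision and recall are usually the more informative pair.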

💻 How to Use

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and model from the Hugging Face Hub
model_name = "Sha09090/hdfs-failure-prediction"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

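# Tokenize a single log line (truncated to the model's maximum length) and run inference
log = "PacketResponder: error for block blk_12345 terminating"
inputs = tokenizer(log, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Index 1 is taken to be the failure class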
print("Failure Probability:", torch.softmax(logits, dim=1)[0][1].item())
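
The snippet above prints a raw probability. To turn it into a yes/no decision, one option is to apply a threshold to the softmax output. The following is a minimal sketch, not part of the model's API: the helper names and the default threshold of 0.5 are assumptions, and index 1 is treated as the failure class, as in the snippet above. It works on a plain list of two logits (for example `logits[0].tolist()`), so it can be tried without loading the model.

```python
import math


def failure_probability(logits):
    """Numerically stable softmax over a [normal, failure] logit pair.

    Returns the probability assigned to index 1 (assumed failure class).
    """
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    return exps[1] / sum(exps)


def is_failure(logits, threshold=0.5):
    """Flag a log entry as a failure if P(failure) reaches the threshold.

    The 0.5 default is an illustrative assumption; tune it on held-out data.
    """
    return failure_probability(logits) >= threshold


# Hypothetical logits standing in for `logits[0].tolist()`.
example_logits = [-2.0, 3.0]
prob = failure_probability(example_logits)
print(f"Failure probability: {prob:.4f}, flagged: {is_failure(example_logits)}")
```

Raising the threshold trades recall for precision, which may be preferable when false alarms are costly.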

@misc{hdfs-failure-prediction,
  author = {Shashank Choudhary},
  title = {HDFS Failure Prediction Transformer},
  year = {2026},
  publisher = {Hugging Face},
  journal = {Hugging Face Repository},
  howpublished = {\url{https://huggingface.co/Sha09090/hdfs-failure-prediction}}
}

Safetensors
Model size: 67M params
Tensor type: F32