---
library_name: transformers
license: mit
base_model: microsoft/deberta-v3-large
tags:
  - generated_from_trainer
metrics:
  - accuracy
  - f1
model-index:
  - name: kishankachhadiya/debarta-text-classifier
    results:
      - task:
          type: text-classification
          name: Human vs AI Text Classification
        dataset:
          name: custom_text_dataset
          type: custom
          split: test
          size: 10050
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.998
          - name: Precision
            type: precision
            value: 0.998
          - name: Recall
            type: recall
            value: 0.998
          - name: F1 Score
            type: f1
            value: 0.998
          - name: Validation Loss (Best)
            type: loss
            value: 0.004071
          - name: Step (Best Checkpoint)
            type: step
            value: 5000
---

# 🧠 AI Text Detector – DeBERTa v3 Large (Fine-tuned on Human vs AI Text)

This model is a fine-tuned version of microsoft/deberta-v3-large, trained on a labeled dataset to classify text as AI-generated or human-written.


## ⚙️ Model Details

- Base Model: microsoft/deberta-v3-large
- Fine-tuned by: @kishan
- Epochs: 4
- Learning Rate: 2e-05
- Batch Size: 31
- GPU: 80 GB A100
- Optimizer: AdamW (Fused)
- Scheduler: Cosine
- Mixed Precision: FP16
- Gradient Checkpointing: Enabled
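
For reference, these hyperparameters roughly correspond to the following Hugging Face `TrainingArguments` setup. This is a hedged sketch, not the author's actual training script; the output directory is a placeholder and the dataset wiring is omitted.

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

base = "microsoft/deberta-v3-large"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)
model.gradient_checkpointing_enable()       # Gradient Checkpointing: Enabled

training_args = TrainingArguments(
    output_dir="deberta-ai-text-detector",  # placeholder output path
    num_train_epochs=4,                     # Epochs: 4
    learning_rate=2e-5,                     # Learning Rate: 2e-05
    per_device_train_batch_size=31,         # Batch Size: 31
    optim="adamw_torch_fused",              # Optimizer: AdamW (Fused)
    lr_scheduler_type="cosine",             # Scheduler: Cosine
    fp16=True,                              # Mixed Precision: FP16
)
# A Trainer would then be built from this model, training_args, and the
# tokenized train/validation splits of the labeled dataset.
```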

## 📊 Evaluation Results (Test Set)

| Metric | Score |
|---|---|
| Accuracy | 0.998 |
| Human (0) – Precision | 0.999 |
| Human (0) – Recall | 0.997 |
| AI (1) – Precision | 0.997 |
| AI (1) – Recall | 0.999 |
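
The per-class precision and recall above can be reproduced from the test-set predictions with scikit-learn's `classification_report`; a minimal sketch, where `y_true` and `y_pred` are placeholders for the gold and predicted labels collected over the test set:

```python
from sklearn.metrics import classification_report

# Placeholder labels; in practice these are the 10,050 test-set gold and
# predicted labels (0 = Human, 1 = AI).
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 1, 1]

print(classification_report(y_true, y_pred, target_names=["Human (0)", "AI (1)"], digits=3))
```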

## 🧮 Confusion Matrix

*Figure: confusion matrix on the test set.*
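
The confusion matrix can be regenerated from the same predictions; a sketch using scikit-learn (matplotlib required, and again with placeholder `y_true` / `y_pred` arrays):

```python
from sklearn.metrics import ConfusionMatrixDisplay

# Placeholder labels, as above; substitute the real test-set labels.
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 1, 1]

disp = ConfusionMatrixDisplay.from_predictions(
    y_true, y_pred, display_labels=["Human (0)", "AI (1)"]
)
disp.figure_.savefig("confusion_matrix.png")
```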


## 🚀 Example Inference

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned tokenizer and classifier from the Hub
tokenizer = AutoTokenizer.from_pretrained("kishankachhadiya/debarta-text-classifier")
model = AutoModelForSequenceClassification.from_pretrained("kishankachhadiya/debarta-text-classifier")
model.eval()

text = "This text was likely written by an AI model."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Forward pass without gradient tracking; convert logits to class probabilities
with torch.no_grad():
    outputs = model(**inputs)
probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(probs)
```
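
The output is a probability per class. Continuing from the snippet above, and assuming the label order used in the evaluation table (index 0 = Human, index 1 = AI), which should be confirmed against `model.config.id2label`, the prediction can be read off like this:

```python
# Assumed label order from the evaluation table: 0 = Human, 1 = AI.
# Verify against model.config.id2label before relying on this mapping.
label_map = {0: "Human", 1: "AI"}
pred = int(torch.argmax(probs, dim=-1))
print(f"Prediction: {label_map[pred]} (p = {probs[0, pred]:.3f})")
```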