---
library_name: peft
license: apache-2.0
tags:
  - json-extraction
  - modernbert
  - lora
  - diffuberta
metrics:
  - name: train_loss
    value: 4.7773
  - name: eval_loss
    value: 4.316555023193359
---

# DiffuBERTa: JSON Extraction Adapter

This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) using LoRA. It is designed to extract structured JSON data from unstructured text using a parallel decoding approach, filling all masked template slots at once rather than generating them one token at a time.

## Model Performance

- **Final Training Loss:** 4.7773
- **Final Evaluation Loss:** 4.3166
- **Training Epochs:** 5
- **Date Trained:** 2025-11-28

## 🚀 Live Demo Output

*(Generated automatically after training)*

**Input Text:**

> "We are excited to welcome Dr. Sarah to our Paris office as Senior Data Scientist."

**Template:**

```python
{'name': '[1]', 'job': '[2]', 'city': '[1]'}
```

**Model Output:**

```json
{
  "name": "Sarah",
  "job": "Data scientist",
  "city": "Paris"
}
```

## Usage

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
base_model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")
model = PeftModel.from_pretrained(base_model, "philipp-zettl/DiffuBERTa")
# ... use extract_parallel helper ...
```
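The `extract_parallel` helper is not shipped with this card. Below is a minimal sketch of what such a helper might look like, under two assumptions (hypothetical, not the author's exact implementation): each `[n]` template marker stands for `n` mask tokens, as the demo output above suggests, and every slot is predicted greedily in a single forward pass (the "parallel decoding" the card describes).

```python
import re

import torch


def expand_template(template: str, mask_token: str) -> str:
    """Replace each [n] marker with n mask tokens, e.g. '[2]' -> '[MASK] [MASK]'.

    Assumption: [n] encodes the number of mask slots for that field.
    """
    return re.sub(
        r"\[(\d+)\]",
        lambda m: " ".join([mask_token] * int(m.group(1))),
        template,
    )


def extract_parallel(model, tokenizer, text: str, template: str) -> str:
    """Fill every masked slot of the template in one forward pass (sketch)."""
    filled = expand_template(template, tokenizer.mask_token)
    inputs = tokenizer(f"{text}\n{filled}", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate all mask positions in the encoded input.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    # Greedy prediction for all slots simultaneously -- no autoregressive loop.
    pred_ids = logits[0, mask_pos].argmax(dim=-1)
    result = filled
    for tid in pred_ids:
        result = result.replace(tokenizer.mask_token, tokenizer.decode(tid).strip(), 1)
    return result
```

With `model` and `tokenizer` loaded as in the Usage snippet, a call such as `extract_parallel(model, tokenizer, text, "{'name': '[1]', 'job': '[2]', 'city': '[1]'}")` would return the template with every slot filled.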