---
library_name: peft
license: apache-2.0
tags:
  - json-extraction
  - modernbert
  - lora
  - diffuberta
metrics:
  - name: train_loss
    value: 4.7773
  - name: eval_loss
    value: 4.316555023193359
---

# DiffuBERTa: JSON Extraction Adapter

This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) using LoRA. It is designed to extract structured JSON data from unstructured text using a parallel decoding approach, filling all masked template slots at once rather than generating them one token at a time.

## Model Performance

- **Final Training Loss:** 4.7773
- **Final Evaluation Loss:** 4.3166
- **Training Epochs:** 5
- **Date Trained:** 2025-11-28

## 🚀 Live Demo Output

*(Generated automatically after training)*

**Input Text:**

> "We are excited to welcome Dr. Sarah to our Paris office as Senior Data Scientist."

**Template:**

```python
{'name': '[1]', 'job': '[2]', 'city': '[1]'}
```

**Model Output:**

```json
{
  "name": "Sarah",
  "job": "Data scientist",
  "city": "Paris"
}
```

## Usage

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
base_model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")
model = PeftModel.from_pretrained(base_model, "philipp-zettl/DiffuBERTa")
# ... use extract_parallel helper ...
```
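The `extract_parallel` helper is not shipped with this card. Below is a minimal sketch of what such a helper might look like, under two assumptions (hypothetical, not the author's exact implementation): each `[n]` template marker stands for `n` mask tokens, as the demo output above suggests, and every slot is predicted greedily in a single forward pass (the "parallel decoding" the card describes).

```python
import re

import torch


def expand_template(template: str, mask_token: str) -> str:
    """Replace each [n] marker with n mask tokens, e.g. '[2]' -> '[MASK] [MASK]'.

    Assumption: [n] encodes the number of mask slots for that field.
    """
    return re.sub(
        r"\[(\d+)\]",
        lambda m: " ".join([mask_token] * int(m.group(1))),
        template,
    )


def extract_parallel(model, tokenizer, text: str, template: str) -> str:
    """Fill every masked slot of the template in one forward pass (sketch)."""
    filled = expand_template(template, tokenizer.mask_token)
    inputs = tokenizer(f"{text}\n{filled}", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Locate all mask positions in the encoded input.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    # Greedy prediction for all slots simultaneously -- no autoregressive loop.
    pred_ids = logits[0, mask_pos].argmax(dim=-1)
    result = filled
    for tid in pred_ids:
        result = result.replace(tokenizer.mask_token, tokenizer.decode(tid).strip(), 1)
    return result
```

With `model` and `tokenizer` loaded as in the Usage snippet, a call such as `extract_parallel(model, tokenizer, text, "{'name': '[1]', 'job': '[2]', 'city': '[1]'}")` would return the template with every slot filled.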