---
library_name: peft
license: apache-2.0
tags:
- json-extraction
- modernbert
- lora
- diffuberta
metrics:
- name: train_loss
  value: 4.7773
- name: eval_loss
  value: 4.316555023193359
---

# DiffuBERTa: JSON Extraction Adapter

This model is a fine-tuned version of **answerdotai/ModernBERT-base** using LoRA. It extracts structured JSON data from unstructured text using parallel decoding: all template fields are predicted together by the masked LM rather than one token at a time.

## Model Performance

- **Final Training Loss**: 4.7773
- **Final Evaluation Loss**: 4.3166
- **Training Epochs**: 5
- **Date Trained**: 2025-11-28

## 🚀 Live Demo Output

*(Generated automatically after training)*

**Input Text:**

> "We are excited to welcome Dr. Sarah to our Paris office as Senior Data Scientist."

**Template:**

> `{'name': '[1]', 'job': '[2]', 'city': '[1]'}`

**Model Output:**

```json
{
  "name": "Sarah",
  "job": "Data scientist",
  "city": "Paris"
}
```

## Usage

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")
tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = PeftModel.from_pretrained(base_model, "philipp-zettl/DiffuBERTa")
# ... use extract_parallel helper ...
```
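The `extract_parallel` helper itself is not included in this card. As a rough illustration of the parallel-decoding idea, the sketch below assumes (this is an assumption, not the card's documented scheme) that each bracketed number `[n]` in the template marks a field budgeted `n` mask tokens, which the masked LM then fills in a single forward pass:

```python
import re


def expand_template(template: str, mask_token: str = "[MASK]") -> str:
    """Replace each '[n]' slot with n mask tokens.

    Hypothetical convention: '[2]' means the field gets two
    mask tokens for the model to fill.
    """
    return re.sub(
        r"\[(\d+)\]",
        lambda m: " ".join([mask_token] * int(m.group(1))),
        template,
    )


def extract_parallel(text, template, model, tokenizer):
    """Fill every mask slot of the template in one forward pass.

    Sketch only: the helper actually used to train DiffuBERTa may differ.
    """
    import torch  # local import keeps expand_template dependency-free

    prompt = f"{text} {expand_template(template, tokenizer.mask_token)}"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    mask_positions = inputs["input_ids"][0] == tokenizer.mask_token_id
    predicted_ids = logits[0, mask_positions].argmax(dim=-1)
    # One decoded token per mask slot, in template order
    return tokenizer.convert_ids_to_tokens(predicted_ids.tolist())


print(expand_template("{'name': '[1]', 'job': '[2]', 'city': '[1]'}"))
# → {'name': '[MASK]', 'job': '[MASK] [MASK]', 'city': '[MASK]'}
```

In practice the mask token string and id would come from the loaded tokenizer (`tokenizer.mask_token`, `tokenizer.mask_token_id`) rather than the `"[MASK]"` default shown here.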