philipp-zettl committed on
Commit 928466f · verified · 1 Parent(s): 6f57c41

Training in progress, epoch 1

README.md CHANGED
@@ -1,73 +1,67 @@
  ---
  library_name: peft
  license: apache-2.0
+ base_model: answerdotai/ModernBERT-base
  tags:
- - json-extraction
- - modernbert
+ - base_model:adapter:answerdotai/ModernBERT-base
  - lora
- - diffuberta
- language: en
- metrics:
- - name: train_loss
-   value: 4.7773
- - name: eval_loss
-   value: 4.318033695220947
- datasets:
- - generated-json-pairs
+ - transformers
+ model-index:
+ - name: DiffuBERTa
+   results: []
  ---

- ---
- datasets:
- - generated-json-pairs
- language: en
- library_name: peft
- license: apache-2.0
- metrics:
- - name: train_loss
-   value: 4.7773
- - name: eval_loss
-   value: 4.318033695220947
- tags:
- - json-extraction
- - modernbert
- - lora
- - diffuberta
- ---
-
- # DiffuBERTa: JSON Extraction Adapter
-
- This model is a Fine-tuned version of **answerdotai/ModernBERT-base** using LoRA. It is designed to extract structured JSON data from unstructured text using a parallel decoding approach.
-
- ## Model Performance
- - **Final Training Loss**: 4.7773
- - **Final Evaluation Loss**: 4.318033695220947
- - **Training Epochs**: 5
- - **Date Trained**: 2025-11-28
-
- ## 🚀 Live Demo Output
- *(Generated automatically after training)*
-
- **Input Text:**
- > "We are excited to welcome Dr. Sarah to our Paris office as Senior Data Scientist."
-
- **Template:**
- > `{'name': '[1]', 'job': '[2]', 'city': '[1]'}`
-
- **Model Output:**
- ```json
- {
-   "name": "Sarah",
-   "job": "Data scientist",
-   "city": "Paris"
- }
- ```
-
- ## Usage
- ```python
- from transformers import AutoModelForMaskedLM, AutoTokenizer
- from peft import PeftModel
-
- base_model = AutoModelForMaskedLM.from_pretrained("answerdotai/ModernBERT-base")
- model = PeftModel.from_pretrained(base_model, "philipp-zettl/DiffuBERTa")
- # ... use extract_parallel helper ...
- ```
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # DiffuBERTa
+
+ This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 4.3180
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 3e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 16
+ - optimizer: ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_steps: 500
+ - num_epochs: 5
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | 15.3944       | 1.0   | 63   | 14.6510         |
+ | 13.7016       | 2.0   | 126  | 10.3954         |
+ | 10.2371       | 3.0   | 189  | 6.2723          |
+ | 5.5815        | 4.0   | 252  | 5.0812          |
+ | 4.7773        | 5.0   | 315  | 4.3180          |
+
+ ### Framework versions
+
+ - PEFT 0.18.0
+ - Transformers 4.57.3
+ - Pytorch 2.9.1+cu128
+ - Tokenizers 0.22.1
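The hyperparameter and results hunks above are internally consistent; a quick arithmetic sketch (values copied from the diff, variable names mine) shows how the derived figures follow:

```python
# Values from the "Training hyperparameters" list in the diff above.
train_batch_size = 8
gradient_accumulation_steps = 2

# The effective (total) train batch size is the per-step batch size
# multiplied by the number of gradient-accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 16, matching "total_train_batch_size: 16"

# The results table logs 63 optimizer steps per epoch; over 5 epochs that
# yields the 315 steps shown in the final row.
steps_per_epoch = 63
num_epochs = 5
print(steps_per_epoch * num_epochs)  # 315
```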
adapter_config.json CHANGED
@@ -29,10 +29,10 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
-   "W1",
-   "Wqkv",
    "Wo",
-   "W2"
+   "W2",
+   "W1",
+   "Wqkv"
  ],
  "target_parameters": null,
  "task_type": "FEATURE_EXTRACTION",
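The adapter_config.json hunk only reorders `target_modules`; PEFT matches modules by name rather than list position, so the set of targeted modules is unchanged. A one-liner confirms this (both lists copied from the diff):

```python
# target_modules before and after this commit, as shown in the diff above.
old_modules = ["W1", "Wqkv", "Wo", "W2"]
new_modules = ["Wo", "W2", "W1", "Wqkv"]

# Same modules, different serialization order.
print(set(old_modules) == set(new_modules))  # True
```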
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:27b153a25bf3d5e38441ccca5c52e8455ab6d353be028e0ebe8b95dab0073673
+ oid sha256:0e0e49115972bdf0ea99a58aceac7ead32207c21ef5f8367bd433253d2eb1553
  size 9207688
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8a1676b1da10f7730e9075290eef1895a2e69f3e8726e3b9cda4f4f068be0c40
+ oid sha256:60dd10c2a863cb1987768c6c8035956b4d7d4bdee636f5e9eeba2b53a47c0cef
  size 5905