upload model performance
README.md
CHANGED
@@ -21,28 +21,28 @@ model-index:
 revision: 172e61bb1dd20e43903f4c51e5cbec61ec9ae6e6
 metrics:
 - type: accuracy
-value: 0.
+value: 0.6762295081967213
 name: Accuracy 'Bezeichnung'
 - type: precision
-value: 0.
+value: 0.5688091249507292
 name: Precision 'Bezeichnung' (macro)
 - type: recall
-value: 0.
+value: 0.5981436148510813
 name: Recall 'Bezeichnung' (macro)
 - type: f1
-value: 0.
+value: 0.5693466048057273
 name: F1 'Bezeichnung' (macro)
 - type: accuracy
-value: 0.
+value: 0.8934426229508197
 name: Accuracy 'Thema'
 - type: precision
-value: 0.
+value: 0.9258716898716898
 name: Precision 'Thema' (macro)
 - type: recall
-value: 0.
+value: 0.8669105248121641
 name: Recall 'Thema' (macro)
 - type: f1
-value: 0.
+value: 0.8632335412054082
 name: F1 'Thema' (macro)
 ---

@@ -61,7 +61,7 @@ This model is based on bert-base-german-cased and fine-tuned on and-effect/mdk_g
 - **License:** [More Information Needed]
 - **Finetuned from model:** "bert-base-german-cased. For more information on the model, see [this model card](https://huggingface.co/bert-base-german-cased)"

-## Model Sources
+## Model Sources

 <!-- Provide the basic links for the model. -->

@@ -166,8 +166,8 @@ The model is fine-tuned with similar and dissimilar pairs. Similar pairs are bui

 | pairs | size |
 |-----|-----|
-| train_similar_pairs |
-| train_unsimilar_pairs |
+| train_similar_pairs | 1964 |
+| train_unsimilar_pairs | 982 |
 | test_similar_pairs | 498 |
 | test_unsimilar_pairs | 249 |

@@ -179,13 +179,13 @@ The model was trained with the parameters:
 `torch.utils.data.dataloader.DataLoader`

 **Loss**:
-`sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss`
+`sentence_transformers.losses.CosineSimilarityLoss.CosineSimilarityLoss`

 Hyperparameter:
 ```
 {
 "epochs": 3,
-"
+"warmup_steps": 100,
 }
 ```

@@ -198,7 +198,7 @@ Hyperparameter:

 # Evaluation

-All metrics express the model's ability to classify dataset titles from GOVDATA into the taxonomy described [here](https://huggingface.co/datasets/and-effect/mdk_gov_data_titles_clf). For more information, see the MDK project (link to be added).
+All metrics express the model's ability to classify dataset titles from GOVDATA into the taxonomy described [here](https://huggingface.co/datasets/and-effect/mdk_gov_data_titles_clf). For more information, see the MDK project (link to be added).

 ## Testing Data, Factors & Metrics

@@ -214,12 +214,12 @@ The model performance is tested with four metrics. Accuracy, Precision, Recall

 | ***task*** | ***accuracy*** | ***precision (macro)*** | ***recall (macro)*** | ***f1 (macro)*** |
 |-----|-----|-----|-----|-----|
-| Test dataset 'Bezeichnung' I | 0.
-| Test dataset 'Thema' I | 0.
-| Test dataset 'Bezeichnung' II | 0.
+| Test dataset 'Bezeichnung' I | 0.6762295081967213 | 0.5688091249507292 | 0.5981436148510813 | 0.5693466048057273 |
+| Test dataset 'Thema' I | 0.8934426229508197 | 0.9258716898716898 | 0.8669105248121641 | 0.8632335412054082 |
+| Test dataset 'Bezeichnung' II | 0.6762295081967213 | 0.5598761408083442 | 0.7875393612235718 | 0.6306226331603018 |
 | Validation dataset 'Bezeichnung' I | 0.5445544554455446 | 0.41787439613526567 | 0.39929183135704877 | 0.4010173484686228 |
 | Validation dataset 'Thema' I | 0.801980198019802 | 0.6433080808080808 | 0.7039711632453568 | 0.6591710279769981 |
-| Validation dataset 'Bezeichnung' II | 0.5445544554455446 | 0.
+| Validation dataset 'Bezeichnung' II | 0.5445544554455446 | 0.6018518518518517 | 0.6278409090909091 | 0.6066776135741653 |


 ### Summary
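
The training section of the README names `sentence_transformers.losses.CosineSimilarityLoss` as the loss. With its default settings, that loss compares the cosine similarity of the two sentence embeddings against a float label using mean squared error. A dependency-free sketch of that computation follows. The helper names, the toy vectors, and the labels 1.0 for similar and 0.0 for dissimilar pairs are assumptions made for illustration, not values taken from the model card.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs, labels):
    """Mean squared error between pairwise cosine similarities and labels."""
    errors = [
        (cosine_similarity(u, v) - y) ** 2
        for (u, v), y in zip(pairs, labels)
    ]
    return sum(errors) / len(errors)


# Hypothetical embeddings: one "similar" pair and one "dissimilar" pair.
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # identical direction, cosine = 1
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal, cosine = 0
]
toy_labels = [1.0, 0.0]  # assumed label convention, not from the model card

toy_loss = cosine_similarity_loss(toy_pairs, toy_labels)
```

In this toy case the embeddings already match their labels, so the loss is zero; labelling the orthogonal pair as similar instead would push the loss up.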
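
The evaluation table in the README reports accuracy plus macro-averaged precision, recall, and F1. As a rough illustration of what macro averaging means, the sketch below computes per-class scores and takes their unweighted mean. The function name `macro_scores` and the toy labels are hypothetical, and the model card does not say which library produced the reported numbers. The sketch follows the common convention in which macro F1 is the mean of the per-class F1 scores.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def safe_div(num, den):
        # Undefined ratios (no predictions or no samples) count as 0.0.
        return num / den if den else 0.0

    precisions, recalls, f1s = [], [], []
    for label in labels:
        prec = safe_div(tp[label], tp[label] + fp[label])
        rec = safe_div(tp[label], tp[label] + fn[label])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(safe_div(2 * prec * rec, prec + rec))

    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision_macro": safe_div(sum(precisions), n),
        "recall_macro": safe_div(sum(recalls), n),
        "f1_macro": safe_div(sum(f1s), n),
    }


# Toy example with three hypothetical classes.
scores = macro_scores(["a", "a", "b", "c"], ["a", "b", "b", "c"])
```

Because every class counts equally, a rare class with poor recall pulls the macro scores down even when overall accuracy is high, which is one reason the macro columns can sit well below the accuracy column.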