---
language:
- en
metrics:
- accuracy
- precision
- recall
- f1
library_name: transformers
pipeline_tag: text-classification
---
# bert-large-relation14
Fine-tuned BERT model for 14-class classification of discourse relations. It was introduced in the paper [Automatic Slide Generation Using Discourse Relations](https://link.springer.com/chapter/10.1007/978-3-031-36336-8_61) and first released in this repository. This model is uncased: it does not distinguish between english and English.
In the method proposed in this [paper](https://link.springer.com/chapter/10.1007/978-3-031-36336-8_61), this model is used only to classify the discourse relation between the FIRST and SECOND sentences of the summarized text. The relations between the remaining sentences are classified by [this model](https://huggingface.co/teppei727/bert_woco). If you are interested in our proposed method, we recommend looking at that model as well.
# Description
This model classifies the discourse relation between the two sentences of an input sentence pair.
This model card is still being prepared; more details will be added soon.
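
To illustrate the sentence-pair classification described above, here is a minimal inference sketch. It rests on several assumptions that this card does not confirm: the hub ID `teppei727/bert-large-relation14` is inferred from the card title and author name, the helper names (`softmax`, `top_label`, `classify_pair`) are hypothetical, and the label strings come from whatever `id2label` mapping is stored in the model config.

```python
import math

# Assumption: hub ID inferred from this card's title and author name.
MODEL_ID = "teppei727/bert-large-relation14"


def softmax(logits):
    """Convert a list of raw scores into probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify_pair(first, second, model_id=MODEL_ID):
    """Predict the discourse relation between two sentences."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Passing two strings encodes them together as a single sentence-pair input.
    inputs = tokenizer(first, second, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_label(logits, model.config.id2label)
```

Calling `classify_pair("The battery was empty.", "The robot stopped moving.")` would download the weights and return a `(label, probability)` tuple. If the config carries no custom label names, the labels will be generic strings such as `LABEL_3`.
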
The model was fine-tuned from [bert-large-uncased](https://huggingface.co/bert-large-uncased) on the dataset published in the paper [Automatic Prediction of Discourse Connectives](https://arxiv.org/abs/1702.00992).
The dataset is based on English Wikipedia data and has 20 labels, but this model classifies into 14 labels, because the 20-class dataset was restructured into 14 classes to suit our research objective of "automatic slide generation". The connectives and their relation senses are shown below.

|Level 1|Level 2|Level 3|Connectives (20)|
|-------------|-----------------|------------------|--------------------|
| Temporal | Synchronous | | meanwhile |
| Temporal | Asynchronous | Precedence | then, |
| Temporal | Asynchronous | Precedence | finally, |
| Temporal | Asynchronous | Succession | by then |
| Contingency | Cause | Result | therefore |
| Comparison | Concession | Arg2-as-denier | however, |
| Comparison | Concession | Arg2-as-denier | nevertheless |
| Comparison | Contrast | | on the other hand, |
| Comparison | Contrast | | by contrast, |
| Expansion | Conjunction | | and |
| Expansion | Conjunction | | moreover |
| Expansion | Conjunction | | indeed |
| Expansion | Equivalence | | in other words |
| Expansion | Exception | Arg1-as-excpt | otherwise |
| Expansion | Instantiation | Arg2-as-instance | for example, |
| Expansion | Level-of-detail | Arg1-as-detail | overall, |
| Expansion | Level-of-detail | Arg2-as-detail | in particular, |
| Expansion | Substitution | Arg2-as-subst | instead |
| Expansion | Substitution | Arg2-as-subst | rather |
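
As a rough illustration of how connectives collapse into coarser relation senses, the sketch below groups the rows listed above by their (Level 1, Level 2, Level 3) path. The name `group_by_sense` is hypothetical, and the card does not state how these groups correspond to the model's 14 output labels, so treat this only as a way to inspect the table, not as the authors' label mapping.

```python
from collections import defaultdict

# (Level 1, Level 2, Level 3, connective), copied from the rows of the table above.
SENSES = [
    ("Temporal", "Synchronous", "", "meanwhile"),
    ("Temporal", "Asynchronous", "Precedence", "then,"),
    ("Temporal", "Asynchronous", "Precedence", "finally,"),
    ("Temporal", "Asynchronous", "Succession", "by then"),
    ("Contingency", "Cause", "Result", "therefore"),
    ("Comparison", "Concession", "Arg2-as-denier", "however,"),
    ("Comparison", "Concession", "Arg2-as-denier", "nevertheless"),
    ("Comparison", "Contrast", "", "on the other hand,"),
    ("Comparison", "Contrast", "", "by contrast,"),
    ("Expansion", "Conjunction", "", "and"),
    ("Expansion", "Conjunction", "", "moreover"),
    ("Expansion", "Conjunction", "", "indeed"),
    ("Expansion", "Equivalence", "", "in other words"),
    ("Expansion", "Exception", "Arg1-as-excpt", "otherwise"),
    ("Expansion", "Instantiation", "Arg2-as-instance", "for example,"),
    ("Expansion", "Level-of-detail", "Arg1-as-detail", "overall,"),
    ("Expansion", "Level-of-detail", "Arg2-as-detail", "in particular,"),
    ("Expansion", "Substitution", "Arg2-as-subst", "instead"),
    ("Expansion", "Substitution", "Arg2-as-subst", "rather"),
]


def group_by_sense(rows):
    """Map each (level1, level2, level3) sense path to its connectives."""
    groups = defaultdict(list)
    for level1, level2, level3, connective in rows:
        groups[(level1, level2, level3)].append(connective)
    return dict(groups)


groups = group_by_sense(SENSES)
for sense, connectives in groups.items():
    # Join only the non-empty levels, e.g. "Temporal.Asynchronous.Precedence".
    print(".".join(part for part in sense if part), "->", connectives)
```

Connectives that share a sense path, such as "and", "moreover", and "indeed" under Expansion.Conjunction, end up in the same group.
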
# Training
The model was loaded with `AutoModelForSequenceClassification.from_pretrained` and trained with the following `TrainingArguments`:
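```python
from transformers import TrainingArguments

output_dir = "./results"  # hypothetical path; the original value is not given

training_args = TrainingArguments(
    output_dir=output_dir,
    save_strategy="epoch",
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    warmup_steps=0,
    weight_decay=0.01,
    logging_dir="./logs",
    evaluation_strategy="epoch",  # named eval_strategy in recent transformers releases
    learning_rate=2e-5,
    metric_for_best_model="f1",  # requires a compute_metrics function that returns "f1"
    load_best_model_at_end=True,
)
```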
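
Because `metric_for_best_model="f1"` is set, the `Trainer` needs a `compute_metrics` function whose result contains an `"f1"` key. The original function is not included in this card, so the following is only a sketch of what it might look like. It assumes macro averaging for precision, recall, and F1 over the classes that occur in the labels or predictions, and it is written in plain Python so it accepts either lists or NumPy arrays.

```python
def compute_metrics(eval_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    logits, labels = eval_pred  # the Trainer passes (predictions, label_ids)
    # Predicted class = index of the largest logit in each row.
    preds = [max(range(len(row)), key=lambda i: row[i]) for row in logits]
    golds = [int(y) for y in labels]

    classes = sorted(set(golds) | set(preds))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(p == c and g == c for p, g in zip(preds, golds))
        fp = sum(p == c and g != c for p, g in zip(preds, golds))
        fn = sum(p != c and g == c for p, g in zip(preds, golds))
        # Undefined ratios (zero denominator) are counted as 0.
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(classes) or 1  # avoid division by zero on empty input
    correct = sum(p == g for p, g in zip(preds, golds))
    return {
        "accuracy": correct / len(golds) if golds else 0.0,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Passing this function as `compute_metrics=compute_metrics` when constructing the `Trainer` makes the value available as `eval_f1`, which is the name the `Trainer` looks up for `metric_for_best_model="f1"`.
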
# Evaluation
Evaluation on the test split of the dataset, for both the 14-label and the original 20-label classification, gives:

| Model | Macro F1 | Accuracy | Precision | Recall |
|--------------------------|-----------------|-----------------|------------------|---------------|
| 14 labels classification | 0.586 | 0.589 | 0.630 | 0.591 |
| 20 labels classification | 0.478 | 0.488 | 0.536 | 0.488 |