bert-large-relation14
Fine-tuned BERT model for 14-class classification. It was introduced in the paper Automatic Slide Generation Using Discourse Relations and first released in this repository. This model is uncased: it does not distinguish between english and English.
In the method proposed in that paper, this model is used only to classify the discourse relation between the FIRST and SECOND sentences of the summarized text. A different model is used for the remaining sentence pairs. If you are curious about our proposed method, that other model is the better one to look at.
Description
This model classifies the relation between the two sentences of an input sentence pair.
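A minimal sketch of classifying one sentence pair. It assumes the pair is encoded the usual BERT way, as two segments, and that the checkpoint's `id2label` config carries the label names; the card does not state either point. The helper names `softmax`, `best_label`, and `classify_pair` are illustrative and not part of the released code.

```python
import math


def softmax(logits):
    """Turn a list of raw scores into probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def best_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=lambda i: probs[i])
    return id2label[idx], probs[idx]


def classify_pair(first, second, model_name="teppei727/bert-large-relation14"):
    """Predict the relation between two sentences (downloads the model)."""
    # Imported lazily so the helpers above work without transformers installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()
    # Assumption: the two sentences are passed as a BERT segment pair.
    inputs = tokenizer(first, second, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return best_label(logits, model.config.id2label)


# Example call (needs network access to fetch the checkpoint):
# classify_pair("It rained all night.", "The match was cancelled.")
```

If the checkpoint was saved without label names, `id2label` falls back to generic entries such as `LABEL_0`, so the returned label may need to be mapped to a relation name by hand.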
We are still preparing the model card. Please wait a few days.
The model was fine-tuned from bert-large-uncased on the dataset published in the paper Automatic Prediction of Discourse Connectives.
The dataset used to build this model is based on English Wikipedia data and has 20 labels. This model, however, classifies into 14 labels, because the 20-class dataset was restructured into 14 classes to suit our research objective of "automatic slide generation". The label hierarchy is shown below.
| Level 1 | Level 2 | Level 3 | Connectives (20) |
|---|---|---|---|
| Temporal | Synchronous | | meanwhile |
| Temporal | Asynchronous | Precedence | then, |
| Temporal | Asynchronous | Precedence | finally, |
| Temporal | Asynchronous | Succession | by then |
| Contingency | Cause | Result | therefore |
| Comparison | Concession | Arg2-as-denier | however, |
| Comparison | Concession | Arg2-as-denier | nevertheless |
| Comparison | Contrast | | on the other hand, |
| Comparison | Contrast | | by contrast, |
| Expansion | Conjunction | | and |
| Expansion | Conjunction | | moreover |
| Expansion | Conjunction | | indeed |
| Expansion | Equivalence | | in other words |
| Expansion | Exception | Arg1-as-excpt | otherwise |
| Expansion | Instantiation | Arg2-as-instance | for example, |
| Expansion | Level-of-detail | Arg1-as-detail | overall, |
| Expansion | Level-of-detail | Arg2-as-detail | in particular, |
| Expansion | Substitution | Arg2-as-subst | instead |
| Expansion | Substitution | Arg2-as-subst | rather |
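The card does not state the exact 20-to-14 mapping, so the following is only an illustration of how connective labels can be collapsed into coarser sense labels. It groups the connectives listed in the table by their finest available sense (Level 3 where present, otherwise Level 2). The names `ROWS`, `sense_key`, and `group_by_sense` are hypothetical, and the resulting groups are not claimed to be the model's actual 14 classes.

```python
from collections import defaultdict

# (Level 1, Level 2, Level 3 or None, connective), copied from the table above.
ROWS = [
    ("Temporal", "Synchronous", None, "meanwhile"),
    ("Temporal", "Asynchronous", "Precedence", "then,"),
    ("Temporal", "Asynchronous", "Precedence", "finally,"),
    ("Temporal", "Asynchronous", "Succession", "by then"),
    ("Contingency", "Cause", "Result", "therefore"),
    ("Comparison", "Concession", "Arg2-as-denier", "however,"),
    ("Comparison", "Concession", "Arg2-as-denier", "nevertheless"),
    ("Comparison", "Contrast", None, "on the other hand,"),
    ("Comparison", "Contrast", None, "by contrast,"),
    ("Expansion", "Conjunction", None, "and"),
    ("Expansion", "Conjunction", None, "moreover"),
    ("Expansion", "Conjunction", None, "indeed"),
    ("Expansion", "Equivalence", None, "in other words"),
    ("Expansion", "Exception", "Arg1-as-excpt", "otherwise"),
    ("Expansion", "Instantiation", "Arg2-as-instance", "for example,"),
    ("Expansion", "Level-of-detail", "Arg1-as-detail", "overall,"),
    ("Expansion", "Level-of-detail", "Arg2-as-detail", "in particular,"),
    ("Expansion", "Substitution", "Arg2-as-subst", "instead"),
    ("Expansion", "Substitution", "Arg2-as-subst", "rather"),
]


def sense_key(level2, level3):
    """Use the Level 3 sense when the table gives one, else Level 2."""
    return f"{level2}.{level3}" if level3 else level2


def group_by_sense(rows):
    """Map each sense key to the connectives that share it."""
    groups = defaultdict(list)
    for _level1, level2, level3, connective in rows:
        groups[sense_key(level2, level3)].append(connective)
    return dict(groups)


GROUPS = group_by_sense(ROWS)
for sense, connectives in GROUPS.items():
    print(f"{sense}: {connectives}")
```

Grouping this way merges connectives that signal the same sense, for example "then," and "finally," under Asynchronous.Precedence, which is the general idea behind reducing a connective-level label set to a smaller relation-level one.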
Training
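The model was trained using AutoModelForSequenceClassification.from_pretrained with the following training arguments:
```python
from transformers import TrainingArguments

# output_dir is defined elsewhere in the training script.
training_args = TrainingArguments(
    output_dir=output_dir,
    save_strategy="epoch",          # checkpoint once per epoch
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    warmup_steps=0,
    weight_decay=0.01,
    logging_dir="./logs",
    evaluation_strategy="epoch",    # must match save_strategy for load_best_model_at_end
    learning_rate=2e-5,
    metric_for_best_model="f1",     # expects an "f1" key from compute_metrics
    load_best_model_at_end=True,
)
```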
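Because `metric_for_best_model="f1"` is set, the `Trainer` needs a `compute_metrics` function whose result contains an `"f1"` key. The card does not show that function, so the following is a dependency-free sketch under the assumption that F1, precision, and recall are macro-averaged over the classes seen in the evaluation data; the helper names are illustrative.

```python
def argmax(row):
    """Index of the largest value in a sequence of scores."""
    return max(range(len(row)), key=lambda i: row[i])


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return {
        "accuracy": correct / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def compute_metrics(eval_pred):
    """Adapter for Trainer: unpack (logits, label_ids) and score them."""
    logits, label_ids = eval_pred
    y_pred = [argmax(row) for row in logits]
    y_true = [int(label) for label in label_ids]
    return macro_scores(y_true, y_pred)
```

Passing this as `Trainer(..., compute_metrics=compute_metrics)` makes the evaluation loop report the value as `eval_f1`, which is the name `metric_for_best_model="f1"` resolves to.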
Evaluation on the test split of the dataset, for both the 14-label and the original 20-label classification, gives:
| Setting | Macro F1 | Accuracy | Precision | Recall |
|---|---|---|---|---|
| 14-label classification | 0.586 | 0.589 | 0.630 | 0.591 |
| 20-label classification | 0.478 | 0.488 | 0.536 | 0.488 |