BERT Fine-Tuned on Winograd NLI

A fine-tuned BERT model using the Winograd NLI dataset.

Model Details

Description

This model is based on the BERT base (uncased) architecture and has been fine-tuned on the Winograd NLI dataset.

Seed Initializations

Alternative models trained using different initialization seeds are available and can be accessed using specific branches:

| Random Seed | Branch |
|---|---|
| 120 | seed-120 |
| 220 | seed-220 |
| 320 | seed-320 |
| 420 | seed-420 |
| 520 | seed-520 |

To load a model from a specific branch, use the revision parameter:

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "cglez/bert-base-uncased-ft-winograd_nli", revision="seed-120"
)
```
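To compare all published variants, the branch names can be generated from the seed list above. The sketch below is illustrative: it assumes the `seed-<N>` branch naming from the table, and actually loading all five checkpoints downloads each one in full.

```python
# Seeds published on this card; branches follow the "seed-<N>" pattern.
SEEDS = [120, 220, 320, 420, 520]

def seed_revision(seed: int) -> str:
    # Map a seed value to its branch name, e.g. 120 -> "seed-120".
    return f"seed-{seed}"

if __name__ == "__main__":
    from transformers import AutoModelForSequenceClassification

    for seed in SEEDS:
        # Each call fetches the checkpoint stored on that branch.
        model = AutoModelForSequenceClassification.from_pretrained(
            "cglez/bert-base-uncased-ft-winograd_nli",
            revision=seed_revision(seed),
        )
```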

Sources

[Information pending]

Training Details

Fine-tuning was performed end-to-end with a grid search over key hyperparameters. Each configuration was scored by its validation loss on the development set, and the best configuration was then retrained on the full training set.

Training Data

The model was trained on the Winograd NLI training split. Validation was performed using a random 20% subset of this training data. The original validation split was used as the test set, since the original test split does not include labels.
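The split scheme above can be sketched with the `datasets` library. This is a hypothetical reconstruction, not the exact training script: it assumes the Winograd NLI data is the GLUE `wnli` configuration, and the hold-out seed is arbitrary.

```python
def split_sizes(n_train: int, val_fraction: float = 0.2) -> tuple[int, int]:
    # Hold out a fraction of the training split for validation.
    n_val = int(n_train * val_fraction)
    return n_train - n_val, n_val

if __name__ == "__main__":
    from datasets import load_dataset

    # Assumption: Winograd NLI here is the GLUE "wnli" configuration.
    wnli = load_dataset("glue", "wnli")

    # 80/20 hold-out from the training split for validation.
    split = wnli["train"].train_test_split(test_size=0.2, seed=42)
    train_set, dev_set = split["train"], split["test"]

    # The original validation split serves as the test set,
    # because the original test split is unlabeled.
    test_set = wnli["validation"]
```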

Training Hyperparameters

  • Epochs: 1-4
  • Batch size: {16, 32}
  • Learning rate: {5e-5, 3e-5, 2e-5}
  • Validation metric: loss
  • Precision: fp16
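The search space above can be enumerated explicitly; a minimal sketch (the `Trainer`-style parameter names are illustrative):

```python
from itertools import product

# Grid values listed in the Training Hyperparameters section.
EPOCHS = [1, 2, 3, 4]
BATCH_SIZES = [16, 32]
LEARNING_RATES = [5e-5, 3e-5, 2e-5]

def hyperparameter_grid():
    # Every combination searched: 4 x 2 x 3 = 24 configurations.
    return [
        {
            "num_train_epochs": e,
            "per_device_train_batch_size": b,
            "learning_rate": lr,
        }
        for e, b, lr in product(EPOCHS, BATCH_SIZES, LEARNING_RATES)
    ]
```

Each configuration would be trained and scored by validation loss, and the argmin retrained on the full training set.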

Uses

This model can be used for classification tasks aligned with the structure and intent of the Winograd NLI dataset.
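A minimal inference sketch follows. The label mapping is an assumption based on the usual GLUE WNLI convention (0 = not_entailment, 1 = entailment) and the example sentence pair is hypothetical.

```python
MODEL_ID = "cglez/bert-base-uncased-ft-winograd_nli"

# Assumed label mapping, following the GLUE WNLI convention.
ID2LABEL = {0: "not_entailment", 1: "entailment"}

def predict(premise: str, hypothesis: str, model, tokenizer) -> str:
    import torch

    # Encode the sentence pair and take the argmax class.
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return ID2LABEL[int(logits.argmax(dim=-1))]

if __name__ == "__main__":
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID).eval()
    print(predict(
        "The trophy doesn't fit into the brown suitcase because it is too large.",
        "The trophy is too large.",
        model, tokenizer,
    ))
```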

For broader guidance, refer to the BERT base model's Intended Uses & Limitations section.

Bias, Risks, and Limitations

This model inherits the potential risks and limitations of its base model. For more details, refer to the Limitations and bias section of the original model documentation.

Additionally, it may reflect or amplify patterns and biases present in the Winograd NLI training data.

Hardware

  • Hardware Type: NVIDIA Tesla V100 PCIE 32GB
  • Cluster Provider: Artemisa
  • Compute Region: EU

Citation

If you use this model in your research, please cite both the base BERT model and the Winograd NLI source.

Model size: 0.1B parameters · Tensor type: F32 (Safetensors)

Model: cglez/bert-base-uncased-ft-winograd_nli, fine-tuned from bert-base-uncased.

Dataset: Winograd NLI.