library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
  - generated_from_keras_callback
model-index:
  - name: Mhammad2023/snli-bert-base-uncased
    results: []

SNLI BERT Base Uncased

Mhammad2023/snli-bert-base-uncased

This model is a fine-tuned version of bert-base-uncased on the Stanford Natural Language Inference (SNLI) dataset. After the final training epoch (epoch 2, counting from 0), it achieves the following results on the training and validation sets:

  • Train Loss: 0.3830
  • Train Accuracy: 0.8599
  • Validation Loss: 0.3341
  • Validation Accuracy: 0.8746
  • Epoch: 2

Model Details

  • Architecture: BERT-base (uncased)
  • Dataset: Stanford Natural Language Inference (SNLI)
  • Labels:
    • entailment (0)
    • neutral (1)
    • contradiction (2)
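  • Framework: TensorFlow/Keras for training (see Framework versions); the usage example loads the model through the PyTorch classes in Transformers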
  • Tokenizer: bert-base-uncased

Intended Use

This model can be used for:

  • Textual entailment tasks
  • Sentence pair classification
  • Natural language understanding tasks requiring inference

Limitations and Biases

The model inherits any biases present in the original SNLI dataset.

It may not generalize well to domains or sentence pairs that are significantly different from the SNLI training data.

Performance may degrade on noisy or complex linguistic inputs.

Training Data

The model is fine-tuned on the Stanford Natural Language Inference (SNLI) dataset:

SNLI Dataset: https://huggingface.co/datasets/stanfordnlp/snli
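
As a minimal sketch of how the data could be prepared, the snippet below loads one SNLI split from the Hub and drops pairs that have no gold label (SNLI marks these with label -1). The helper names `has_gold_label` and `load_snli_split` are hypothetical, and whether this filtering step was part of the original training run is an assumption, not something this card states.

```python
# Label ids as listed under Model Details.
LABEL_NAMES = {0: "entailment", 1: "neutral", 2: "contradiction"}


def has_gold_label(example: dict) -> bool:
    """Return True if the example carries one of the three gold labels.

    SNLI uses label -1 for pairs where annotators reached no consensus.
    """
    return example["label"] in LABEL_NAMES


def load_snli_split(split: str = "validation"):
    """Load an SNLI split and keep only pairs with a gold label.

    Hypothetical helper: `datasets` is imported lazily so that
    `has_gold_label` can be used without the library installed.
    """
    from datasets import load_dataset

    dataset = load_dataset("stanfordnlp/snli", split=split)
    return dataset.filter(has_gold_label)


if __name__ == "__main__":
    # Downloads the dataset on first use.
    validation = load_snli_split("validation")
    print(validation[0]["premise"], "|", validation[0]["hypothesis"])
```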

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • optimizer: AdamWeightDecay
    • learning_rate: WarmUp schedule (transformers.optimization_tf)
      • initial_learning_rate: 5e-05
      • warmup_steps: 1500
      • power: 1.0
      • decay_schedule_fn: PolynomialDecay (keras.optimizers.schedules)
        • initial_learning_rate: 5e-05
        • decay_steps: 13500
        • end_learning_rate: 0.0
        • power: 1.0
        • cycle: False
    • decay: 0.0
    • beta_1: 0.9
    • beta_2: 0.999
    • epsilon: 1e-08
    • amsgrad: False
    • weight_decay_rate: 0.01
  • training_precision: float32
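
To make the schedule easier to picture, the sketch below restates it in plain Python: a linear warmup to 5e-05 over the first 1500 steps, followed by a polynomial decay with power 1.0 (that is, linear) to 0.0 over the next 13500 steps. The function name `learning_rate_at` is hypothetical. This is an illustrative re-statement of the configuration above, not the Transformers or Keras implementation itself.

```python
def learning_rate_at(
    step: int,
    initial_lr: float = 5e-05,
    warmup_steps: int = 1500,
    warmup_power: float = 1.0,
    decay_steps: int = 13500,
    end_lr: float = 0.0,
    decay_power: float = 1.0,
) -> float:
    """Approximate learning rate at a given optimizer step.

    Default values are copied from the optimizer configuration above.
    """
    if step < warmup_steps:
        # Warmup phase: scale the initial rate by the fraction of warmup done.
        return initial_lr * (step / warmup_steps) ** warmup_power

    # Decay phase: the decay schedule sees steps counted from the end of warmup,
    # clamped so the rate stays at end_lr once decay_steps is reached.
    decay_step = min(step - warmup_steps, decay_steps)
    remaining = 1.0 - decay_step / decay_steps
    return (initial_lr - end_lr) * remaining ** decay_power + end_lr


if __name__ == "__main__":
    for step in (0, 750, 1500, 8250, 15000):
        print(step, learning_rate_at(step))
```

With these settings the rate starts at 0, peaks at 5e-05 at step 1500, and returns to 0 at step 15000 (1500 warmup steps plus 13500 decay steps).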

Training results

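| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|------------|----------------|-----------------|---------------------|-------|
| 0.6102     | 0.7444         | 0.4452          | 0.8295              | 0     |
| 0.4504     | 0.8280         | 0.3723          | 0.8600              | 1     |
| 0.3830     | 0.8599         | 0.3341          | 0.8746              | 2     |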

Framework versions

  • Transformers 4.52.2
  • TensorFlow 2.18.0
  • Datasets 3.6.0
  • Tokenizers 0.21.1

How to Use

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

model_id = "Mhammad2023/snli-bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# If the repository only provides TensorFlow weights, passing from_tf=True
# here may be required (this needs TensorFlow installed).
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # disable dropout for inference

premise = "A man inspects the uniform of a figure in some East Asian country."
hypothesis = "The man is sleeping."

# Encode the premise and hypothesis together as one sentence pair.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():  # gradients are not needed at inference time
    outputs = model(**inputs)
probs = F.softmax(outputs.logits, dim=1)
predicted_class = torch.argmax(probs, dim=1).item()

label_map = {0: "entailment", 1: "neutral", 2: "contradiction"}
print(f"Prediction: {label_map[predicted_class]} with confidence {probs[0][predicted_class].item():.4f}")
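
For more than one sentence pair, the same steps can be batched. The following is a rough sketch under a few assumptions: the helper names `softmax`, `decode_logits`, and `classify_pairs` are hypothetical, padding and truncation settings are illustrative choices, and the label mapping is the one listed under Model Details. The decoding step is written in plain Python so it can be checked without downloading the model.

```python
import math

LABEL_MAP = {0: "entailment", 1: "neutral", 2: "contradiction"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def decode_logits(logit_rows):
    """Turn rows of 3 logits into (label, probability) tuples."""
    results = []
    for row in logit_rows:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((LABEL_MAP[best], probs[best]))
    return results


def classify_pairs(pairs, model_id="Mhammad2023/snli-bert-base-uncased"):
    """Classify a list of (premise, hypothesis) tuples in one batch.

    Heavy dependencies are imported lazily so the helpers above stay usable
    without torch or transformers installed.
    """
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    premises = [premise for premise, _ in pairs]
    hypotheses = [hypothesis for _, hypothesis in pairs]
    # Pad to the longest pair in the batch; truncate pairs beyond the model limit.
    inputs = tokenizer(premises, hypotheses, padding=True, truncation=True,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return decode_logits(logits.tolist())


if __name__ == "__main__":
    examples = [
        ("A man inspects the uniform of a figure in some East Asian country.",
         "The man is sleeping."),
    ]
    for label, confidence in classify_pairs(examples):
        print(f"{label} ({confidence:.4f})")
```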