---
library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
- generated_from_keras_callback
model-index:
- name: Mhammad2023/snli-bert-base-uncased
  results: []
---

# SNLI BERT Base Uncased

<!-- This model is a fine-tuned **BERT-base-uncased** transformer on the **Stanford Natural Language Inference (SNLI)** dataset. It performs **natural language inference** (NLI), also known as recognizing textual entailment (RTE), which involves classifying whether a *hypothesis* sentence is entailed by, contradicts, or is neutral with respect to a *premise* sentence. -->

# Mhammad2023/snli-bert-base-uncased

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the **Stanford Natural Language Inference (SNLI)** dataset.
It achieves the following results on the training and evaluation sets after the final epoch:

- Train Loss: 0.3830
- Train Accuracy: 0.8599
- Validation Loss: 0.3341
- Validation Accuracy: 0.8746
- Epoch: 2 (zero-indexed, so the third and final epoch)

## Model Details

- **Architecture:** BERT-base (uncased)
- **Dataset:** Stanford Natural Language Inference (SNLI)
- **Labels:**
  - `entailment` (0)
  - `neutral` (1)
  - `contradiction` (2)
- **Framework:** TensorFlow/Keras (training); the usage example in this card loads the checkpoint with PyTorch
- **Tokenizer:** `bert-base-uncased`

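To make the label ids above concrete, here is a minimal sketch of turning one 3-way logit vector into a label name and a probability. It uses only the standard library; the helper name `decode_logits` and the example logits are hypothetical and were not produced by this model.

```python
import math

# Label ids as listed in this card.
ID2LABEL = {0: "entailment", 1: "neutral", 2: "contradiction"}


def decode_logits(logits):
    """Return (label, probability) for one 3-way logit vector."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the highest probability (first index wins on ties).
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for illustration only.
label, prob = decode_logits([0.2, 0.1, 3.0])
print(label, round(prob, 4))
```

The same mapping applies to the logits returned by the model, one row per premise/hypothesis pair.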
## Intended Use

This model can be used for:

- Textual entailment tasks
- Sentence pair classification
- Natural language understanding tasks requiring inference

## Limitations and Biases

- The model inherits any biases present in the original SNLI dataset.
- It may not generalize well to domains or sentence pairs that differ significantly from the SNLI training data.
- Performance may degrade on noisy or linguistically complex inputs.

## Training Data

The model is fine-tuned on the [Stanford Natural Language Inference (SNLI) dataset](https://huggingface.co/datasets/stanfordnlp/snli).

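SNLI includes some pairs without a gold label, which the Hugging Face version marks with `label == -1`. A common preprocessing step is to drop those rows before training a three-class classifier. This card does not state whether that was done for this model, so the sketch below is an assumption. The helper name `has_gold_label` and the toy rows are made up for illustration.

```python
def has_gold_label(example):
    """Keep only examples whose label is one of the three NLI classes."""
    return example["label"] in (0, 1, 2)


# Toy rows mimicking the dataset's premise/hypothesis/label columns.
rows = [
    {"premise": "A dog runs.", "hypothesis": "An animal moves.", "label": 0},
    {"premise": "A dog runs.", "hypothesis": "The dog is brown.", "label": -1},
    {"premise": "A dog runs.", "hypothesis": "The dog sleeps.", "label": 2},
]

# Drop the row without a gold label.
kept = [row for row in rows if has_gold_label(row)]
print(len(kept))
```

With the `datasets` library, the same predicate can be passed to `filter`, for example `load_dataset("stanfordnlp/snli").filter(has_gold_label)`.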
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'module': 'transformers.optimization_tf', 'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-05, 'decay_schedule_fn': {'module': 'keras.optimizers.schedules', 'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-05, 'decay_steps': 13500, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'registered_name': None}, 'warmup_steps': 1500, 'power': 1.0, 'name': None}, 'registered_name': 'WarmUp'}, 'decay': 0.0, 'beta_1': np.float32(0.9), 'beta_2': np.float32(0.999), 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32

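The optimizer entry above describes a learning-rate schedule with linear warmup followed by polynomial decay with power 1.0, which is linear. The sketch below is a plain-Python reading of that configuration, not the training code itself. It assumes the usual behavior of `WarmUp` wrapping `PolynomialDecay`, where the decay step is counted from the end of warmup. The function name `learning_rate_at` is hypothetical.

```python
# Values copied from the optimizer configuration listed in this card.
INITIAL_LR = 5e-5     # initial_learning_rate
WARMUP_STEPS = 1500   # warmup_steps
DECAY_STEPS = 13500   # decay_steps
END_LR = 0.0          # end_learning_rate
POWER = 1.0           # power (1.0 means linear)


def learning_rate_at(step):
    """Learning rate at a given optimizer step under the listed schedule."""
    if step < WARMUP_STEPS:
        # Warmup: ramp from 0 up to the initial learning rate.
        return INITIAL_LR * (step / WARMUP_STEPS) ** POWER
    # Decay: measured from the end of warmup and clamped at DECAY_STEPS.
    decay_step = min(step - WARMUP_STEPS, DECAY_STEPS)
    remaining = 1.0 - decay_step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** POWER + END_LR


# Print the rate at a few points along the schedule.
for step in (0, 750, 1500, 8250, 15000):
    print(step, learning_rate_at(step))
```

Under this reading the rate peaks at 5e-05 when warmup ends at step 1500 and reaches 0.0 at step 15000 (1500 warmup steps plus 13500 decay steps).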
### Training results

| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
|:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
| 0.6102     | 0.7444         | 0.4452          | 0.8295              | 0     |
| 0.4504     | 0.8280         | 0.3723          | 0.8600              | 1     |
| 0.3830     | 0.8599         | 0.3341          | 0.8746              | 2     |

### Framework versions

- Transformers 4.52.2
- TensorFlow 2.18.0
- Datasets 3.6.0
- Tokenizers 0.21.1

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

model_id = "Mhammad2023/snli-bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Assumption: if the repository only contains TensorFlow weights, pass from_tf=True here.
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # disable dropout for inference

premise = "A man inspects the uniform of a figure in some East Asian country."
hypothesis = "The man is sleeping."

# Encode the premise and hypothesis together as one sentence pair.
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():  # gradients are not needed at inference time
    outputs = model(**inputs)
probs = F.softmax(outputs.logits, dim=-1)
predicted_class = torch.argmax(probs, dim=-1).item()

# Label ids as listed under Model Details.
label_map = {0: "entailment", 1: "neutral", 2: "contradiction"}
print(f"Prediction: {label_map[predicted_class]} with confidence {probs[0][predicted_class].item():.4f}")
```