	---
	datasets:
	- pietrolesci/gpt3_nli
	language:
	- en
	metrics:
	- accuracy
	base_model:
	- distilbert/distilbert-base-uncased
	library_name: transformers
	tags:
	- nli
	- textclassification
	- distilbert
	pipeline_tag: text-classification
	---
	# Model Card for distilbert_nli

	This is a Natural Language Inference (NLI) model built by fine-tuning DistilBERT-base-uncased on the GPT-3 NLI dataset (pietrolesci/gpt3_nli). The model performs textual entailment classification: given two pieces of text (a premise and a hypothesis), it determines the logical relationship between them.


	## Model Details

	### Model Description

	What it does:

	- Takes two text inputs: a premise (text_a) and a hypothesis (text_b)

	- Classifies their relationship into one of three categories:

	  - Entailment: The hypothesis logically follows from the premise

	  - Neutral: The hypothesis is neither supported nor contradicted by the premise

	  - Contradiction: The hypothesis contradicts the premise

	Use Cases:

	- Reading comprehension tasks

	- Logical reasoning applications

	- Question-answering systems

	- Text coherence analysis

	- Information verification tasks

	Architecture: DistilBERT encoder with a sequence classification head that outputs 3 classes. DistilBERT is a distilled, smaller variant of BERT, which keeps the model compact and inference efficient for natural language understanding tasks.

	This type of model is fundamental for applications requiring understanding of logical relationships between text passages, such as fact-checking, automated reasoning, and reading comprehension systems.


	## How to Get Started with the Model

	``` python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	# Load the model and tokenizer
	model_name = "gulupgulup/distilbert_nli"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForSequenceClassification.from_pretrained(model_name)
	```

	### Usage Example

	``` python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	# Load the model and tokenizer
	model_name = "gulupgulup/distilbert_nli"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForSequenceClassification.from_pretrained(model_name)

	# Example premise and hypothesis
	premise = "A person is riding a bicycle in the park."
	hypothesis = "Someone is exercising outdoors."

	# Tokenize the input
	inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True, padding=True)

	# Make prediction
	with torch.no_grad():
	    outputs = model(**inputs)
	    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
	    predicted_class = torch.argmax(predictions, dim=-1)

	# Get the predicted label
	id2label = {0: "entailment", 1: "neutral", 2: "contradiction"}
	predicted_label = id2label[predicted_class.item()]

	print(f"Premise: {premise}")
	print(f"Hypothesis: {hypothesis}")
	print(f"Predicted relationship: {predicted_label}")
	print(f"Confidence scores: {predictions.squeeze().tolist()}")
	```
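
The post-processing at the end of the example (softmax over the logits, argmax, label lookup) can be illustrated without loading the model. The following is a minimal sketch in plain Python: the `interpret_logits` helper and the example logits are hypothetical, and the label order is simply the one hard-coded in the example above. It may be worth comparing that mapping against `model.config.id2label` for the loaded checkpoint.

```python
import math

# Label order as hard-coded in the usage example (assumed to match the checkpoint).
ID2LABEL = {0: "entailment", 1: "neutral", 2: "contradiction"}


def interpret_logits(logits, id2label=ID2LABEL):
    """Turn raw class logits into (label, probabilities) via a softmax."""
    # Subtract the max logit for numerical stability before exponentiating.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The index of the largest probability is the predicted class id.
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs


# Hypothetical logits for a single premise-hypothesis pair.
label, probs = interpret_logits([2.0, 0.5, -1.0])
print(label, [round(p, 3) for p in probs])
```

The probabilities always sum to 1, so the largest value can be read as the model's confidence in the predicted label.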

	## Training Details

	### Training Data

	Dataset: [pietrolesci/gpt3_nli](https://huggingface.co/datasets/pietrolesci/gpt3_nli), a natural language inference dataset containing premise-hypothesis pairs with three-class labels (entailment, neutral, contradiction). Each example is a text pair (text_a and text_b), and the model learns to determine the logical relationship between the premise and the hypothesis.


	### Training Procedure

	Base Model: DistilBERT-base-uncased fine-tuned for sequence classification with 3 output labels for natural language inference.

	Training Framework: Hugging Face Transformers Trainer with Weights & Biases (wandb) integration for experiment tracking.

	Data Split: The original training set was split into train (81%), validation (9%), and test (10%) sets using stratified sampling to maintain label distribution balance across splits.
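
The 81/9/10 proportions are what results from holding out 10% of the data for testing and then 10% of the remaining 90% for validation. The sketch below assumes that two-step procedure, which this card does not state explicitly. The `stratified_split` helper, the seeds, and the toy labels are hypothetical and only illustrate how stratification keeps the label balance in every split.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, held_fraction, seed=0):
    """Return (kept_idx, held_idx), holding out `held_fraction` of each label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    kept, held = [], []
    for idxs in by_label.values():
        rng.shuffle(idxs)
        n_held = round(len(idxs) * held_fraction)
        held.extend(idxs[:n_held])
        kept.extend(idxs[n_held:])
    return sorted(kept), sorted(held)


# Toy data: 1000 examples per class (not the real dataset).
labels = ["entailment"] * 1000 + ["neutral"] * 1000 + ["contradiction"] * 1000

# Step 1: hold out 10% of all examples as the test set.
rest_idx, test_idx = stratified_split(labels, 0.10, seed=0)

# Step 2: hold out 10% of the remaining 90% as validation (9% overall).
rest_labels = [labels[i] for i in rest_idx]
train_pos, val_pos = stratified_split(rest_labels, 0.10, seed=1)
train_idx = [rest_idx[i] for i in train_pos]
val_idx = [rest_idx[i] for i in val_pos]

print(len(train_idx), len(val_idx), len(test_idx))
print(Counter(labels[i] for i in val_idx))
```

With Hugging Face `datasets`, a comparable result is usually obtained by calling `train_test_split` twice with `stratify_by_column` set to the label column.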


	#### Preprocessing

	Text pairs are tokenized using DistilBERT's tokenizer with truncation and padding applied. The label column is cast to ClassLabel format with three categories: entailment, neutral, and contradiction.

	Data Handling: Uses DataCollatorWithPadding for dynamic padding during training and tokenizes premise-hypothesis pairs jointly.
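
Dynamic padding means each batch is padded only to the length of its own longest sequence rather than to a fixed global length. The sketch below illustrates the idea in plain Python. It is not the `DataCollatorWithPadding` implementation, `pad_batch` is a hypothetical helper, and the token ids other than `[PAD]`, `[CLS]`, and `[SEP]` are made up.

```python
PAD_TOKEN_ID = 0  # [PAD] id in the distilbert-base-uncased vocabulary


def pad_batch(batch_input_ids, pad_token_id=PAD_TOKEN_ID):
    """Pad every sequence to the longest one in this batch and build attention masks."""
    max_len = max(len(ids) for ids in batch_input_ids)
    input_ids, attention_mask = [], []
    for ids in batch_input_ids:
        n_pad = max_len - len(ids)
        input_ids.append(ids + [pad_token_id] * n_pad)
        # 1 marks real tokens, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Toy token ids shaped like [CLS] premise [SEP] hypothesis [SEP]
# (101 = [CLS], 102 = [SEP]; the remaining ids are placeholders).
batch = [
    [101, 2001, 2002, 102, 3001, 102],
    [101, 2001, 102, 3001, 3002, 3003, 3004, 102],
]
padded = pad_batch(batch)
for ids, mask in zip(padded["input_ids"], padded["attention_mask"]):
    print(ids, mask)
```

Because the padded length follows the longest pair in each batch, batches of short pairs cost less compute than they would under fixed-length padding.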


	#### Training Hyperparameters

	- Learning Rate: 1e-5

	- Batch Size: 64 (both training and evaluation)

	- Number of Epochs: 5

	- Weight Decay: 0.01

	- Max Gradient Norm: 1.0

	- Optimizer: AdamW (default)

	- Evaluation Strategy: Every epoch

	- Save Strategy: Every epoch

	- Logging Steps: 100

	- Best Model Selection: Based on validation accuracy (higher is better)
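
For reference, the settings above could be expressed as keyword arguments for `transformers.TrainingArguments`. This is a sketch rather than the original training script. It assumes the batch size of 64 is per device, that `load_best_model_at_end` was enabled for best-model selection, and that Weights & Biases logging was configured through `report_to`. The optimizer is left at the library default.

```python
# Keys follow `transformers.TrainingArguments` parameter names.
training_kwargs = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 64,  # assumption: 64 is the per-device size
    "per_device_eval_batch_size": 64,
    "num_train_epochs": 5,
    "weight_decay": 0.01,
    "max_grad_norm": 1.0,
    "eval_strategy": "epoch",  # named `evaluation_strategy` in older releases
    "save_strategy": "epoch",
    "logging_steps": 100,
    "load_best_model_at_end": True,  # assumption implied by best-model selection
    "metric_for_best_model": "accuracy",
    "greater_is_better": True,
    "report_to": "wandb",  # assumption: wandb logging enabled this way
}

# With transformers installed, these would be passed as:
#   TrainingArguments(output_dir="distilbert_nli", **training_kwargs)
# where "distilbert_nli" is a placeholder output directory.
for name, value in training_kwargs.items():
    print(f"{name} = {value!r}")
```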

	## Evaluation

	### Metrics

	Accuracy: Primary evaluation metric measuring the percentage of correctly classified premise-hypothesis pairs across all three NLI categories.

	Precision (Macro-averaged): Secondary metric calculating the average precision across all three classes (entailment, neutral, contradiction), giving equal weight to each class regardless of support. This metric is useful for understanding model performance on each NLI relationship type, especially important when dealing with potentially imbalanced class distributions.

	Both metrics are computed using the evaluate library and rounded to 3 decimal places for reporting.
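
To make the two metrics concrete, the following plain-Python sketch computes accuracy and macro-averaged precision on a toy set of predictions. It illustrates the definitions only and is not the `evaluate` code used for this model. The helper names and the toy labels are hypothetical, and scoring a never-predicted class as 0 is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_precision(y_true, y_pred, labels=(0, 1, 2)):
    """Unweighted mean of per-class precision over `labels`."""
    per_class = []
    for c in labels:
        # True labels of all examples that were predicted as class c.
        predicted_c = [t for t, p in zip(y_true, y_pred) if p == c]
        if not predicted_c:
            per_class.append(0.0)  # assumption: class never predicted scores 0
        else:
            per_class.append(sum(t == c for t in predicted_c) / len(predicted_c))
    return sum(per_class) / len(per_class)


# Toy labels: 0 = entailment, 1 = neutral, 2 = contradiction.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

metrics = {
    "accuracy": round(accuracy(y_true, y_pred), 3),
    "precision": round(macro_precision(y_true, y_pred), 3),
}
print(metrics)  # {'accuracy': 0.667, 'precision': 0.722}
```

Here the per-class precisions are 1/2, 2/3, and 1, so the macro average (0.722) differs from the accuracy (0.667) even though the classes are balanced.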