	---
	library_name: transformers
	license: apache-2.0
	base_model: distilbert/distilbert-base-cased
	tags:
	- question-answering
	- squadv2
	- distilbert
	- en
	- transformer
	- pytorch
	model-index:
	- name: a-question_answerer
	  results: []
	language:
	- en
	---

	# a-question_answerer

	This model is a fine-tuned version of the DistilBERT Base Cased model on the SQuAD v2 dataset for the Question Answering task.

	It was trained as part of a Google Colab project aimed at adapting a pre-trained language model to answer questions based on a given text context.

	The base checkpoint is [distilbert/distilbert-base-cased](https://huggingface.co/distilbert/distilbert-base-cased). The fine-tuned model achieves the following results on the evaluation set:
	- Loss: 1.3622 (lowest validation loss, reached after epoch 2)

	## Model description

	This model is intended for use in Question Answering tasks, where the goal is to extract a concise answer span from a provided text context given a natural language question. It can handle both answerable and unanswerable questions as per the SQuAD v2 dataset format.
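To make the answerable/unanswerable decision concrete, here is a minimal sketch of one common way to turn start and end logits into either an answer span or a "no answer" prediction. It assumes the usual SQuAD v2 convention that index 0 is the `[CLS]` token and that its score stands for "no answer". The function name `select_span` and the parameters `max_answer_len` and `null_threshold` are illustrative; the card does not state which post-processing this model actually used.

```python
from typing import List, Optional, Tuple


def select_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
    null_threshold: float = 0.0,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices of the best span, or None for "no answer".

    Index 0 is assumed to be the [CLS] token, whose start + end score acts as
    the "no answer" score.
    """
    null_score = start_logits[0] + end_logits[0]

    best_score = float("-inf")
    best_span = None
    # Skip index 0 so the [CLS] position is never returned as an answer.
    for start in range(1, len(start_logits)):
        last_end = min(start + max_answer_len, len(end_logits))
        for end in range(start, last_end):
            score = start_logits[start] + end_logits[end]
            if score > best_score:
                best_score = score
                best_span = (start, end)

    # Predict "no answer" when the null score beats the best span by more
    # than the threshold.
    if best_span is None or null_score - best_score > null_threshold:
        return None
    return best_span


# Toy logits: the first example has a clear span, the second favors [CLS].
answerable = select_span([0.1, 0.2, 5.0, 0.3], [0.1, 0.1, 0.4, 4.0])
unanswerable = select_span([6.0, 0.2, 1.0, 0.3], [6.0, 0.1, 0.4, 1.0])
print(answerable, unanswerable)
```

Raising `null_threshold` makes the sketch less willing to abstain, which trades NoAns accuracy for HasAns recall.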

	## Intended uses & limitations

	Potential use cases include:

	- Building a simple document Q-A system.
	- Enhancing search functionalities to provide direct answers.

	As with any model trained on a specific dataset, this model's performance is influenced by the characteristics and potential biases present in the SQuAD v2 dataset.
	It may perform differently on text from domains significantly different from Wikipedia articles (the source of SQuAD).
	The model may also inherit biases from the original DistilBERT Base Cased model.

	The model's performance on identifying and answering questions depends heavily on the quality and relevance of the provided context.

	## Training and evaluation data

	The model was fine-tuned on the SQuAD v2 dataset, which contains over 130,000 question-answer pairs derived from Wikipedia articles.
	The dataset includes questions that are unanswerable, requiring the model to determine if no answer exists within the provided text.

	For the final reported results, the model was trained on the full SQuAD v2 training dataset.

	## Training procedure

	The model was fine-tuned using the Hugging Face transformers library and Trainer API.
	The training process involved tokenizing the dataset, preparing input features with start and end positions for answers, and using DataCollatorWithPadding.
	With load_best_model_at_end=True, the checkpoint with the lowest validation loss was reloaded at the end of training; the results table shows that all 3 epochs were run.
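The step of preparing start and end positions can be sketched as follows. This is an illustration under assumptions, not the notebook's actual preprocessing: the helper name `char_span_to_token_span` is hypothetical, and the offsets are assumed to cover only context tokens, with question and special tokens masked as `(0, 0)`. With a fast tokenizer, such offsets would typically come from `return_offsets_mapping=True`.

```python
from typing import List, Tuple


def char_span_to_token_span(
    offsets: List[Tuple[int, int]],
    answer_start: int,
    answer_text: str,
) -> Tuple[int, int]:
    """Map a character-level answer to (start_token, end_token) indices.

    `offsets` holds one (char_start, char_end) pair per token, with (0, 0)
    for tokens that are not part of the context. Returns (0, 0) when there is
    no answer or the answer is not fully covered, so the label points at [CLS].
    """
    if not answer_text:
        return (0, 0)  # unanswerable question

    answer_end = answer_start + len(answer_text)
    start_token = end_token = None
    for idx, (tok_start, tok_end) in enumerate(offsets):
        if tok_start == tok_end:
            continue  # masked token, carries no context text
        if start_token is None and tok_start <= answer_start < tok_end:
            start_token = idx
        if tok_start < answer_end <= tok_end:
            end_token = idx

    if start_token is None or end_token is None:
        return (0, 0)  # answer truncated out of this window
    return (start_token, end_token)


context = "Paris is the capital of France."
# Hand-written word-level offsets for illustration; a real tokenizer would
# produce sub-word offsets.
offsets = [(0, 0), (0, 5), (6, 8), (9, 12), (13, 20), (21, 23), (24, 30), (30, 31), (0, 0)]
france_span = char_span_to_token_span(offsets, context.index("France"), "France")
no_answer_span = char_span_to_token_span(offsets, 0, "")
print(france_span, no_answer_span)
```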

	Training Arguments:

	- Learning Rate: 2e-5
	- Per Device Train Batch Size: 4
	- Per Device Eval Batch Size: 4
	- Number of Epochs: 3
	- Weight Decay: 0.1
	- Evaluation Strategy: epoch
	- Save Strategy: epoch
	- Best-checkpoint selection: Enabled (load_best_model_at_end=True, metric_for_best_model="eval_loss")
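For readers who want to reproduce a similar run, the settings listed above can be collected as keyword arguments. The sketch below keeps them in a plain dictionary so it runs without any dependencies; the keys are `transformers.TrainingArguments` parameter names, and the `output_dir` value is a hypothetical placeholder rather than the path used in the original notebook.

```python
# Keyword arguments mirroring the settings listed above. They are intended to
# be passed as transformers.TrainingArguments(**training_kwargs).
training_kwargs = {
    "output_dir": "a-question_answerer",  # hypothetical output path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "num_train_epochs": 3,
    "weight_decay": 0.1,
    "eval_strategy": "epoch",
    "save_strategy": "epoch",
    # Reload the checkpoint with the lowest eval_loss once training finishes.
    "load_best_model_at_end": True,
    "metric_for_best_model": "eval_loss",
}

# load_best_model_at_end expects evaluation and saving to happen on the same
# schedule, so that every evaluated checkpoint is also saved.
strategies_match = training_kwargs["eval_strategy"] == training_kwargs["save_strategy"]
print(strategies_match)
```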

	### Training hyperparameters

	- Base Model: distilbert/distilbert-base-cased
	- Dataset: SQuAD v2
	- Best-checkpoint selection: Enabled (load_best_model_at_end=True, metric_for_best_model="eval_loss")

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- weight_decay: 0.1
	- seed: 42
	- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
	- lr_scheduler_type: linear
	- eval_strategy: epoch
	- save_strategy: epoch
	- load_best_model_at_end: True
	- metric_for_best_model: eval_loss
	- num_epochs: 3
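As a rough illustration of what `lr_scheduler_type: linear` implies, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and takes the total of 97,740 steps from the training results table. The function `linear_lr` is illustrative and is not a library call.

```python
def linear_lr(
    step: int,
    base_lr: float = 2e-5,
    total_steps: int = 97740,
    warmup_steps: int = 0,  # assumption: no warmup is listed in the card
) -> float:
    """Learning rate at `step` under linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start and at each epoch boundary.
for step in (0, 32580, 65160, 97740):
    print(step, linear_lr(step))
```

Under these assumptions the rate has fallen to about two thirds of its initial value after epoch 1 and to about one third after epoch 2.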

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss |
	|:-------------:|:-----:|:-----:|:---------------:|
	| 1.1921        | 1.0   | 32580 | 1.4150          |
	| 0.9637        | 2.0   | 65160 | 1.3622*         |
	| 0.6474        | 3.0   | 97740 | 1.8661          |

	\* Lowest validation loss; this is the checkpoint reloaded by load_best_model_at_end.
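The selection of the epoch 2 checkpoint follows directly from the table. The short sketch below picks the row with the lowest validation loss and, assuming a single device and no gradient accumulation (neither is stated in the card), derives the number of training examples per epoch from the step count.

```python
# (epoch, step, training_loss, validation_loss) rows from the table above.
history = [
    (1, 32580, 1.1921, 1.4150),
    (2, 65160, 0.9637, 1.3622),
    (3, 97740, 0.6474, 1.8661),
]

# metric_for_best_model="eval_loss" means lower is better.
best_epoch, best_step, _, best_val_loss = min(history, key=lambda row: row[3])
print(f"best checkpoint: epoch {best_epoch}, step {best_step}, val loss {best_val_loss}")

# Assumption: one device, batch size 4, no gradient accumulation.
steps_per_epoch = history[0][1]
examples_per_epoch = steps_per_epoch * 4
print("examples per epoch:", examples_per_epoch)
```

The training loss keeps falling in epoch 3 while the validation loss rises, which is the usual sign of overfitting and the reason the epoch 2 checkpoint is preferred.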

	## Evaluation Results

	The model was evaluated on 2,000 examples from the SQuAD v2 validation set. The following metrics were obtained:

	| Metric           | Overall | Answerable (HasAns) | Unanswerable (NoAns) |
	|------------------|---------|---------------------|----------------------|
	| Exact Match (EM) | 64.2    | 60.27               | 67.97                |
	| F1 Score         | 66.57   | 65.10               | 67.97                |
	| Total Examples   | 2000    | 979                 | 1021                 |

	Note: The metrics for 'Answerable' and 'Unanswerable' questions provide a more detailed view of the model's performance on each type of question in SQuAD v2.
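For reference, the sketch below shows how Exact Match and token-level F1 are conventionally computed for SQuAD-style answers, and how the overall scores relate to the per-subset scores as an example-weighted average. The normalization mirrors the standard SQuAD evaluation conventions; it is not confirmed to be the exact script used for this card, and the function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    # For unanswerable questions both sides are empty, so F1 equals EM.
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


def overall(has_ans: float, no_ans: float, n_has: int = 979, n_no: int = 1021) -> float:
    """Example-weighted average of the HasAns and NoAns scores."""
    return round((has_ans * n_has + no_ans * n_no) / (n_has + n_no), 2)


em_example = exact_match("The Eiffel Tower", "eiffel tower")
f1_example = f1_score("in Paris, France", "Paris")
overall_em = overall(60.27, 67.97)
overall_f1 = overall(65.10, 67.97)
print(em_example, f1_example, overall_em, overall_f1)
```

Because an unanswerable question has an empty reference answer, EM and F1 coincide on the NoAns subset, which is why both rows show 67.97 there.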

	### Framework versions

	- Transformers 4.55.0
	- PyTorch 2.6.0+cu124
	- Datasets 4.0.0
	- Tokenizers 0.21.4