|
|
--- |
|
|
license: apache-2.0 |
|
|
language: |
|
|
- en |
|
|
metrics: |
|
|
- exact_match |
|
|
- f1 |
|
|
base_model: |
|
|
- google-bert/bert-base-uncased |
|
|
--- |
|
|
# Fine-Tuned BERT Models for Thermoelectric Materials Question Answering |
|
|
|
|
|
## Introduction |
|
|
|
|
|
This repository contains three BERT models fine-tuned for question-answering (QA) tasks related to thermoelectric materials. The models are trained on different datasets to evaluate their performance on specialised QA tasks in the field of materials science. |
|
|
|
|
|
We present a method for auto-generating a large question-answering dataset about thermoelectric materials for language-model applications. The method was used to generate a dataset of sentence-wide contexts from a database of thermoelectric material records. This dataset was compared against SQuAD-v2, as well as against a mixed combination of the two. Hyperparameter optimisation was employed to fine-tune BERT models on each dataset, and the three best-performing models were then compared on a manually annotated test set of thermoelectric material paragraph contexts, with questions spanning material names, five different properties, and the temperatures at which measurements were recorded. The best BERT model fine-tuned on the mixed dataset outperforms the other two models on this test set, indicating that mixing datasets with different semantic and syntactic scopes can be a beneficial approach to improving performance on specialised question-answering tasks.
|
|
|
|
|
## Models Included |
|
|
|
|
|
1. **squad-v2_best** |
|
|
|
|
|
   - **Description:** Fine-tuned on the SQuAD-v2 dataset, a widely used benchmark for QA tasks.
   - **Dataset:** SQuAD-v2
   - **Location:** `squad-v2_best/`
|
|
|
|
|
2. **te-cde_best** |
|
|
|
|
|
   - **Description:** Fine-tuned on a thermoelectric-materials-specific dataset generated using our auto-generation method.
   - **Dataset:** Thermoelectric Materials QA Dataset (TE-CDE)
   - **Location:** `te-cde_best/`
|
|
|
|
|
3. **mixed_best** |
|
|
|
|
|
   - **Description:** Fine-tuned on a mixed dataset combining SQuAD-v2 and the thermoelectric materials dataset to enhance performance on specialised QA tasks.
   - **Dataset:** Combination of SQuAD-v2 and TE-CDE
   - **Location:** `mixed_best/`
|
|
|
|
|
## Dataset Details |
|
|
|
|
|
**SQuAD-v2** |
|
|
|
|
|
A reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. |
|
|
Some questions are unanswerable, adding complexity to the QA task. |
|
|
|
|
|
**Thermoelectric Materials QA Dataset (TE-CDE)** |
|
|
|
|
|
Auto-generated dataset containing QA pairs about thermoelectric materials. |
|
|
Contexts are sentence-wide excerpts from a database of thermoelectric material records. |
|
|
Questions cover:

- Material names
- Five different properties
- Temperatures during recording
|
|
|
|
|
**Mixed Dataset** |
|
|
|
|
|
A combination of SQuAD-v2 and TE-CDE datasets. |
|
|
Aims to leverage the strengths of both general-purpose and domain-specific data. |
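The exact mixing procedure is not specified here. As a hedged sketch, two SQuAD-format JSON files can be combined by concatenating their `data` arrays and shuffling; the function name and field layout below assume the standard SQuAD-v2 JSON schema:

```python
import json
import random

def mix_squad_datasets(path_a, path_b, out_path, seed=42):
    """Concatenate the `data` arrays of two SQuAD-format JSON files.

    Illustrative helper (not part of this repository); assumes both
    files follow the SQuAD-v2 layout: {"version": ..., "data": [...]}.
    """
    with open(path_a) as f:
        first = json.load(f)
    with open(path_b) as f:
        second = json.load(f)
    mixed = {"version": "mixed", "data": first["data"] + second["data"]}
    random.Random(seed).shuffle(mixed["data"])  # deterministic shuffle
    with open(out_path, "w") as f:
        json.dump(mixed, f)
    return len(mixed["data"])
```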
|
|
|
|
|
## Training Details |
|
|
|
|
|
- **Base Model:** BERT Base Uncased
- **Hyperparameter Optimisation:** Employed to find the best-performing model for each dataset.
- **Training Parameters:**
  - Epochs: adjusted per dataset based on validation loss.
  - Batch size: optimised during training.
  - Learning rate: tuned using grid search.
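The exact search space is not specified in this card. As an illustrative sketch only (the learning rates and batch sizes below are assumptions, not the values used for these models), a grid search enumerates every combination of candidate hyperparameters:

```python
from itertools import product

# Illustrative hyperparameter grid; the actual values searched for
# these models are not specified in this card.
learning_rates = [1e-5, 3e-5, 5e-5]
batch_sizes = [8, 16, 32]

def grid(lrs, bss):
    """Yield every (learning_rate, batch_size) combination to try."""
    yield from product(lrs, bss)

configs = list(grid(learning_rates, batch_sizes))
print(len(configs))  # 9 combinations
```

Each configuration is then used for a fine-tuning run, and the checkpoint with the lowest validation loss is kept.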
|
|
|
|
|
## Evaluation Metrics |
|
|
|
|
|
- **Evaluation Dataset:** A manually annotated test set of thermoelectric material paragraph contexts.
- **Metrics Used:**
  - **Exact Match (EM):** The percentage of predictions that match any one of the ground-truth answers exactly.
  - **F1 Score:** The harmonic mean of precision and recall, based on token overlap between the prediction and the ground-truth answers.
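For reference, the two metrics can be sketched as follows. This is a minimal illustration using whitespace tokenisation and lowercasing only; the official SQuAD evaluation script additionally normalises punctuation and articles:

```python
from collections import Counter

def exact_match(prediction, truths):
    """EM: 1.0 if the prediction equals any ground-truth answer (case-insensitive)."""
    norm = lambda s: " ".join(s.lower().split())
    return float(any(norm(prediction) == norm(t) for t in truths))

def f1_score(prediction, truth):
    """Token-level F1 between a prediction and one ground-truth answer."""
    pred_tokens = prediction.lower().split()
    true_tokens = truth.lower().split()
    common = Counter(pred_tokens) & Counter(true_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)

print(f1_score("Bi2Te3 alloy", "Bi2Te3"))  # ≈ 0.667: precision 0.5, recall 1.0
```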
|
|
|
|
|
### Performance Comparison |
|
|
| Model         | Exact Match (EM) | F1 Score |
|---------------|------------------|----------|
| squad-v2_best | 57.60%           | 61.82%   |
| te-cde_best   | 65.39%           | 69.78%   |
| mixed_best    | 67.92%           | 72.29%   |
|
|
|
|
|
## Usage Instructions |
|
|
|
|
|
### Installing Dependencies |
|
|
|
|
|
```bash
pip install transformers torch
```
|
|
|
|
|
### Loading a Model |
|
|
|
|
|
The three models live in subfolders of a single repository, so they are loaded via the `subfolder` argument of `from_pretrained` rather than by appending the subfolder to the repository id. Set `subfolder` to one of the following:

- `squad-v2_best`
- `te-cde_best`
- `mixed_best`

```python
import torch
from transformers import BertForQuestionAnswering, BertTokenizer

repo_id = "odysie/bert-finetuned-qa-datasets"
subfolder = "mixed_best"  # or "squad-v2_best" / "te-cde_best"

tokenizer = BertTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = BertForQuestionAnswering.from_pretrained(repo_id, subfolder=subfolder)

# Example question and context
question = "What is the chemical formula for water?"
context = "Water is a molecule composed of two hydrogen atoms and one oxygen atom, with the chemical formula H2O."

# Tokenize the question/context pair
inputs = tokenizer(question, context, return_tensors="pt")

# Get model predictions (no gradients needed at inference time)
with torch.no_grad():
    outputs = model(**inputs)
start_scores = outputs.start_logits
end_scores = outputs.end_logits

# Most likely beginning and end of the answer span
start_index = start_scores.argmax()
end_index = end_scores.argmax()

# Convert the answer tokens back to a string
tokens = inputs["input_ids"][0][start_index : end_index + 1]
answer = tokenizer.decode(tokens, skip_special_tokens=True)

print(f"Answer: {answer}")
```
|
|
|
|
|
## License |
|
|
|
|
|
This project is licensed under the Apache License 2.0.
|
|
|
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use these models in your research or application, please cite our work: |
|
|
|
|
|
```bibtex
(PENDING)

@article{
...
}
```
|
|
|
|
|
## Acknowledgments |
|
|
|
|
|
We thank the contributors of the SQuAD-v2 dataset and the developers of the Hugging Face Transformers library for providing valuable resources that made this work possible. |