odysie committed bba4f58 · verified · 1 Parent(s): 62a49dc

Update README.md with YAML metadata

Files changed (1): README.md (+147 -137)
---
license: apache-2.0
language:
- en
metrics:
- exact_match
- f1
base_model:
- google-bert/bert-base-uncased
---
# Fine-Tuned BERT Models for Thermoelectric Materials Question Answering

## Introduction

This repository contains three BERT models fine-tuned for question-answering (QA) tasks related to thermoelectric materials. The models are trained on different datasets to evaluate their performance on specialised QA tasks in the field of materials science.

We present a method for auto-generating a large question-answering dataset about thermoelectric materials for language model applications. The method was used to generate a dataset with sentence-wide contexts from a database of thermoelectric material records. This dataset was contrasted with SQuAD-v2, as well as with a mixed combination of the two. Hyperparameter optimisation was employed to fine-tune BERT models on each dataset, and the three best-performing models were then compared on a manually annotated test set of thermoelectric material paragraph contexts, with questions spanning material names, five different properties, and temperatures during recording. The best BERT model fine-tuned on the mixed dataset outperforms the other two when evaluated on the test set, indicating that mixing datasets with different semantic and syntactic scopes can be a beneficial approach to improving performance on specialised question-answering tasks.
## Models Included

1. **squad-v2_best**

   Description: Fine-tuned on the SQuAD-v2 dataset, which is a widely used benchmark for QA tasks. \
   Dataset: SQuAD-v2 \
   Location: `squad-v2_best/`

2. **te-cde_best**

   Description: Fine-tuned on a thermoelectric materials-specific dataset generated using our auto-generation method. \
   Dataset: Thermoelectric Materials QA Dataset (TE-CDE) \
   Location: `te-cde_best/`

3. **mixed_best**

   Description: Fine-tuned on a mixed dataset combining SQuAD-v2 and the thermoelectric materials dataset to enhance performance on specialised QA tasks. \
   Dataset: Combination of SQuAD-v2 and TE-CDE \
   Location: `mixed_best/`
## Dataset Details

**SQuAD-v2**

A reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. Some questions are unanswerable, adding complexity to the QA task.

**Thermoelectric Materials QA Dataset (TE-CDE)**

An auto-generated dataset containing QA pairs about thermoelectric materials. Contexts are sentence-wide excerpts from a database of thermoelectric material records. Questions cover:

- Material names
- Five different properties
- Temperatures during recording

**Mixed Dataset**

A combination of the SQuAD-v2 and TE-CDE datasets, aiming to leverage the strengths of both general-purpose and domain-specific data.
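Both sources use the standard SQuAD-v2 annotation format. As an illustration, a single QA entry looks like the record below; the field values here are invented for the example and are not actual TE-CDE entries:

```python
# Hypothetical TE-CDE-style record in SQuAD-v2 format (illustrative only;
# the context, question, and answer values are invented).
record = {
    "context": "The Seebeck coefficient of Bi2Te3 was measured as 220 uV/K at 300 K.",
    "question": "What is the Seebeck coefficient of Bi2Te3?",
    "answers": {"text": ["220 uV/K"], "answer_start": [50]},
    # SQuAD-v2 also allows unanswerable questions, encoded as empty answers:
    # "answers": {"text": [], "answer_start": []}
}

# The answer span must be recoverable from the context via answer_start:
start = record["answers"]["answer_start"][0]
text = record["answers"]["text"][0]
assert record["context"][start:start + len(text)] == text
```

The empty-answers encoding is what makes SQuAD-v2 (and the mixed dataset) harder than answerable-only QA: the model must also learn to predict "no answer".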
## Training Details

- Base Model: BERT Base Uncased
- Hyperparameter Optimisation: Employed to find the best-performing model for each dataset.
- Training Parameters:
  - Epochs: Adjusted per dataset based on validation loss.
  - Batch Size: Optimised during training.
  - Learning Rate: Tuned using grid search.
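The learning-rate grid search can be sketched as follows. This is a minimal illustration only: the candidate values and the `validation_loss` stub are hypothetical placeholders, not the actual training setup used for these models:

```python
# Minimal grid-search sketch over learning rates (illustrative only:
# the candidate values and the validation_loss stub are hypothetical).
candidate_lrs = [1e-5, 2e-5, 3e-5, 5e-5]

def validation_loss(lr):
    # Stand-in for fine-tuning BERT at this learning rate and returning
    # the validation loss; replaced here by a dummy convex curve.
    return (lr - 2e-5) ** 2

# Pick the candidate with the lowest validation loss.
best_lr = min(candidate_lrs, key=validation_loss)
print(best_lr)  # 2e-05
```

In practice each `validation_loss` call is a full fine-tuning run, so the grid is kept small.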
## Evaluation Metrics

- Evaluation Dataset: Manually annotated test set of thermoelectric material paragraph contexts.
- Metrics Used:
  - Exact Match (EM): The percentage of predictions that match any one of the ground-truth answers exactly.
  - F1 Score: The harmonic mean of precision and recall, measuring token overlap between the prediction and the ground-truth answers.
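For reference, a minimal sketch of how EM and token-level F1 are typically computed for extractive QA (following the usual SQuAD-style evaluation; normalisation details such as article and punctuation stripping are omitted here):

```python
def exact_match(prediction, truths):
    # EM: 1 if the prediction matches any ground-truth answer exactly
    # (after simple whitespace/case normalisation), else 0.
    norm = lambda s: " ".join(s.lower().split())
    return int(any(norm(prediction) == norm(t) for t in truths))

def f1_score(prediction, truth):
    # Token-level F1 over the overlap between prediction and ground truth.
    pred_tokens = prediction.lower().split()
    truth_tokens = truth.lower().split()
    common, remaining = 0, list(truth_tokens)
    for tok in pred_tokens:
        if tok in remaining:
            remaining.remove(tok)
            common += 1
    if common == 0:
        return 0.0
    precision = common / len(pred_tokens)
    recall = common / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("300 K", ["300 K"]))  # 1
print(f1_score("at 300 K", "300 K"))    # 0.8
```

As the example shows, a slightly over-long span ("at 300 K" vs "300 K") scores 0 on EM but still earns partial credit on F1.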
### Performance Comparison

| Model | Exact Match (EM) | F1 Score |
|-------|------------------|----------|
| squad-v2_best | 57.60% | 61.82% |
| te-cde_best | 65.39% | 69.78% |
| mixed_best | 67.92% | 72.29% |
## Usage Instructions

### Installing Dependencies

The example below uses PyTorch tensors, so install both `transformers` and `torch`:

```bash
pip install transformers torch
```
### Loading a Model

Replace `model_name` with one of the following:

- `odysie/bert-finetuned-qa-datasets/squad-v2_best`
- `odysie/bert-finetuned-qa-datasets/te-cde_best`
- `odysie/bert-finetuned-qa-datasets/mixed_best`
```python
from transformers import BertForQuestionAnswering, BertTokenizer

model_name = "odysie/bert-finetuned-qa-datasets/mixed_best"

tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForQuestionAnswering.from_pretrained(model_name)

# Example question and context
question = "What is the chemical formula for water?"
context = "Water is a molecule composed of two hydrogen atoms and one oxygen atom, with the chemical formula H2O."

# Tokenize the question-context pair
inputs = tokenizer.encode_plus(question, context, return_tensors="pt")

# Get model predictions
outputs = model(**inputs)
start_scores = outputs.start_logits
end_scores = outputs.end_logits

# Get the most likely beginning and end of the answer via the argmax of the scores
start_index = start_scores.argmax()
end_index = end_scores.argmax()

# Convert the answer tokens back to text
tokens = inputs["input_ids"][0][start_index : end_index + 1]
answer = tokenizer.decode(tokens)

print(f"Answer: {answer}")
```
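Note that taking independent argmaxes of the start and end logits can occasionally yield an invalid span (end before start). A common refinement is to score every valid (start, end) pair and keep the best one; the sketch below illustrates this over plain Python lists with invented logit values, not real model outputs:

```python
# Pick the highest-scoring valid answer span (start <= end, bounded length).
# The logit values below are illustrative, not real model outputs.
start_logits = [0.1, 2.5, 0.3, 0.2]
end_logits = [0.2, 0.1, 3.0, 0.4]
max_answer_len = 30

best_score, best_span = float("-inf"), (0, 0)
for s, s_logit in enumerate(start_logits):
    for e, e_logit in enumerate(end_logits):
        # Only consider spans where the end does not precede the start
        # and the span length stays within the limit.
        if s <= e < s + max_answer_len:
            score = s_logit + e_logit
            if score > best_score:
                best_score, best_span = score, (s, e)

print(best_span)  # (1, 2): start and end token indices of the best valid span
```

The same indices can then be used to slice `input_ids` and decode the answer, exactly as in the example above.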
## License

This project is licensed under the Apache 2.0 License.

## Citation

If you use these models in your research or application, please cite our work:

```bibtex
(PENDING)

@article{
...
}
```

## Acknowledgments

We thank the contributors of the SQuAD-v2 dataset and the developers of the Hugging Face Transformers library for providing valuable resources that made this work possible.