Update README.md with YAML metadata
---
license: apache-2.0
language:
- en
metrics:
- exact_match
- f1
base_model:
- google-bert/bert-base-uncased
---

# Fine-Tuned BERT Models for Thermoelectric Materials Question Answering
## Introduction

This repository contains three BERT models fine-tuned for question-answering (QA) tasks on thermoelectric materials. The models are trained on different datasets to evaluate their performance on specialised QA tasks in the field of materials science.

We present a method for auto-generating a large question-answering dataset about thermoelectric materials for language model applications. The method was used to generate a dataset with sentence-wide contexts from a database of thermoelectric material records. The dataset was contrasted with SQuAD-v2, as well as with a mixed combination of the two datasets. Hyperparameter optimisation was employed to fine-tune BERT models on each dataset, and the three best-performing models were then compared on a manually annotated test set of thermoelectric material paragraph contexts, with questions spanning material names, five different properties, and the temperatures at which measurements were recorded. The best BERT model fine-tuned on the mixed dataset outperforms the other two models on this test set, indicating that mixing datasets with different semantic and syntactic scopes can be a beneficial approach to improving performance on specialised question-answering tasks.

## Models Included

1. **squad-v2_best** \
   Description: Fine-tuned on the SQuAD-v2 dataset, a widely used benchmark for QA tasks. \
   Dataset: SQuAD-v2 \
   Location: `squad-v2_best/`

2. **te-cde_best** \
   Description: Fine-tuned on a thermoelectric materials-specific dataset generated with our auto-generation method. \
   Dataset: Thermoelectric Materials QA Dataset (TE-CDE) \
   Location: `te-cde_best/`

3. **mixed_best** \
   Description: Fine-tuned on a mixed dataset combining SQuAD-v2 and the thermoelectric materials dataset to enhance performance on specialised QA tasks. \
   Dataset: Combination of SQuAD-v2 and TE-CDE \
   Location: `mixed_best/`

## Dataset Details

**SQuAD-v2**

A reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. Some questions are unanswerable, adding complexity to the QA task.

**Thermoelectric Materials QA Dataset (TE-CDE)**

An auto-generated dataset containing QA pairs about thermoelectric materials. Contexts are sentence-wide excerpts from a database of thermoelectric material records. Questions cover:

- Material names
- Five different properties
- Temperatures during recording
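To illustrate the structure of such auto-generated data, a TE-CDE-style record in SQuAD-v2 format might look like the sketch below. The context sentence, question, and answer values are hypothetical, invented for illustration, not taken from the actual dataset:

```python
# Hypothetical QA record in SQuAD-v2 format; all field values are illustrative.
example_record = {
    "context": "Bi2Te3 exhibits a ZT of 1.2 at 400 K.",
    "question": "What is the ZT of Bi2Te3?",
    "answers": {
        "text": ["1.2"],
        "answer_start": [24],  # character offset of the answer span in the context
    },
    "is_impossible": False,  # SQuAD-v2 format also allows unanswerable questions
}

# The answer span must be recoverable from the context by character offset.
start = example_record["answers"]["answer_start"][0]
text = example_record["answers"]["text"][0]
assert example_record["context"][start:start + len(text)] == text
```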

**Mixed Dataset**

A combination of the SQuAD-v2 and TE-CDE datasets. It aims to leverage the strengths of both general-purpose and domain-specific data.
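One straightforward way to build such a mixture is to concatenate the two sources and shuffle, so that training batches interleave general-purpose and domain-specific examples. A minimal sketch with toy stand-in records (the exact mixing procedure used for these models is not specified here):

```python
import random

# Toy stand-ins for the two datasets: lists of QA records.
squad_records = [{"source": "squad-v2", "id": i} for i in range(3)]
te_cde_records = [{"source": "te-cde", "id": i} for i in range(3)]

# Concatenate and shuffle so batches mix both sources.
mixed = squad_records + te_cde_records
random.seed(0)  # fixed seed for reproducibility
random.shuffle(mixed)

print(len(mixed))  # 6
```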

## Training Details

Base Model: BERT Base Uncased \
Hyperparameter Optimisation: Employed to find the best-performing model for each dataset. \
Training Parameters:

- Epochs: Adjusted per dataset based on validation loss.
- Batch Size: Optimised during training.
- Learning Rate: Tuned using grid search.
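A grid search over these parameters can be sketched as follows. The candidate values and the placeholder loss function are illustrative assumptions, not the search space actually used for these models:

```python
from itertools import product

# Hypothetical candidate values; the actual search space is not published here.
learning_rates = [2e-5, 3e-5, 5e-5]
batch_sizes = [8, 16, 32]
epochs = [2, 3]

def validation_loss(lr, bs, n_epochs):
    # Placeholder standing in for fine-tuning BERT with these settings and
    # returning the validation loss; a real run would train and evaluate.
    return abs(lr - 3e-5) + abs(bs - 16) / 100 + abs(n_epochs - 3) / 10

# Evaluate every combination and keep the one with the lowest loss.
best = min(product(learning_rates, batch_sizes, epochs),
           key=lambda combo: validation_loss(*combo))
print(best)  # (3e-05, 16, 3) under the placeholder loss
```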

## Evaluation Metrics

Evaluation Dataset: Manually annotated test set of thermoelectric material paragraph contexts.

Metrics Used:

- Exact Match (EM): The percentage of predictions that match any one of the ground-truth answers exactly.
- F1 Score: The harmonic mean of precision and recall, measuring token overlap between the prediction and the ground-truth answers.
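Per example, the two metrics can be computed in the usual SQuAD style, roughly as in this simplified sketch. It only lowercases and splits on whitespace; the official SQuAD evaluation script additionally strips punctuation and articles:

```python
from collections import Counter

def normalise(text: str) -> list[str]:
    # Simplified normalisation: lowercase and split on whitespace.
    return text.lower().split()

def exact_match(prediction: str, truths: list[str]) -> bool:
    # True if the prediction matches any ground-truth answer exactly.
    return any(normalise(prediction) == normalise(t) for t in truths)

def f1_score(prediction: str, truths: list[str]) -> float:
    # Best token-overlap F1 over all ground-truth answers.
    best = 0.0
    pred_tokens = normalise(prediction)
    for truth in truths:
        truth_tokens = normalise(truth)
        overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
        if overlap == 0:
            continue
        precision = overlap / len(pred_tokens)
        recall = overlap / len(truth_tokens)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best

print(exact_match("1.2", ["1.2"]))                 # True
print(round(f1_score("a ZT of 1.2", ["1.2"]), 2))  # 0.4
```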

### Performance Comparison

| Model         | Exact Match (EM) | F1 Score |
|---------------|------------------|----------|
| squad-v2_best | 57.60%           | 61.82%   |
| te-cde_best   | 65.39%           | 69.78%   |
| mixed_best    | 67.92%           | 72.29%   |

## Usage Instructions

### Installing Dependencies

```bash
pip install transformers
```

### Loading a Model

The three models live in subfolders of the `odysie/bert-finetuned-qa-datasets` repository, so pass the repository ID together with the `subfolder` argument. Replace `subfolder` with one of the following:

- `squad-v2_best`
- `te-cde_best`
- `mixed_best`

```python
from transformers import BertForQuestionAnswering, BertTokenizer

repo_id = "odysie/bert-finetuned-qa-datasets"
subfolder = "mixed_best"

tokenizer = BertTokenizer.from_pretrained(repo_id, subfolder=subfolder)
model = BertForQuestionAnswering.from_pretrained(repo_id, subfolder=subfolder)

# Example question and context
question = "What is the chemical formula for water?"
context = "Water is a molecule composed of two hydrogen atoms and one oxygen atom, with the chemical formula H2O."

# Tokenize input
inputs = tokenizer.encode_plus(question, context, return_tensors="pt")

# Get model predictions
outputs = model(**inputs)
start_scores = outputs.start_logits
end_scores = outputs.end_logits

# Get the most likely beginning and end of the answer with the argmax of the scores
start_index = start_scores.argmax()
end_index = end_scores.argmax()

# Convert the answer tokens back to text
tokens = inputs["input_ids"][0][start_index : end_index + 1]
answer = tokenizer.decode(tokens)

print(f"Answer: {answer}")
```

## License

This project is licensed under Apache 2.0.

## Citation

If you use these models in your research or application, please cite our work:

```bibtex
(PENDING)

@article{
...
}
```

## Acknowledgments

We thank the contributors of the SQuAD-v2 dataset and the developers of the Hugging Face Transformers library for providing valuable resources that made this work possible.