odysie committed bba4f58 · verified · 1 Parent(s): 62a49dc

Update README.md with YAML metadata

Files changed (1): README.md (+147 -137)
---
license: apache-2.0
language:
- en
metrics:
- exact_match
- f1
base_model:
- google-bert/bert-base-uncased
---
# Fine-Tuned BERT Models for Thermoelectric Materials Question Answering

## Introduction

This repository contains three BERT models fine-tuned for question-answering (QA) tasks related to thermoelectric materials. The models are trained on different datasets to evaluate their performance on specialised QA tasks in the field of materials science.

We present a method for auto-generating a large question-answering dataset about thermoelectric materials for language model applications. The method was used to generate a dataset with sentence-wide contexts from a database of thermoelectric material records. This dataset was contrasted with SQuAD-v2, as well as with a mixed combination of the two. Hyperparameter optimisation was employed to fine-tune BERT models on each dataset, and the three best-performing models were then compared on a manually annotated test set of thermoelectric material paragraph contexts, with questions spanning material names, five different properties, and temperatures during recording. The best BERT model fine-tuned on the mixed dataset outperforms the other two when evaluated on the test set, indicating that mixing datasets with different semantic and syntactic scopes can be a beneficial approach to improving performance on specialised question-answering tasks.
## Models Included

1. **squad-v2_best**

   Description: Fine-tuned on the SQuAD-v2 dataset, which is a widely used benchmark for QA tasks. \
   Dataset: SQuAD-v2 \
   Location: `squad-v2_best/`

2. **te-cde_best**

   Description: Fine-tuned on a thermoelectric materials-specific dataset generated using our auto-generation method. \
   Dataset: Thermoelectric Materials QA Dataset (TE-CDE) \
   Location: `te-cde_best/`

3. **mixed_best**

   Description: Fine-tuned on a mixed dataset combining SQuAD-v2 and the thermoelectric materials dataset to enhance performance on specialised QA tasks. \
   Dataset: Combination of SQuAD-v2 and TE-CDE \
   Location: `mixed_best/`
## Dataset Details

**SQuAD-v2**

A reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles. Some questions are unanswerable, adding complexity to the QA task.

**Thermoelectric Materials QA Dataset (TE-CDE)**

An auto-generated dataset containing QA pairs about thermoelectric materials. Contexts are sentence-wide excerpts from a database of thermoelectric material records. Questions cover:

- Material names
- Five different properties
- Temperatures during recording

**Mixed Dataset**

A combination of the SQuAD-v2 and TE-CDE datasets, aiming to leverage the strengths of both general-purpose and domain-specific data.
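Both sources use the standard SQuAD-v2 annotation format. As an illustration, a single QA entry looks like the record below; the field values here are invented for the example and are not actual TE-CDE entries:

```python
# Hypothetical TE-CDE-style record in SQuAD-v2 format (illustrative only;
# the context, question, and answer values are invented).
record = {
    "context": "The Seebeck coefficient of Bi2Te3 was measured as 220 uV/K at 300 K.",
    "question": "What is the Seebeck coefficient of Bi2Te3?",
    "answers": {"text": ["220 uV/K"], "answer_start": [50]},
    # SQuAD-v2 also allows unanswerable questions, encoded as empty answers:
    # "answers": {"text": [], "answer_start": []}
}

# The answer span must be recoverable from the context via answer_start:
start = record["answers"]["answer_start"][0]
text = record["answers"]["text"][0]
assert record["context"][start:start + len(text)] == text
```

The empty-answers encoding is what makes SQuAD-v2 (and the mixed dataset) harder than answerable-only QA: the model must also learn to predict "no answer".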
## Training Details

- Base Model: BERT Base Uncased
- Hyperparameter Optimisation: Employed to find the best-performing model for each dataset.
- Training Parameters:
  - Epochs: Adjusted per dataset based on validation loss.
  - Batch Size: Optimised during training.
  - Learning Rate: Tuned using grid search.
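The learning-rate grid search can be sketched as follows. This is a minimal illustration only: the candidate values and the `validation_loss` stub are hypothetical placeholders, not the actual training setup used for these models:

```python
# Minimal grid-search sketch over learning rates (illustrative only:
# the candidate values and the validation_loss stub are hypothetical).
candidate_lrs = [1e-5, 2e-5, 3e-5, 5e-5]

def validation_loss(lr):
    # Stand-in for fine-tuning BERT at this learning rate and returning
    # the validation loss; replaced here by a dummy convex curve.
    return (lr - 2e-5) ** 2

# Pick the candidate with the lowest validation loss.
best_lr = min(candidate_lrs, key=validation_loss)
print(best_lr)  # 2e-05
```

In practice each `validation_loss` call is a full fine-tuning run, so the grid is kept small.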
## Evaluation Metrics

- Evaluation Dataset: Manually annotated test set of thermoelectric material paragraph contexts.
- Metrics Used:
  - Exact Match (EM): The percentage of predictions that match any one of the ground-truth answers exactly.
  - F1 Score: The harmonic mean of precision and recall, measuring token overlap between the prediction and the ground-truth answers.
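For reference, a minimal sketch of how EM and token-level F1 are typically computed for extractive QA (following the usual SQuAD-style evaluation; normalisation details such as article and punctuation stripping are omitted here):

```python
def exact_match(prediction, truths):
    # EM: 1 if the prediction matches any ground-truth answer exactly
    # (after simple whitespace/case normalisation), else 0.
    norm = lambda s: " ".join(s.lower().split())
    return int(any(norm(prediction) == norm(t) for t in truths))

def f1_score(prediction, truth):
    # Token-level F1 over the overlap between prediction and ground truth.
    pred_tokens = prediction.lower().split()
    truth_tokens = truth.lower().split()
    common, remaining = 0, list(truth_tokens)
    for tok in pred_tokens:
        if tok in remaining:
            remaining.remove(tok)
            common += 1
    if common == 0:
        return 0.0
    precision = common / len(pred_tokens)
    recall = common / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("300 K", ["300 K"]))  # 1
print(f1_score("at 300 K", "300 K"))    # 0.8
```

As the example shows, a slightly over-long span ("at 300 K" vs "300 K") scores 0 on EM but still earns partial credit on F1.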
### Performance Comparison

| Model | Exact Match (EM) | F1 Score |
|-------|------------------|----------|
| squad-v2_best | 57.60% | 61.82% |
| te-cde_best | 65.39% | 69.78% |
| mixed_best | 67.92% | 72.29% |
## Usage Instructions

### Installing Dependencies

The example below uses PyTorch tensors, so install both `transformers` and `torch`:

```bash
pip install transformers torch
```
### Loading a Model

Replace `model_name` with one of the following:

- `odysie/bert-finetuned-qa-datasets/squad-v2_best`
- `odysie/bert-finetuned-qa-datasets/te-cde_best`
- `odysie/bert-finetuned-qa-datasets/mixed_best`
```python
from transformers import BertForQuestionAnswering, BertTokenizer

model_name = "odysie/bert-finetuned-qa-datasets/mixed_best"

tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForQuestionAnswering.from_pretrained(model_name)

# Example question and context
question = "What is the chemical formula for water?"
context = "Water is a molecule composed of two hydrogen atoms and one oxygen atom, with the chemical formula H2O."

# Tokenize the question-context pair
inputs = tokenizer.encode_plus(question, context, return_tensors="pt")

# Get model predictions
outputs = model(**inputs)
start_scores = outputs.start_logits
end_scores = outputs.end_logits

# Get the most likely beginning and end of the answer via the argmax of the scores
start_index = start_scores.argmax()
end_index = end_scores.argmax()

# Convert the answer tokens back to text
tokens = inputs["input_ids"][0][start_index : end_index + 1]
answer = tokenizer.decode(tokens)

print(f"Answer: {answer}")
```
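Note that taking independent argmaxes of the start and end logits can occasionally yield an invalid span (end before start). A common refinement is to score every valid (start, end) pair and keep the best one; the sketch below illustrates this over plain Python lists with invented logit values, not real model outputs:

```python
# Pick the highest-scoring valid answer span (start <= end, bounded length).
# The logit values below are illustrative, not real model outputs.
start_logits = [0.1, 2.5, 0.3, 0.2]
end_logits = [0.2, 0.1, 3.0, 0.4]
max_answer_len = 30

best_score, best_span = float("-inf"), (0, 0)
for s, s_logit in enumerate(start_logits):
    for e, e_logit in enumerate(end_logits):
        # Only consider spans where the end does not precede the start
        # and the span length stays within the limit.
        if s <= e < s + max_answer_len:
            score = s_logit + e_logit
            if score > best_score:
                best_score, best_span = score, (s, e)

print(best_span)  # (1, 2): start and end token indices of the best valid span
```

The same indices can then be used to slice `input_ids` and decode the answer, exactly as in the example above.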
## License

This project is licensed under the Apache 2.0 License.

## Citation

If you use these models in your research or application, please cite our work:

```bibtex
(PENDING)

@article{
...
}
```

## Acknowledgments

We thank the contributors of the SQuAD-v2 dataset and the developers of the Hugging Face Transformers library for providing valuable resources that made this work possible.