jedick committed on
Commit 6ecddd5 · 1 Parent(s): c313a1f

Adjust app layout

Files changed (3)
  1. app.py +54 -50
  2. examples/Refute/log.csv +1 -1
  3. requirements.txt +3 -1
app.py CHANGED
@@ -85,7 +85,7 @@ with gr.Blocks() as demo:
85
  choices=["BM25S", "DeBERTa", "GPT"],
86
  value="BM25S",
87
  label="Retrieval Method",
88
- info="Keyword search (BM25S) or AI (DeBERTa, GPT)",
89
  )
90
  top_k = gr.Slider(
91
  1,
@@ -106,42 +106,22 @@ with gr.Blocks() as demo:
106
  completion_tokens = gr.Number(
107
  label="Completion tokens", visible=False
108
  )
109
- with gr.Accordion("More info", open=True):
110
- gr.Markdown(
111
- """
112
- #### *MLE capstone project*
113
- - <i class="fa-brands fa-github"></i> [jedick/MLE-capstone-project](https://github.com/jedick/MLE-capstone-project) (project repo)
114
- - <i class="fa-brands fa-github"></i> [jedick/AI4citations](https://github.com/jedick/AI4citations) (app repo)
115
- #### *Text classification*
116
- - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint](https://huggingface.co/jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint) (fine-tuned)
117
- - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli](https://huggingface.co/MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli) (base)
118
- #### *Evidence retrieval*
119
- - <i class="fa-brands fa-github"></i> [xhluca/bm25s](https://github.com/xhluca/bm25s) (BM25S)
120
- - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [deepset/deberta-v3-large-squad2](https://huggingface.co/deepset/deberta-v3-large-squad2) (DeBERTa)
121
- - <img src="https://upload.wikimedia.org/wikipedia/commons/4/4d/OpenAI_Logo.svg" style="height: 1.2em; display: inline-block;"> [gpt-4o-mini-2024-07-18](https://platform.openai.com/docs/pricing) (GPT)
122
- #### *Datasets for fine-tuning*
123
- - <i class="fa-brands fa-github"></i> [allenai/SciFact](https://github.com/allenai/scifact) (SciFact)
124
- - <i class="fa-brands fa-github"></i> [ScienceNLP-Lab/Citation-Integrity](https://github.com/ScienceNLP-Lab/Citation-Integrity) (CitInt)
125
- #### *Other sources*
126
- - <img src="https://plos.org/wp-content/uploads/2020/01/logo-color-blue.svg" style="height: 1.4em; display: inline-block;"> [Medicine](https://doi.org/10.1371/journal.pmed.0030197), <i class="fa-brands fa-wikipedia-w"></i> [CRISPR](https://en.wikipedia.org/wiki/CRISPR) (evidence retrieval examples)
127
- - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [nyu-mll/multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli/viewer/default/train?row=37&views%5B%5D=train) (MNLI example)
128
- - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [NoCrypt/miku](https://huggingface.co/spaces/NoCrypt/miku) (theme)
129
- """
130
- )
131
 
132
  with gr.Column(scale=2):
133
- prediction = gr.Label(label="Prediction")
134
- with gr.Accordion("Feedback"):
135
- gr.Markdown(
136
- "*Provide the correct label to help improve this app*<br>**NOTE:** The claim and evidence will also be saved"
137
- ),
138
- with gr.Row():
139
- flag_support = gr.Button("Support")
140
- flag_nei = gr.Button("NEI")
141
- flag_refute = gr.Button("Refute")
142
- gr.Markdown(
143
- "Feedback is uploaded every 5 minutes to [AI4citations-feedback](https://huggingface.co/datasets/jedick/AI4citations-feedback)"
144
- ),
 
145
  with gr.Accordion("Examples"):
146
  gr.Markdown("*Examples are run when clicked*"),
147
  with gr.Row():
@@ -177,19 +157,43 @@ with gr.Blocks() as demo:
177
  "label"
178
  ].tolist(),
179
  )
180
- # Create dropdown menu to select the model
181
- model = gr.Dropdown(
182
- choices=[
183
- # TODO: For bert-base-uncased, how can we set num_labels = 2 in HF pipeline?
184
- # (num_labels is available in AutoModelForSequenceClassification.from_pretrained)
185
- # "bert-base-uncased",
186
- "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
187
- "jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint",
188
- ],
189
- value=MODEL_NAME,
190
- label="Model",
191
- info="Text classification model used for claim verification",
192
- )
193
 
194
  # Functions
195
 
@@ -258,7 +262,7 @@ with gr.Blocks() as demo:
258
  pdf_file = f"examples/retrieval/{pdf_file}"
259
  return pdf_file, claim
260
 
261
- @spaces.GPU()
262
  def _retrieve_with_deberta(pdf_file, claim, top_k):
263
  """
264
  Retrieve evidence using DeBERTa
@@ -481,7 +485,7 @@ with gr.Blocks() as demo:
481
 
482
  if __name__ == "__main__":
483
  # allowed_paths is needed to upload PDFs from specific example directory
484
- allowed_paths=[f"{os.getcwd()}/examples/retrieval"]
485
 
486
  # Setup theme without background image
487
  theme = gr.Theme.from_hub("NoCrypt/miku")
 
85
  choices=["BM25S", "DeBERTa", "GPT"],
86
  value="BM25S",
87
  label="Retrieval Method",
88
+ info="Lexical (BM25S) or semantic (DeBERTa, GPT)",
89
  )
90
  top_k = gr.Slider(
91
  1,
 
106
  completion_tokens = gr.Number(
107
  label="Completion tokens", visible=False
108
  )
109
+ prediction = gr.Label(label="Prediction")
110
 
111
  with gr.Column(scale=2):
112
+ # Create dropdown menu to select the model
113
+ model = gr.Dropdown(
114
+ choices=[
115
+ # TODO: For bert-base-uncased, how can we set num_labels = 2 in HF pipeline?
116
+ # (num_labels is available in AutoModelForSequenceClassification.from_pretrained)
117
+ # "bert-base-uncased",
118
+ "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
119
+ "jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint",
120
+ ],
121
+ value=MODEL_NAME,
122
+ label="Model",
123
+ info="Text classification model used for claim verification",
124
+ )
125
  with gr.Accordion("Examples"):
126
  gr.Markdown("*Examples are run when clicked*"),
127
  with gr.Row():
 
157
  "label"
158
  ].tolist(),
159
  )
160
+ with gr.Accordion("Feedback"):
161
+ gr.Markdown(
162
+ "*Provide the correct label to help improve this app*<br>**NOTE:** The claim and evidence will also be saved"
163
+ ),
164
+ with gr.Row():
165
+ flag_support = gr.Button("Support")
166
+ flag_nei = gr.Button("NEI")
167
+ flag_refute = gr.Button("Refute")
168
+ gr.Markdown(
169
+ "Feedback is uploaded every 5 minutes to [AI4citations-feedback](https://huggingface.co/datasets/jedick/AI4citations-feedback)"
170
+ ),
171
+ with gr.Accordion("About this app", open=True):
172
+ gr.Markdown(
173
+ """
174
+ - <i class="fa-brands fa-github"></i> [jedick/AI4citations](https://github.com/jedick/AI4citations) (app repo)
175
+ - <i class="fa-brands fa-github"></i> [jedick/MLE-capstone-project](https://github.com/jedick/MLE-capstone-project) (project repo)
176
+ """
177
+ )
178
+ with gr.Accordion("More info", open=False):
179
+ gr.Markdown(
180
+ """
181
+ #### *Text classification*
182
+ - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint](https://huggingface.co/jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint) (fine-tuned)
183
+ - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli](https://huggingface.co/MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli) (base)
184
+ #### *Evidence retrieval*
185
+ - <i class="fa-brands fa-github"></i> [xhluca/bm25s](https://github.com/xhluca/bm25s) (BM25S)
186
+ - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [deepset/deberta-v3-large-squad2](https://huggingface.co/deepset/deberta-v3-large-squad2) (DeBERTa)
187
+ - <img src="https://upload.wikimedia.org/wikipedia/commons/4/4d/OpenAI_Logo.svg" style="height: 1.2em; display: inline-block;"> [gpt-4o-mini-2024-07-18](https://platform.openai.com/docs/pricing) (GPT)
188
+ #### *Datasets for fine-tuning*
189
+ - <i class="fa-brands fa-github"></i> [allenai/SciFact](https://github.com/allenai/scifact) (SciFact)
190
+ - <i class="fa-brands fa-github"></i> [ScienceNLP-Lab/Citation-Integrity](https://github.com/ScienceNLP-Lab/Citation-Integrity) (CitInt)
191
+ #### *Other sources*
192
+ - <img src="https://plos.org/wp-content/uploads/2020/01/logo-color-blue.svg" style="height: 1.4em; display: inline-block;"> [Medicine](https://doi.org/10.1371/journal.pmed.0030197), <i class="fa-brands fa-wikipedia-w"></i> [CRISPR](https://en.wikipedia.org/wiki/CRISPR) (evidence retrieval examples)
193
+ - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [nyu-mll/multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli/viewer/default/train?row=37&views%5B%5D=train) (MNLI example)
194
+ - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [NoCrypt/miku](https://huggingface.co/spaces/NoCrypt/miku) (theme)
195
+ """
196
+ )
197
 
198
  # Functions
199
 
 
262
  pdf_file = f"examples/retrieval/{pdf_file}"
263
  return pdf_file, claim
264
 
265
+ @spaces.GPU(duration=30)
266
  def _retrieve_with_deberta(pdf_file, claim, top_k):
267
  """
268
  Retrieve evidence using DeBERTa
 
485
 
486
  if __name__ == "__main__":
487
  # allowed_paths is needed to upload PDFs from specific example directory
488
+ allowed_paths = [f"{os.getcwd()}/examples/retrieval"]
489
 
490
  # Setup theme without background image
491
  theme = gr.Theme.from_hub("NoCrypt/miku")
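The `allowed_paths` line in the hunk above builds the example directory path with an f-string. As a hedged aside, the following is a minimal sketch of an equivalent construction with `pathlib`, which avoids hard-coding the path separator. The helper name `example_dir` is hypothetical and does not appear in app.py.

```python
import os
from pathlib import Path


def example_dir(base=None):
    """Return the path of examples/retrieval under base (default: cwd)."""
    # Hypothetical helper; app.py builds the same path inline with an f-string
    root = Path(base) if base is not None else Path(os.getcwd())
    return str(root / "examples" / "retrieval")


# Same shape as the list assigned in app.py: a single allowed directory
allowed_paths = [example_dir()]
```

Passing an explicit `base` makes the helper easy to check without depending on the current working directory.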
examples/Refute/log.csv CHANGED
@@ -1,3 +1,3 @@
1
  claim,evidence,label
2
- "Poirot was now back and I was sorry that he would take over what I now considered my own investigation.","Poirot, I exclaimed, with relief, and seizing him by both hands, I dragged him into the room.",MNLI
3
  "1 in 5 million in UK have abnormal PrP positivity.","OBJECTIVES To carry out a further survey of archived appendix samples to understand better the differences between existing estimates of the prevalence of subclinical infection with prions after the bovine spongiform encephalopathy epizootic and to see whether a broader birth cohort was affected, and to understand better the implications for the management of blood and blood products and for the handling of surgical instruments. DESIGN Irreversibly unlinked and anonymised large scale survey of archived appendix samples. SETTING Archived appendix samples from the pathology departments of 41 UK hospitals participating in the earlier survey, and additional hospitals in regions with lower levels of participation in that survey. SAMPLE 32,441 archived appendix samples fixed in formalin and embedded in paraffin and tested for the presence of abnormal prion protein (PrP). RESULTS Of the 32,441 appendix samples 16 were positive for abnormal PrP, indicating an overall prevalence of 493 per million population (95% confidence interval 282 to 801 per million). The prevalence in those born in 1941-60 (733 per million, 269 to 1596 per million) did not differ significantly from those born between 1961 and 1985 (412 per million, 198 to 758 per million) and was similar in both sexes and across the three broad geographical areas sampled. Genetic testing of the positive specimens for the genotype at PRNP codon 129 revealed a high proportion that were valine homozygous compared with the frequency in the normal population, and in stark contrast with confirmed clinical cases of vCJD, all of which were methionine homozygous at PRNP codon 129. CONCLUSIONS This study corroborates previous studies and suggests a high prevalence of infection with abnormal PrP, indicating vCJD carrier status in the population compared with the 177 vCJD cases to date. 
These findings have important implications for the management of blood and blood products and for the handling of surgical instruments.",SciFact
 
 
1
  claim,evidence,label
 
2
  "1 in 5 million in UK have abnormal PrP positivity.","OBJECTIVES To carry out a further survey of archived appendix samples to understand better the differences between existing estimates of the prevalence of subclinical infection with prions after the bovine spongiform encephalopathy epizootic and to see whether a broader birth cohort was affected, and to understand better the implications for the management of blood and blood products and for the handling of surgical instruments. DESIGN Irreversibly unlinked and anonymised large scale survey of archived appendix samples. SETTING Archived appendix samples from the pathology departments of 41 UK hospitals participating in the earlier survey, and additional hospitals in regions with lower levels of participation in that survey. SAMPLE 32,441 archived appendix samples fixed in formalin and embedded in paraffin and tested for the presence of abnormal prion protein (PrP). RESULTS Of the 32,441 appendix samples 16 were positive for abnormal PrP, indicating an overall prevalence of 493 per million population (95% confidence interval 282 to 801 per million). The prevalence in those born in 1941-60 (733 per million, 269 to 1596 per million) did not differ significantly from those born between 1961 and 1985 (412 per million, 198 to 758 per million) and was similar in both sexes and across the three broad geographical areas sampled. Genetic testing of the positive specimens for the genotype at PRNP codon 129 revealed a high proportion that were valine homozygous compared with the frequency in the normal population, and in stark contrast with confirmed clinical cases of vCJD, all of which were methionine homozygous at PRNP codon 129. CONCLUSIONS This study corroborates previous studies and suggests a high prevalence of infection with abnormal PrP, indicating vCJD carrier status in the population compared with the 177 vCJD cases to date. 
These findings have important implications for the management of blood and blood products and for the handling of surgical instruments.",SciFact
3
+ "Poirot was now back and I was sorry that he would take over what I now considered my own investigation.","Poirot, I exclaimed, with relief, and seizing him by both hands, I dragged him into the room.",MNLI
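The change to log.csv only moves the MNLI row below the SciFact row. A minimal sketch of reading this layout with the standard `csv` module follows, assuming the file keeps the `claim,evidence,label` header shown in the diff. The MNLI row is copied from the diff; the SciFact evidence is replaced by a placeholder string because the real abstract is long.

```python
import csv
import io

# Two rows in the log.csv layout (claim, evidence, label).
# The SciFact evidence below is a placeholder, not the real abstract.
OLD_ORDER = '''claim,evidence,label
"Poirot was now back and I was sorry that he would take over what I now considered my own investigation.","Poirot, I exclaimed, with relief, and seizing him by both hands, I dragged him into the room.",MNLI
"1 in 5 million in UK have abnormal PrP positivity.","(placeholder for the abstract)",SciFact
'''


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(OLD_ORDER)
# Reversing the two data rows mirrors the reordering shown in the diff
new_order = list(reversed(rows))
print([row["label"] for row in new_order])  # ['SciFact', 'MNLI']
```

Because the fields are quoted, the commas inside the evidence text do not split the row into extra columns.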
requirements.txt CHANGED
@@ -1,5 +1,5 @@
1
  pandas
2
- gradio
3
  torch
4
  transformers
5
  pymupdf
@@ -9,3 +9,5 @@ bm25s
9
  huggingface_hub
10
  spaces
11
  openai
 
 
 
1
  pandas
2
+ gradio==6.2.0
3
  torch
4
  transformers
5
  pymupdf
 
9
  huggingface_hub
10
  spaces
11
  openai
12
+ coverage
13
+ codecov
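After this change, `gradio` is the only requirement with an exact version pin among the lines shown. The sketch below illustrates one way to list pinned entries; `parse_requirement` is a hypothetical helper that handles only bare names and `==` pins, not the full pip requirement syntax, and the input covers only the lines visible in the diff (lines 6-8 of requirements.txt are not shown there).

```python
def parse_requirement(line):
    """Split a simple 'name==version' line into (name, version or None)."""
    # Handles only bare names and exact '==' pins, not the full pip syntax
    name, sep, version = line.strip().partition("==")
    return name, (version if sep else None)


# Lines visible on the new side of the diff
visible = [
    "pandas", "gradio==6.2.0", "torch", "transformers", "pymupdf",
    "huggingface_hub", "spaces", "openai", "coverage", "codecov",
]

# Keep only the entries that carry an exact version
pinned = {name: ver for name, ver in map(parse_requirement, visible) if ver}
print(pinned)  # {'gradio': '6.2.0'}
```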