Spaces: jedick/AI4citations

jedick committed on Jan 6

Commit 6ecddd5 (1 parent: c313a1f)

Adjust app layout

Files changed (3)

app.py +54 -50
examples/Refute/log.csv +1 -1
requirements.txt +3 -1

app.py CHANGED

@@ -85,7 +85,7 @@ with gr.Blocks() as demo:
                                 choices=["BM25S", "DeBERTa", "GPT"],
                                 value="BM25S",
                                 label="Retrieval Method",
-                                info="Keyword search (BM25S) or AI (DeBERTa, GPT)",
                             )
                         top_k = gr.Slider(
                             1,
@@ -106,42 +106,22 @@ with gr.Blocks() as demo:
                         completion_tokens = gr.Number(
                             label="Completion tokens", visible=False
                         )
-            with gr.Accordion("More info", open=True):
-                gr.Markdown(
-                    """
-                #### *MLE capstone project*
-                - <i class="fa-brands fa-github"></i> [jedick/MLE-capstone-project](https://github.com/jedick/MLE-capstone-project) (project repo)
-                - <i class="fa-brands fa-github"></i> [jedick/AI4citations](https://github.com/jedick/AI4citations) (app repo)
-                #### *Text classification*
-                - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint](https://huggingface.co/jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint) (fine-tuned)
-                - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli](https://huggingface.co/MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli) (base)
-                #### *Evidence retrieval*
-                - <i class="fa-brands fa-github"></i> [xhluca/bm25s](https://github.com/xhluca/bm25s) (BM25S)
-                - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [deepset/deberta-v3-large-squad2](https://huggingface.co/deepset/deberta-v3-large-squad2) (DeBERTa)
-                - <img src="https://upload.wikimedia.org/wikipedia/commons/4/4d/OpenAI_Logo.svg" style="height: 1.2em; display: inline-block;"> [gpt-4o-mini-2024-07-18](https://platform.openai.com/docs/pricing) (GPT)
-                #### *Datasets for fine-tuning*
-                - <i class="fa-brands fa-github"></i> [allenai/SciFact](https://github.com/allenai/scifact) (SciFact)
-                - <i class="fa-brands fa-github"></i> [ScienceNLP-Lab/Citation-Integrity](https://github.com/ScienceNLP-Lab/Citation-Integrity) (CitInt)
-                #### *Other sources*
-                - <img src="https://plos.org/wp-content/uploads/2020/01/logo-color-blue.svg" style="height: 1.4em; display: inline-block;"> [Medicine](https://doi.org/10.1371/journal.pmed.0030197), <i class="fa-brands fa-wikipedia-w"></i> [CRISPR](https://en.wikipedia.org/wiki/CRISPR) (evidence retrieval examples)
-                - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [nyu-mll/multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli/viewer/default/train?row=37&views%5B%5D=train) (MNLI example)
-                - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [NoCrypt/miku](https://huggingface.co/spaces/NoCrypt/miku) (theme)
-                """
-                )
         with gr.Column(scale=2):
-            prediction = gr.Label(label="Prediction")
-            with gr.Accordion("Feedback"):
-                gr.Markdown(
-                    "*Provide the correct label to help improve this app*<br>**NOTE:** The claim and evidence will also be saved"
-                ),
-                with gr.Row():
-                    flag_support = gr.Button("Support")
-                    flag_nei = gr.Button("NEI")
-                    flag_refute = gr.Button("Refute")
-                gr.Markdown(
-                    "Feedback is uploaded every 5 minutes to [AI4citations-feedback](https://huggingface.co/datasets/jedick/AI4citations-feedback)"
-                ),
             with gr.Accordion("Examples"):
                 gr.Markdown("*Examples are run when clicked*"),
                 with gr.Row():
@@ -177,19 +157,43 @@ with gr.Blocks() as demo:
                         "label"
                     ].tolist(),
                 )
-            # Create dropdown menu to select the model
-            model = gr.Dropdown(
-                choices=[
-                    # TODO: For bert-base-uncased, how can we set num_labels = 2 in HF pipeline?
-                    # (num_labels is available in AutoModelForSequenceClassification.from_pretrained)
-                    # "bert-base-uncased",
-                    "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
-                    "jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint",
-                ],
-                value=MODEL_NAME,
-                label="Model",
-                info="Text classification model used for claim verification",
-            )
     # Functions
@@ -258,7 +262,7 @@ with gr.Blocks() as demo:
             pdf_file = f"examples/retrieval/{pdf_file}"
         return pdf_file, claim
-    @spaces.GPU()
     def _retrieve_with_deberta(pdf_file, claim, top_k):
         """
         Retrieve evidence using DeBERTa
@@ -481,7 +485,7 @@ with gr.Blocks() as demo:
 if __name__ == "__main__":
     # allowed_paths is needed to upload PDFs from specific example directory
-    allowed_paths=[f"{os.getcwd()}/examples/retrieval"]
     # Setup theme without background image
     theme = gr.Theme.from_hub("NoCrypt/miku")

                                 choices=["BM25S", "DeBERTa", "GPT"],
                                 value="BM25S",
                                 label="Retrieval Method",
+                                info="Lexical (BM25S) or semantic (DeBERTa, GPT)",
                             )
                         top_k = gr.Slider(
                             1,
                         completion_tokens = gr.Number(
                             label="Completion tokens", visible=False
                         )
+                    prediction = gr.Label(label="Prediction")
         with gr.Column(scale=2):
+            # Create dropdown menu to select the model
+            model = gr.Dropdown(
+                choices=[
+                    # TODO: For bert-base-uncased, how can we set num_labels = 2 in HF pipeline?
+                    # (num_labels is available in AutoModelForSequenceClassification.from_pretrained)
+                    # "bert-base-uncased",
+                    "MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli",
+                    "jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint",
+                ],
+                value=MODEL_NAME,
+                label="Model",
+                info="Text classification model used for claim verification",
+            )
             with gr.Accordion("Examples"):
                 gr.Markdown("*Examples are run when clicked*"),
                 with gr.Row():
                         "label"
                     ].tolist(),
                 )
+            with gr.Accordion("Feedback"):
+                gr.Markdown(
+                    "*Provide the correct label to help improve this app*<br>**NOTE:** The claim and evidence will also be saved"
+                ),
+                with gr.Row():
+                    flag_support = gr.Button("Support")
+                    flag_nei = gr.Button("NEI")
+                    flag_refute = gr.Button("Refute")
+                gr.Markdown(
+                    "Feedback is uploaded every 5 minutes to [AI4citations-feedback](https://huggingface.co/datasets/jedick/AI4citations-feedback)"
+                ),
+            with gr.Accordion("About this app", open=True):
+                gr.Markdown(
+                    """
+                - <i class="fa-brands fa-github"></i> [jedick/AI4citations](https://github.com/jedick/AI4citations) (app repo)
+                - <i class="fa-brands fa-github"></i> [jedick/MLE-capstone-project](https://github.com/jedick/MLE-capstone-project) (project repo)
+                """
+                )
+            with gr.Accordion("More info", open=False):
+                gr.Markdown(
+                    """
+                #### *Text classification*
+                - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint](https://huggingface.co/jedick/DeBERTa-v3-base-mnli-fever-anli-scifact-citint) (fine-tuned)
+                - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli](https://huggingface.co/MoritzLaurer/DeBERTa-v3-base-mnli-fever-anli) (base)
+                #### *Evidence retrieval*
+                - <i class="fa-brands fa-github"></i> [xhluca/bm25s](https://github.com/xhluca/bm25s) (BM25S)
+                - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [deepset/deberta-v3-large-squad2](https://huggingface.co/deepset/deberta-v3-large-squad2) (DeBERTa)
+                - <img src="https://upload.wikimedia.org/wikipedia/commons/4/4d/OpenAI_Logo.svg" style="height: 1.2em; display: inline-block;"> [gpt-4o-mini-2024-07-18](https://platform.openai.com/docs/pricing) (GPT)
+                #### *Datasets for fine-tuning*
+                - <i class="fa-brands fa-github"></i> [allenai/SciFact](https://github.com/allenai/scifact) (SciFact)
+                - <i class="fa-brands fa-github"></i> [ScienceNLP-Lab/Citation-Integrity](https://github.com/ScienceNLP-Lab/Citation-Integrity) (CitInt)
+                #### *Other sources*
+                - <img src="https://plos.org/wp-content/uploads/2020/01/logo-color-blue.svg" style="height: 1.4em; display: inline-block;"> [Medicine](https://doi.org/10.1371/journal.pmed.0030197), <i class="fa-brands fa-wikipedia-w"></i> [CRISPR](https://en.wikipedia.org/wiki/CRISPR) (evidence retrieval examples)
+                - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [nyu-mll/multi_nli](https://huggingface.co/datasets/nyu-mll/multi_nli/viewer/default/train?row=37&views%5B%5D=train) (MNLI example)
+                - <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.svg" style="height: 1.2em; display: inline-block;"> [NoCrypt/miku](https://huggingface.co/spaces/NoCrypt/miku) (theme)
+                """
+                )
     # Functions
             pdf_file = f"examples/retrieval/{pdf_file}"
         return pdf_file, claim
+    @spaces.GPU(duration=30)
     def _retrieve_with_deberta(pdf_file, claim, top_k):
         """
         Retrieve evidence using DeBERTa
 if __name__ == "__main__":
     # allowed_paths is needed to upload PDFs from specific example directory
+    allowed_paths = [f"{os.getcwd()}/examples/retrieval"]
     # Setup theme without background image
     theme = gr.Theme.from_hub("NoCrypt/miku")

examples/Refute/log.csv CHANGED

@@ -1,3 +1,3 @@
 claim,evidence,label
-"Poirot was now back and I was sorry that he would take over what I now considered my own investigation.","Poirot, I exclaimed, with relief, and seizing him by both hands, I dragged him into the room.",MNLI
 "1 in 5 million in UK have abnormal PrP positivity.","OBJECTIVES To carry out a further survey of archived appendix samples to understand better the differences between existing estimates of the prevalence of subclinical infection with prions after the bovine spongiform encephalopathy epizootic and to see whether a broader birth cohort was affected, and to understand better the implications for the management of blood and blood products and for the handling of surgical instruments. DESIGN Irreversibly unlinked and anonymised large scale survey of archived appendix samples. SETTING Archived appendix samples from the pathology departments of 41 UK hospitals participating in the earlier survey, and additional hospitals in regions with lower levels of participation in that survey. SAMPLE 32,441 archived appendix samples fixed in formalin and embedded in paraffin and tested for the presence of abnormal prion protein (PrP). RESULTS Of the 32,441 appendix samples 16 were positive for abnormal PrP, indicating an overall prevalence of 493 per million population (95% confidence interval 282 to 801 per million). The prevalence in those born in 1941-60 (733 per million, 269 to 1596 per million) did not differ significantly from those born between 1961 and 1985 (412 per million, 198 to 758 per million) and was similar in both sexes and across the three broad geographical areas sampled. Genetic testing of the positive specimens for the genotype at PRNP codon 129 revealed a high proportion that were valine homozygous compared with the frequency in the normal population, and in stark contrast with confirmed clinical cases of vCJD, all of which were methionine homozygous at PRNP codon 129. CONCLUSIONS This study corroborates previous studies and suggests a high prevalence of infection with abnormal PrP, indicating vCJD carrier status in the population compared with the 177 vCJD cases to date. These findings have important implications for the management of blood and blood products and for the handling of surgical instruments.",SciFact

 claim,evidence,label
 "1 in 5 million in UK have abnormal PrP positivity.","OBJECTIVES To carry out a further survey of archived appendix samples to understand better the differences between existing estimates of the prevalence of subclinical infection with prions after the bovine spongiform encephalopathy epizootic and to see whether a broader birth cohort was affected, and to understand better the implications for the management of blood and blood products and for the handling of surgical instruments. DESIGN Irreversibly unlinked and anonymised large scale survey of archived appendix samples. SETTING Archived appendix samples from the pathology departments of 41 UK hospitals participating in the earlier survey, and additional hospitals in regions with lower levels of participation in that survey. SAMPLE 32,441 archived appendix samples fixed in formalin and embedded in paraffin and tested for the presence of abnormal prion protein (PrP). RESULTS Of the 32,441 appendix samples 16 were positive for abnormal PrP, indicating an overall prevalence of 493 per million population (95% confidence interval 282 to 801 per million). The prevalence in those born in 1941-60 (733 per million, 269 to 1596 per million) did not differ significantly from those born between 1961 and 1985 (412 per million, 198 to 758 per million) and was similar in both sexes and across the three broad geographical areas sampled. Genetic testing of the positive specimens for the genotype at PRNP codon 129 revealed a high proportion that were valine homozygous compared with the frequency in the normal population, and in stark contrast with confirmed clinical cases of vCJD, all of which were methionine homozygous at PRNP codon 129. CONCLUSIONS This study corroborates previous studies and suggests a high prevalence of infection with abnormal PrP, indicating vCJD carrier status in the population compared with the 177 vCJD cases to date. These findings have important implications for the management of blood and blood products and for the handling of surgical instruments.",SciFact
+"Poirot was now back and I was sorry that he would take over what I now considered my own investigation.","Poirot, I exclaimed, with relief, and seizing him by both hands, I dragged him into the room.",MNLI
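
The hunk above moves the MNLI row from the first data row of examples/Refute/log.csv to the last. The following is a minimal sketch of that reordering using only the standard library. It assumes the app lists examples in file order, which the `["label"].tolist()` call in app.py suggests but this commit does not confirm. The row contents are shortened, hypothetical stand-ins, and `move_first_row_to_end` and `labels_in_order` are illustrative helper names, not functions from the app.

```python
import csv
import io

# Hypothetical, shortened stand-ins for the two data rows in log.csv
# (the real claim and evidence strings are much longer).
OLD_ORDER = (
    "claim,evidence,label\n"
    '"Poirot claim (shortened)","Poirot evidence (shortened)",MNLI\n'
    '"PrP claim (shortened)","PrP evidence (shortened)",SciFact\n'
)


def move_first_row_to_end(csv_text):
    """Return csv_text with its first data row moved after all other rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    reordered = data[1:] + data[:1]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(reordered)
    return out.getvalue()


def labels_in_order(csv_text):
    """Return the 'label' column in file order."""
    return [row["label"] for row in csv.DictReader(io.StringIO(csv_text))]


NEW_ORDER = move_first_row_to_end(OLD_ORDER)
print(labels_in_order(OLD_ORDER))
print(labels_in_order(NEW_ORDER))
```

If the assumption about file order holds, the SciFact example would appear before the MNLI example after this commit.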

requirements.txt CHANGED

@@ -1,5 +1,5 @@
 pandas
-gradio
 torch
 transformers
 pymupdf
@@ -9,3 +9,5 @@ bm25s
 huggingface_hub
 spaces
 openai

 pandas
+gradio==6.2.0
 torch
 transformers
 pymupdf
 huggingface_hub
 spaces
 openai
+coverage
+codecov
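
The requirements.txt change pins gradio to an exact version and leaves the other visible entries unpinned. Below is a small sketch of how one might list the exact pins in such a file. It assumes only simple `name==version` lines (no extras, environment markers, or other specifiers). `VISIBLE_REQUIREMENTS` contains only the lines shown on the new side of the diff, with the elided lines between hunks left out, and `pinned_versions` is a hypothetical helper.

```python
# Only the lines visible on the new side of the diff; lines elided
# between the two hunks are intentionally omitted.
VISIBLE_REQUIREMENTS = """\
pandas
gradio==6.2.0
torch
transformers
pymupdf
huggingface_hub
spaces
openai
coverage
codecov
"""


def pinned_versions(requirements_text):
    """Map package name -> version for lines of the form name==version."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


print(pinned_versions(VISIBLE_REQUIREMENTS))
```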