Spaces: yancan/CiteScan
aivolcano committed on Jan 30

Commit ec7bc6b, 1 parent: adeb147

update the md

Files changed (6)

.gitignore CHANGED

@@ -39,7 +39,6 @@ env/
 .LSOverride
 # Project Specific Outputs
-*.txt
 *.md
 !README.md
 *_only_used_entry.bib


Untitled DELETED

@@ -1 +0,0 @@
- citescan.yaml

app.py CHANGED

@@ -576,7 +576,7 @@ with gr.Blocks(title="CiteScan - Check References, Confirm Truth.", theme=gr.the
     btn_total.click(fn=filter_to_total, inputs=[result_state], outputs=[output_html])
     gr.Markdown("""
-*False positive cases* occur for CiteScan:
+*Case Study for False positive* in CiteScan:
 1.  **Authors Mismatch**:
     - *Reason*: Different databases deal with a longer list of  authors with different strategies, like truncation.
@@ -588,7 +588,7 @@ with gr.Blocks(title="CiteScan - Check References, Confirm Truth.", theme=gr.the
 3.  **Year GAP (±1 Year)**:
     - *Reason*: Delay between preprint (arXiv) and final version publication
-    - *Action*: Verify which version you intend to cite, We recommend you to cite the version from the official press website. Lower pre-print version bib will make your submission more confidence.
+    - *Action*: Verify which version you intend to cite, We recommend you to cite the version from the official press website. Less number of pre-print version bibs will make your submission more convincing.
 4.  **Non-academic Sources**:
     - *Reason*: Blogs, and APIs are not indexed in academic databases.

config.yaml CHANGED

@@ -1,7 +1,3 @@
-files:
-  bib: "paper.bib"
-  output_dir: "citescan_output"
 bibliography:
   check_metadata: true
   check_usage: true
@@ -23,11 +19,6 @@ submission:
   citation_quality: true
   anonymization: true
-llm:
-  backend: "deepseeek"
-  model: "deepseek-chat"
-  api_key: "sk-d7c87a7386d94879a80282cee7bd3f45"
 output:
   quiet: false
   minimal_verified: false


ms_deploy.json ADDED

+{
+    "sdk_type": "gradio",
+    "sdk_version": "6.2.0",
+    "resource_configuration": "platform/2v-cpu-16g-mem",
+    "base_image": "ubuntu22.04-py311-torch2.3.1-modelscope1.31.0",
+    "environment_variables": [
+      {"name": "MODEL_NAME", "value": "deepseek-chat"},
+      {"name": "API_KEY", "value": "sk-d7c87a7386d94879a80282cee7bd3f45"}
+    ]
+  }
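
The `environment_variables` entries in `ms_deploy.json` are presumably consumed by the application at startup. A minimal sketch of that, assuming the app reads `MODEL_NAME` and `API_KEY` from the process environment; the `LLMSettings` class and `load_llm_settings` helper are hypothetical and do not appear in this commit:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical container for the two values declared in ms_deploy.json."""
    model_name: str
    api_key: str


def load_llm_settings(env: Optional[Mapping[str, str]] = None) -> LLMSettings:
    """Read MODEL_NAME and API_KEY from the environment (assumed behaviour)."""
    env = os.environ if env is None else env
    api_key = env.get("API_KEY", "")
    if not api_key:
        # Fail early rather than sending unauthenticated requests later.
        raise RuntimeError("API_KEY is not set")
    # Fall back to the model name used elsewhere in this commit.
    return LLMSettings(model_name=env.get("MODEL_NAME", "deepseek-chat"), api_key=api_key)
```

Passing a plain dict as `env` lets the helper be exercised without touching the real environment.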

src/analyzers/metadata_comparator.py CHANGED

@@ -56,7 +56,7 @@ class MetadataComparator:
     # Thresholds for matching
     TITLE_THRESHOLD = 0.99
-    AUTHOR_THRESHOLD = 0.9
+    AUTHOR_THRESHOLD = 0.65
     def __init__(self):
         self.normalizer = TextNormalizer
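
To illustrate how a lower `AUTHOR_THRESHOLD` interacts with the author-truncation case described in the `app.py` help text, here is a minimal sketch. It assumes a plain set-overlap (Jaccard) score over normalized names. The scoring actually used by `MetadataComparator` is not shown in this diff, so `author_similarity`, `normalize`, and the sample names are hypothetical.

```python
def normalize(name: str) -> str:
    """Lowercase, drop periods, and collapse whitespace (assumed normalization)."""
    return " ".join(name.lower().replace(".", " ").split())


def author_similarity(bib_authors, db_authors) -> float:
    """Jaccard overlap between two author lists; an assumption, not CiteScan's metric."""
    a = {normalize(n) for n in bib_authors}
    b = {normalize(n) for n in db_authors}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# The .bib entry lists four authors; the database record is truncated to three.
bib = ["Alice Zhang", "Bob Li", "Carol Wu", "Dan Chen"]
db = ["Alice Zhang", "Bob Li", "Carol Wu"]

score = author_similarity(bib, db)  # 3 shared names / 4 distinct names
passes_old = score >= 0.9    # old threshold
passes_new = score >= 0.65   # new threshold
print(score, passes_old, passes_new)
```

Under this assumed metric the truncated record scores 0.75, so it is flagged as a mismatch at 0.9 but accepted at 0.65.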