zehralx/scibert-data-paper


zehralx committed on Feb 12

Commit 2e4161c · verified · 1 Parent(s): b578513

Update README.md

Files changed (1)

README.md +23 -21

README.md CHANGED

@@ -3,29 +3,27 @@ license: apache-2.0
 library_name: transformers
 pipeline_tag: text-classification
 tags:
-  - scibert
-  - data-paper-classification
-  - scholarly-papers
-  - binary-classification
+- scibert
+- data-paper-classification
+- scholarly-papers
+- binary-classification
 base_model: allenai/scibert_scivocab_uncased
-datasets:
-  - custom
 metrics:
-  - accuracy
-  - f1
+- accuracy
+- f1
 model-index:
-  - name: scibert-data-paper
-    results:
-      - task:
-          type: text-classification
-          name: Data Paper Classification
-        metrics:
-          - name: Edge Case Accuracy
-            type: accuracy
-            value: 1.0
-          - name: Mean Confidence
-            type: accuracy
-            value: 0.94
+- name: scibert-data-paper
+  results:
+  - task:
+      type: text-classification
+      name: Data Paper Classification
+    metrics:
+    - name: Edge Case Accuracy
+      type: accuracy
+      value: 1
+    - name: Mean Confidence
+      type: accuracy
+      value: 0.94
 ---
 # SciBERT Data-Paper Classifier
@@ -56,8 +54,12 @@ result = clf("MIMIC-III, a freely accessible critical care database")
 | Output | Binary: `data_paper` (1) / `not_data_paper` (0) |
 | Inference | CPU (no GPU required) |
 ## Training
+[Train Data](https://www.kaggle.com/datasets/zehrakorkusuz/labeling-4k-datasets-with-gemini-flash-2-0)
 Two-phase continued fine-tuning:
 1. **Phase 1**: 5 epochs, learning rate 2e-5
@@ -102,4 +104,4 @@ Concatenated `title + abstract`, truncated to 512 tokens. The model works well w
   year={2026},
   url={https://huggingface.co/zehralx/scibert-data-paper}
 }
-```
+```
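
Review note: apart from dropping `datasets: custom` and adding the Train Data link, the front-matter hunk appears to only re-indent the YAML sequences and rewrite `value: 1.0` as `value: 1`. The sketch below is one way to check that the two layouts parse to the same data. It assumes PyYAML is installed, and `OLD` and `NEW` are hand-copied excerpts of the two sides of the hunk (with `datasets` left out), not the full README front matter.

```python
import yaml  # PyYAML; assumed to be available in the environment

# Excerpt of the front matter before the commit (indented sequences).
OLD = """\
tags:
  - scibert
  - data-paper-classification
metrics:
  - accuracy
  - f1
model-index:
  - name: scibert-data-paper
    results:
      - task:
          type: text-classification
          name: Data Paper Classification
        metrics:
          - name: Edge Case Accuracy
            type: accuracy
            value: 1.0
"""

# Same excerpt after the commit (sequences flush with their parent key).
NEW = """\
tags:
- scibert
- data-paper-classification
metrics:
- accuracy
- f1
model-index:
- name: scibert-data-paper
  results:
  - task:
      type: text-classification
      name: Data Paper Classification
    metrics:
    - name: Edge Case Accuracy
      type: accuracy
      value: 1
"""

old_meta = yaml.safe_load(OLD)
new_meta = yaml.safe_load(NEW)

# Both indentation styles are valid block sequences, so the parsed
# structures should compare equal (1.0 and 1 compare equal in Python).
same = old_meta == new_meta
print("metadata unchanged:", same)
```

If the comparison holds, the only type-level difference in these excerpts is that `value` loads as a float before the commit and as an int after it.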