Commit 2e4161c (verified) by zehralx · 1 Parent(s): b578513

Update README.md

Files changed (1)
  1. README.md +23 -21
README.md CHANGED
@@ -3,29 +3,27 @@ license: apache-2.0
  library_name: transformers
  pipeline_tag: text-classification
  tags:
- - scibert
- - data-paper-classification
- - scholarly-papers
- - binary-classification
+ - scibert
+ - data-paper-classification
+ - scholarly-papers
+ - binary-classification
  base_model: allenai/scibert_scivocab_uncased
- datasets:
- - custom
  metrics:
- - accuracy
- - f1
+ - accuracy
+ - f1
  model-index:
- - name: scibert-data-paper
- results:
- - task:
- type: text-classification
- name: Data Paper Classification
- metrics:
- - name: Edge Case Accuracy
- type: accuracy
- value: 1.0
- - name: Mean Confidence
- type: accuracy
- value: 0.94
+ - name: scibert-data-paper
+ results:
+ - task:
+ type: text-classification
+ name: Data Paper Classification
+ metrics:
+ - name: Edge Case Accuracy
+ type: accuracy
+ value: 1
+ - name: Mean Confidence
+ type: accuracy
+ value: 0.94
  ---

  # SciBERT Data-Paper Classifier
@@ -56,8 +54,12 @@ result = clf("MIMIC-III, a freely accessible critical care database")
  | Output | Binary: `data_paper` (1) / `not_data_paper` (0) |
  | Inference | CPU (no GPU required) |

+
+
  ## Training

+ [Train Data](https://www.kaggle.com/datasets/zehrakorkusuz/labeling-4k-datasets-with-gemini-flash-2-0)
+
  Two-phase continued fine-tuning:

  1. **Phase 1**: 5 epochs, learning rate 2e-5
@@ -102,4 +104,4 @@ Concatenated `title + abstract`, truncated to 512 tokens. The model works well w
  year={2026},
  url={https://huggingface.co/zehralx/scibert-data-paper}
  }
- ```
+ ```
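
The hunk context in this diff describes the model input as the concatenated `title + abstract`, truncated to 512 tokens, and the output as binary `data_paper` (1) / `not_data_paper` (0), with inference on CPU. The following is a minimal sketch of that flow, not the author's code. The helper names `build_input` and `classify`, the single-space separator, and the `ID2LABEL` mapping (taken from the README table, not from the model config) are assumptions; the model id comes from the citation URL.

```python
from typing import Dict

# Label ids as listed in the README table; the label strings the model
# actually returns depend on its config and may differ (assumption).
ID2LABEL: Dict[int, str] = {0: "not_data_paper", 1: "data_paper"}

MAX_TOKENS = 512  # truncation length stated in the README


def build_input(title: str, abstract: str) -> str:
    """Concatenate title and abstract into a single input string.

    Assumption: a single space separates the two parts; the README only
    says `title + abstract`. Empty or whitespace-only parts are skipped.
    """
    parts = (title, abstract)
    return " ".join(p.strip() for p in parts if p and p.strip())


def classify(title: str, abstract: str,
             model_id: str = "zehralx/scibert-data-paper"):
    """Hypothetical wrapper: classify one paper on CPU.

    Needs the `transformers` package and network access to fetch the model.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    clf = pipeline("text-classification", model=model_id, device=-1)  # -1 = CPU
    # truncation / max_length are forwarded to the tokenizer.
    return clf(build_input(title, abstract),
               truncation=True, max_length=MAX_TOKENS)
```

Calling `classify("MIMIC-III", "a freely accessible critical care database")` would return the pipeline's usual list of `{"label": ..., "score": ...}` dictionaries; the exact label strings should be checked against the model's own configuration.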