sihuapeng committed · Commit 9ae6630 · verified · 1 Parent(s): b286b0e

Update README.md

  1. README.md +3 -1
README.md CHANGED
@@ -18,7 +18,9 @@ Train Accuracy: 0.9893
  Validation Loss: 0.0155
  Validation Accuracy: 0.9702
  Epoch: 20
-
+ ## The dataset for training PPPSL-ESM2
+ The full dataset contains 11,970 protein sequences: Cellwall (87), Cytoplasmic (6,905), Cytoplasmic Membrane (2,567), Extracellular (1,085), Outer Membrane (758), and Periplasmic (568).
+ The sample sizes across these six categories are highly imbalanced, ranging from 87 to 6,905 sequences, which poses a significant challenge for classification.
  ## Example
  ```
  from transformers import AutoTokenizer, AutoModelForSequenceClassification