picket-cliff committed on
Commit
febc162
·
verified ·
1 Parent(s): 58604c5

Update README.md

Files changed (1)
  1. README.md +6 -0
README.md CHANGED
@@ -54,7 +54,9 @@ Deep learning models cannot process raw text; they require numerical tensors. We
54
  2. Special Tokens: The tokenizer automatically prepends the [CLS] (Classification) token to the start of the sequence and the [SEP] (Separator) token at the end. The final hidden state corresponding to the [CLS] token is what the model uses for the binary classification decision.
55
 
56
  3. Truncation and Padding: Transformer models require fixed-size input matrices for batch processing. Based on our EDA length distribution, we set max_length = 128.
 
57
  o Sentences longer than 128 tokens were truncated.
 
58
  o Sentences shorter than 128 tokens were padded with the [PAD] token (ID 0).
59
 
60
  4. Attention Masks: To prevent the model from performing Self-Attention on meaningless padding tokens, the tokenizer generates an attention_mask (an array of 1s for real words and 0s for padding).
@@ -78,9 +80,13 @@ Accuracy, f1 score (macro and weighted)
78
  ### Results
79
 
80
  When evaluated on an 80-20 split we obtained:
 
81
  • Accuracy: 99.10%
 
82
  • Macro Average F1-Score: 0.98
 
83
  • Weighted Average F1-Score: 0.99
 
84
  Meanwhile the dummy achieved 86.6% accuracy.
85
 
86
  #### Summary
 
54
  2. Special Tokens: The tokenizer automatically prepends the [CLS] (Classification) token to the start of the sequence and the [SEP] (Separator) token at the end. The final hidden state corresponding to the [CLS] token is what the model uses for the binary classification decision.
55
 
56
  3. Truncation and Padding: Transformer models require fixed-size input matrices for batch processing. Based on our EDA length distribution, we set max_length = 128.
57
+
58
  o Sentences longer than 128 tokens were truncated.
59
+
60
  o Sentences shorter than 128 tokens were padded with the [PAD] token (ID 0).
61
 
62
  4. Attention Masks: To prevent the model from performing Self-Attention on meaningless padding tokens, the tokenizer generates an attention_mask (an array of 1s for real words and 0s for padding).
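
The steps in items 2 to 4 can be sketched in plain Python without loading a real tokenizer. This is a minimal illustration, not the project's actual preprocessing code: the `encode` helper is hypothetical, the [CLS] and [SEP] IDs (101 and 102) are assumed from the `bert-base-uncased` vocabulary, and the input token IDs are made up. Only `max_length = 128` and the [PAD] ID of 0 come from the README.

```python
from typing import Dict, List

PAD_ID = 0        # stated in the README
CLS_ID = 101      # assumption: value used by bert-base-uncased
SEP_ID = 102      # assumption: value used by bert-base-uncased
MAX_LENGTH = 128  # stated in the README


def encode(token_ids: List[int], max_length: int = MAX_LENGTH) -> Dict[str, List[int]]:
    """Hypothetical helper: add special tokens, truncate, pad, and build the mask."""
    # Reserve two positions for [CLS] and [SEP], then truncate the body.
    body = token_ids[: max_length - 2]
    input_ids = [CLS_ID] + body + [SEP_ID]

    # Real tokens (including [CLS] and [SEP]) get 1 in the attention mask.
    attention_mask = [1] * len(input_ids)

    # Pad both lists up to the fixed length; padding positions get mask 0.
    n_pad = max_length - len(input_ids)
    input_ids = input_ids + [PAD_ID] * n_pad
    attention_mask = attention_mask + [0] * n_pad
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Made-up token IDs for a short and a long sentence.
short = encode([7592, 2088])
long = encode(list(range(1000, 1300)))
```

Both results have exactly 128 positions: the short example is mostly padding with a mask of four 1s followed by 0s, while the long example is truncated and has a mask of all 1s.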
 
80
  ### Results
81
 
82
  When evaluated on an 80-20 split we obtained:
83
+
84
  • Accuracy: 99.10%
85
+
86
  • Macro Average F1-Score: 0.98
87
+
88
  • Weighted Average F1-Score: 0.99
89
+
90
  Meanwhile the dummy achieved 86.6% accuracy.
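
To make the relationship between these metrics concrete, here is a small sketch that computes accuracy, macro F1, weighted F1, and a baseline accuracy from scratch. It assumes the "dummy" is a majority-class predictor, which the README does not confirm, and the labels below are toy values, not the project's data.

```python
from collections import Counter
from typing import List


def f1_for_label(y_true: List[int], y_pred: List[int], label: int) -> float:
    """F1 for one class, computed as 2TP / (2TP + FP + FN)."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: List[int], y_pred: List[int]) -> float:
    # Unweighted mean of per-class F1: every class counts equally.
    labels = sorted(set(y_true))
    return sum(f1_for_label(y_true, y_pred, l) for l in labels) / len(labels)


def weighted_f1(y_true: List[int], y_pred: List[int]) -> float:
    # Per-class F1 weighted by how often each class occurs in y_true.
    support = Counter(y_true)
    return sum(f1_for_label(y_true, y_pred, l) * n for l, n in support.items()) / len(y_true)


def majority_baseline_accuracy(y_true: List[int]) -> float:
    # Assumed "dummy": always predict the most frequent class.
    return Counter(y_true).most_common(1)[0][1] / len(y_true)


# Toy, imbalanced labels (illustrative only).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 7 + [1] + [1, 0]

acc = accuracy(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
baseline = majority_baseline_accuracy(y_true)
```

On imbalanced data, a majority-class baseline can already score high accuracy, which is why the macro F1 is reported alongside accuracy: it drops when the minority class is predicted poorly.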
91
 
92
  #### Summary