JustinDuc committed on
Commit 6f92cd0 · verified · 1 Parent(s): 91a2e1f

Update README.md

Files changed (1)
  1. README.md +9 -6
README.md CHANGED
@@ -56,12 +56,15 @@ It avoids the quadratic cost of full self-attention by summarizing per-speaker m
 
 ## 📈 Performance (on SODA, Masked Language Modeling)
 
-| Model Variant             | Avg MLM Accuracy | Best MLM Accuracy |
-|---------------------------|------------------|-------------------|
-| BERT-base (frozen)        | 33.45            | 45.89             |
-| + 1-layer Transformer     | 68.20            | 76.69             |
-| + 2-layer Transformer     | 71.81            | 79.54             |
-| **+ SAUTE (Ours)**        | **72.05**        | **80.40**         |
+
+| Model                      | Avg MLM Acc | Best MLM Acc |
+|----------------------------|-------------|--------------|
+| BERT-base (frozen)         | 33.45       | 45.89        |
+| + 1-layer Transformer      | 68.20       | 76.69        |
+| + 2-layer Transformer      | 71.81       | 79.54        |
+| **+ 1-layer SAUTE (Ours)** | **72.05**   | **80.40**    |
+| + 3-layer Transformer      | 73.50       | 80.84        |
+| **+ 3-layer SAUTE (Ours)** | **75.65**   | **85.55**    |
 
 > SAUTE achieves the best accuracy using fewer parameters than multi-layer transformers.
 
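The hunk header above notes that SAUTE avoids the quadratic cost of full self-attention by summarizing per-speaker memory. The sketch below illustrates that general idea only; it is a hypothetical toy, not the repository's actual implementation, and the mean-pooling and single-head attention details are assumptions. Each speaker's token embeddings are pooled into one memory vector, and each token attends over the S speaker memories instead of all T tokens, giving O(T·S) instead of O(T²) attention cost.

```python
import numpy as np

def speaker_summary_attention(x, speaker_ids):
    """x: (T, d) token embeddings; speaker_ids: (T,) integer speaker index.

    Returns (T, d) outputs where each token attends over per-speaker
    memory summaries rather than over every token.
    """
    T, d = x.shape
    speakers = np.unique(speaker_ids)
    # Summarize each speaker's tokens into one memory vector (mean pool).
    mem = np.stack([x[speaker_ids == s].mean(axis=0) for s in speakers])  # (S, d)
    # Scaled dot-product attention over S speaker memories, not T tokens.
    scores = x @ mem.T / np.sqrt(d)                 # (T, S)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over speakers
    return weights @ mem                            # (T, d)

rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 4))            # 6 tokens, 4-dim embeddings
speakers = np.array([0, 0, 1, 1, 2, 2])     # 3 speakers, 2 tokens each
print(speaker_summary_attention(tokens, speakers).shape)  # (6, 4)
```

With S speakers the attention matrix is T×S rather than T×T, which is where the claimed savings over full self-attention come from when S ≪ T.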