Commit 632cc84
Parent(s): c5f1fae

Model card formatting.

README.md CHANGED

@@ -19,6 +19,10 @@ tags:
 - spoken language understanding
 ---
 
+**Repository:** https://github.com/declare-lab/segue
+
+**Paper:**
+
 SEGUE is a pre-training approach for sequence-level spoken language understanding (SLU) tasks.
 We use knowledge distillation on a parallel speech-text corpus (e.g. an ASR corpus) to distil
 language understanding knowledge from a textual sentence embedder to a pre-trained speech encoder.

@@ -26,11 +30,6 @@ SEGUE applied to Wav2Vec 2.0 improves performance for many SLU tasks, including
 intent classification / slot-filling, spoken sentiment analysis, and spoken emotion classification.
 These improvements were observed in both fine-tuned and non-fine-tuned settings, as well as few-shot settings.
 
-## Model Details
-
-- **Repository:** https://github.com/declare-lab/segue
-- **Paper:**
-
 ## How to Get Started with the Model
 
 To use this model checkpoint, you need to use the model classes on [our GitHub repository](https://github.com/declare-lab/segue).

@@ -81,12 +80,6 @@ Please refer to the paper for full results.
 |w2v 2.0|54.0|
 |SEGUE|**77.9**|
 
-#### Few-shot
-
-Plots of k-shot per class accuracy against k:
-
-<img src='readme/minds-14.svg' style='width: 50%;'>
-
 ### MELD (sentiment and emotion classification)
 
 #### Fine-tuning

@@ -106,13 +99,6 @@ Plots of k-shot per class accuracy against k:
 |w2v 2.0|45.0±0.7|34.3±1.2|
 |SEGUE|**45.8±0.1**|**35.7±0.3**|
 
-#### Few-shot
-
-Plots of MELD k-shot per class F1 score against k - sentiment and emotion respectively:
-
-<img src='readme/meld-sent.svg' style='display: inline; width: 40%;'>
-<img src='readme/meld-emo.svg' style='display: inline; width: 40%;'>
-
 ## Limitations
 
 In the paper, we hypothesized that SEGUE may perform worse on tasks that rely less on
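The distillation setup the card describes (transferring knowledge from a textual sentence embedder to a speech encoder over a parallel speech-text corpus) can be sketched with toy NumPy stand-ins. This is an illustrative sketch, not the official SEGUE code: the mean-pooling of frame states and the MSE objective are assumptions for illustration, and the arrays stand in for real encoder outputs.

```python
# Illustrative sketch of embedding distillation (NOT the official SEGUE code).
# Toy arrays stand in for the speech encoder's frame states (student) and the
# frozen sentence embedder's output (teacher); mean-pooling and MSE are assumed.
import numpy as np

def mean_pool(frame_states: np.ndarray) -> np.ndarray:
    """Pool per-frame speech-encoder states into one utterance-level embedding."""
    return frame_states.mean(axis=0)

def distillation_loss(student_emb: np.ndarray, teacher_emb: np.ndarray) -> float:
    """MSE between the student (speech) and teacher (text) embeddings."""
    return float(np.mean((student_emb - teacher_emb) ** 2))

rng = np.random.default_rng(0)
frames = rng.normal(size=(50, 768))   # 50 frames of speech-encoder hidden states
teacher = rng.normal(size=(768,))     # frozen sentence-embedder output for the transcript

student = mean_pool(frames)
loss = distillation_loss(student, teacher)  # minimized w.r.t. the speech encoder
```

During pre-training, only the speech encoder would be updated to drive this loss down, so its utterance embeddings move toward the text embedder's; the teacher stays frozen.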