Commit ·
8bfdbc2
1
Parent(s): 8effa68
Update README.md
README.md
CHANGED
|
@@ -12,12 +12,6 @@ Training is leveraged using four datasets: [VisualDialogues](https://visualdialo
|
|
| 12 |
<img alt="Baseline model architecture" src="./baseline.png" width="100%">
|
| 13 |
</p>
|
| 14 |
|
| 15 |
-
The adapter methodology can be summarized as follows:
|
| 16 |
-
|
| 17 |
-
1. The data $y$ from other modalities is processed by a pretrained encoder, yielding an embedding $v_\phi(y) \in \mathbb{R}^{m}$.
|
| 18 |
-
2. This embedding is subsequently transformed into $k$ vectors (weights provided here correspond to $k=32$) within the language model’s vector space of dimensionality $d$. This transformation is facilitated by a linear mapping, articulated as $v_\phi(y)^\top \mathbf{W}$, where $\mathbf{W} \in \mathbb{R}^{m \times kd}$.
|
| 19 |
-
3. Tokens marking the beginning and end of data from a specific modality use trainable embeddings of dimensionality $d$. These tokens are included in the model’s dictionary implicitly, so they do not need to be added to the tokenizer explicitly.
|
| 20 |
-
|
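The three steps removed in this hunk can be illustrated with a minimal, dependency-free sketch. It is not the repository's implementation: the function names `project_to_prefix` and `build_modality_segment`, the toy sizes, the zero-initialised start/end embeddings, and the choice to split the $kd$-dimensional output into $k$ consecutive chunks of length $d$ are all assumptions made for illustration.

```python
import random

def project_to_prefix(v, W, k, d):
    """Map an encoder embedding v (length m) to k vectors of size d via v^T W."""
    m = len(v)
    assert len(W) == m and all(len(row) == k * d for row in W)
    # Row vector times matrix: flat[j] = sum_i v[i] * W[i][j], giving k*d values.
    flat = [sum(v[i] * W[i][j] for i in range(m)) for j in range(k * d)]
    # Assumed layout: split the flat vector into k consecutive chunks of length d.
    return [flat[t * d:(t + 1) * d] for t in range(k)]

def build_modality_segment(v, W, start_emb, end_emb, k, d):
    """Wrap the k projected vectors with the start/end modality embeddings."""
    return [start_emb] + project_to_prefix(v, W, k, d) + [end_emb]

# Toy sizes for illustration only (the README states k=32 for the released weights).
m, k, d = 4, 3, 2
rng = random.Random(0)
v = [rng.uniform(-1, 1) for _ in range(m)]  # stands in for v_phi(y)
W = [[rng.uniform(-1, 1) for _ in range(k * d)] for _ in range(m)]
start_emb = [0.0] * d  # placeholder for a trainable start-of-modality embedding
end_emb = [0.0] * d    # placeholder for a trainable end-of-modality embedding

segment = build_modality_segment(v, W, start_emb, end_emb, k, d)
print(len(segment), len(segment[0]))  # k + 2 vectors, each of size d
```

In a real setup the resulting `k + 2` vectors would be spliced into the language model's input embedding sequence, with `W` and the two boundary embeddings being the trained parameters.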
| 21 |
## Training
|
| 22 |
To reproduce training, please run [notebook](./baseline.ipynb) after installing requirements:
|
| 23 |
```
|
|
|
|
| 12 |
<img alt="Baseline model architecture" src="./baseline.png" width="100%">
|
| 13 |
</p>
|
| 14 |
|
| 15 |
## Training
|
| 16 |
To reproduce training, please run [notebook](./baseline.ipynb) after installing requirements:
|
| 17 |
```
|