Update README.md

README.md CHANGED

@@ -47,6 +47,21 @@ The model was initialized with the weights of XLM-RoBERTa(large) and trained usi

This model is a lite-weight version of [studio-ousia/mluke-large](https://huggingface.co/studio-ousia/mluke-large): it has no Wikipedia entity embeddings, only special entities such as `[MASK]`.
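
As an illustration, here is a minimal sketch of using the remaining special entity (assuming the `transformers` `MLukeTokenizer` and `LukeModel` classes; the example sentence and span are made up). Spans passed via `entity_spans` without explicit entity names are represented with the `[MASK]` entity, which is all this checkpoint ships.

```python
from transformers import LukeModel, MLukeTokenizer

tokenizer = MLukeTokenizer.from_pretrained("studio-ousia/mluke-base-lite")
model = LukeModel.from_pretrained("studio-ousia/mluke-base-lite")

text = "Tokyo is the capital of Japan."
# Mark "Tokyo" as an entity span; with no `entities` argument the span is
# encoded with the special [MASK] entity embedding.
inputs = tokenizer(text, entity_spans=[(0, 5)], return_tensors="pt")
outputs = model(**inputs)

print(outputs.last_hidden_state.shape)         # word token representations
print(outputs.entity_last_hidden_state.shape)  # one vector per entity span
```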

## Note

When you load the model with `AutoModel.from_pretrained` and the default configuration, you will see the following warning:

```
Some weights of the model checkpoint at studio-ousia/mluke-base-lite were not used when initializing LukeModel: [
'luke.encoder.layer.0.attention.self.w2e_query.weight', 'luke.encoder.layer.0.attention.self.w2e_query.bias',
'luke.encoder.layer.0.attention.self.e2w_query.weight', 'luke.encoder.layer.0.attention.self.e2w_query.bias',
'luke.encoder.layer.0.attention.self.e2e_query.weight', 'luke.encoder.layer.0.attention.self.e2e_query.bias',
...]
```

These weights implement the entity-aware attention mechanism described in [the LUKE paper](https://arxiv.org/abs/2010.01057).
The warning is expected: `use_entity_aware_attention` is set to `false` by default, but the pretrained checkpoint still contains the entity-aware attention weights so that they can be loaded if you enable `use_entity_aware_attention`.
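
A minimal sketch of the two loading modes (assuming the `transformers` LUKE implementation, where `use_entity_aware_attention` is a `LukeConfig` option and `from_pretrained` forwards extra keyword arguments to the config):

```python
from transformers import AutoModel

# Default load: the checkpoint's config has use_entity_aware_attention=false,
# so the extra *_query weights are skipped and the warning above is printed.
model = AutoModel.from_pretrained("studio-ousia/mluke-base-lite")

# Overriding the flag at load time makes the model build the entity-aware
# attention modules, and the corresponding checkpoint weights are then loaded.
model_ea = AutoModel.from_pretrained(
    "studio-ousia/mluke-base-lite",
    use_entity_aware_attention=True,
)
```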

### Citation

If you find mLUKE useful for your work, please cite the following paper: