Update README.md

README.md CHANGED

@@ -47,6 +47,21 @@ The model was initialized with the weights of XLM-RoBERTa(large) and trained usi

This model is a lite-weight version of [studio-ousia/mluke-large](https://huggingface.co/studio-ousia/mluke-large): it has no Wikipedia entity embeddings, only special entities such as `[MASK]`.
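
As an illustration, here is a minimal sketch of using the remaining special entity (assuming the `transformers` `MLukeTokenizer` and `LukeModel` classes; the example sentence and span are made up). Spans passed via `entity_spans` without explicit entity names are represented with the `[MASK]` entity, which is all this checkpoint ships.

```python
from transformers import LukeModel, MLukeTokenizer

tokenizer = MLukeTokenizer.from_pretrained("studio-ousia/mluke-base-lite")
model = LukeModel.from_pretrained("studio-ousia/mluke-base-lite")

text = "Tokyo is the capital of Japan."
# Mark "Tokyo" as an entity span; with no `entities` argument the span is
# encoded with the special [MASK] entity embedding.
inputs = tokenizer(text, entity_spans=[(0, 5)], return_tensors="pt")
outputs = model(**inputs)

print(outputs.last_hidden_state.shape)         # word token representations
print(outputs.entity_last_hidden_state.shape)  # one vector per entity span
```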

## Note

When you load the model with `AutoModel.from_pretrained` and the default configuration, you will see the following warning:

```
Some weights of the model checkpoint at studio-ousia/mluke-base-lite were not used when initializing LukeModel: [
'luke.encoder.layer.0.attention.self.w2e_query.weight', 'luke.encoder.layer.0.attention.self.w2e_query.bias',
'luke.encoder.layer.0.attention.self.e2w_query.weight', 'luke.encoder.layer.0.attention.self.e2w_query.bias',
'luke.encoder.layer.0.attention.self.e2e_query.weight', 'luke.encoder.layer.0.attention.self.e2e_query.bias',
...]
```

These weights implement the entity-aware attention mechanism described in [the LUKE paper](https://arxiv.org/abs/2010.01057).
The warning is expected: `use_entity_aware_attention` is set to `false` by default, but the pretrained checkpoint still contains the entity-aware attention weights so that they can be loaded if you enable `use_entity_aware_attention`.
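
A minimal sketch of the two loading modes (assuming the `transformers` LUKE implementation, where `use_entity_aware_attention` is a `LukeConfig` option and `from_pretrained` forwards extra keyword arguments to the config):

```python
from transformers import AutoModel

# Default load: the checkpoint's config has use_entity_aware_attention=false,
# so the extra *_query weights are skipped and the warning above is printed.
model = AutoModel.from_pretrained("studio-ousia/mluke-base-lite")

# Overriding the flag at load time makes the model build the entity-aware
# attention modules, and the corresponding checkpoint weights are then loaded.
model_ea = AutoModel.from_pretrained(
    "studio-ousia/mluke-base-lite",
    use_entity_aware_attention=True,
)
```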

### Citation

If you find mLUKE useful for your work, please cite the following paper: