---
language:
- multilingual
- ar
- bn
- de
- el
- en
- es
- fi
- fr
- hi
- id
- it
- ja
- ko
- nl
- pl
- pt
- ru
- sv
- sw
- te
- th
- tr
- vi
- zh
thumbnail: https://github.com/studio-ousia/luke/raw/master/resources/luke_logo.png
tags:
- luke
- named entity recognition
- relation classification
- question answering
license: apache-2.0
---

## mLUKE

**mLUKE** (multilingual LUKE) is a multilingual extension of LUKE.

Please check the [official repository](https://github.com/studio-ousia/luke) for
more details and updates.
|
| | This is the mLUKE base model with 12 hidden layers, 768 hidden size. The total number |
| | of parameters in this model is 279M. |
| | The model was initialized with the weights of XLM-RoBERTa(base) and trained using December 2020 version of Wikipedia in 24 languages. |
| |
|
| | This model is a lite-weight version of [studio-ousia/mluke-base](https://huggingface.co/studio-ousia/mluke-base), without Wikipedia entity embeddings but only with special entities such as `[MASK]`. |
| |
|

## Note

When you load the model with `AutoModel.from_pretrained` using the default configuration, you will see the following warning:

```
Some weights of the model checkpoint at studio-ousia/mluke-base-lite were not used when initializing LukeModel: [
'luke.encoder.layer.0.attention.self.w2e_query.weight', 'luke.encoder.layer.0.attention.self.w2e_query.bias',
'luke.encoder.layer.0.attention.self.e2w_query.weight', 'luke.encoder.layer.0.attention.self.e2w_query.bias',
'luke.encoder.layer.0.attention.self.e2e_query.weight', 'luke.encoder.layer.0.attention.self.e2e_query.bias',
...]
```

These weights are for the entity-aware attention mechanism described in [the LUKE paper](https://arxiv.org/abs/2010.01057).
The warning is expected: `use_entity_aware_attention` is set to `false` by default, but the pretrained checkpoint still contains these weights so that they can be loaded if you enable `use_entity_aware_attention`.

### Citation

If you find mLUKE useful for your work, please cite the following paper:

```latex
@inproceedings{ri-etal-2022-mluke,
    title = "m{LUKE}: {T}he Power of Entity Representations in Multilingual Pretrained Language Models",
    author = "Ri, Ryokan  and
      Yamada, Ikuya  and
      Tsuruoka, Yoshimasa",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    year = "2022",
    url = "https://aclanthology.org/2022.acl-long.505",
}
```