# tinyroberta-mrqa

This is the *distilled* version of the [VMware/roberta-large-mrqa](https://huggingface.co/VMware/roberta-large-mrqa) model. It has prediction quality comparable to the base model and runs twice as fast.
## Overview

**Language model:** tinyroberta-mrqa
**Language:** English
**Downstream task:** Extractive QA
**Training data:** MRQA
**Eval data:** MRQA
## Hyperparameters

### Distillation Hyperparameters

```
batch_size = 96
n_epochs = 4
base_LM_model = "deepset/tinyroberta-squad2-step1"
max_seq_len = 384
learning_rate = 3e-5
lr_schedule = LinearWarmup
warmup_proportion = 0.2
doc_stride = 128
max_query_length = 64
distillation_loss_weight = 0.75
temperature = 1.5
teacher = "VMware/roberta-large-mrqa"
```
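Concretely, `distillation_loss_weight` and `temperature` combine a soft loss (KL divergence against the teacher's temperature-scaled distribution) with the usual hard cross-entropy against the gold label. A minimal pure-Python sketch of this standard formulation (function names and logits are illustrative, not the actual training code):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, target_idx,
                      weight=0.75, temperature=1.5):
    """Weighted sum of a soft (teacher KL) loss and a hard (gold label) loss."""
    s_soft = softmax(student_logits, temperature)
    t_soft = softmax(teacher_logits, temperature)
    # KL(teacher || student), scaled by T^2 to keep gradient magnitudes stable.
    kl = sum(t * math.log(t / s) for t, s in zip(t_soft, s_soft))
    soft_loss = kl * temperature ** 2
    # Standard cross-entropy against the gold start/end position.
    hard_loss = -math.log(softmax(student_logits)[target_idx])
    return weight * soft_loss + (1 - weight) * hard_loss
```

With `weight = 0.75`, three quarters of the signal comes from matching the teacher's distribution and one quarter from the gold spans.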
### Fine-tuning Hyperparameters

We fine-tuned the model on the MRQA training set.

```
learning_rate = 1e-5
num_train_epochs = 3
weight_decay = 0.01
per_device_train_batch_size = 16
n_gpus = 3
```
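These parameter names follow the Hugging Face `Trainer` convention, so they map directly onto `TrainingArguments`; a minimal sketch (the output directory is a placeholder, and dataset/model wiring is omitted):

```python
from transformers import TrainingArguments

# Fine-tuning hyperparameters from the table above.
training_args = TrainingArguments(
    output_dir="./tinyroberta-mrqa-finetuned",  # placeholder path
    learning_rate=1e-5,
    num_train_epochs=3,
    weight_decay=0.01,
    per_device_train_batch_size=16,
)
```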
## Distillation

This model is inspired by [deepset/tinyroberta-squad2](https://huggingface.co/deepset/tinyroberta-squad2).
We start from the [deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2) checkpoint and perform task-specific prediction-layer distillation with [VMware/roberta-large-mrqa](https://huggingface.co/VMware/roberta-large-mrqa) as the teacher.
We then fine-tune the result on MRQA.
## Usage

### In Transformers

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "VMware/tinyroberta-mrqa"

# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'What is the capital of France?',   # example input
    'context': 'Paris is the capital of France.'
}
res = nlp(QA_input)

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
## Performance

We evaluated the model on the MRQA dev and test sets using the SQuAD metrics.

```
eval exact match: 69.2
eval f1 score: 79.6
test exact match: 52.8
test f1 score: 63.4
```
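The SQuAD metrics above are exact match (after answer normalization) and token-level F1. A simplified sketch of how they are computed (this is not the official evaluation script, which also handles multiple gold answers per question):

```python
import re
from collections import Counter

def normalize(text):
    # SQuAD-style normalization: lowercase, drop articles and punctuation.
    text = text.lower()
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    # 1.0 if the normalized strings are identical, else 0.0.
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction, gold):
    # Token-overlap F1 between normalized prediction and gold answer.
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `exact_match("The Eiffel Tower", "eiffel tower")` is 1.0 because normalization strips the article and case.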