Files changed (1)
  1. README.md +2 -2

README.md CHANGED

@@ -12,7 +12,7 @@ datasets:
 
 This model is a distilled version of the [BERT base model](https://huggingface.co/bert-base-uncased). It was
 introduced in [this paper](https://arxiv.org/abs/1910.01108). The code for the distillation process can be found
-[here](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation). This model is uncased: it does
+[here](https://github.com/huggingface/transformers-research-projects/tree/main/distillation). This model is uncased: it does
 not make a difference between english and English.
 
 ## Model description

@@ -187,7 +187,7 @@ The details of the masking procedure for each sentence are the following:
 ### Pretraining
 
 The model was trained on 8 16 GB V100 for 90 hours. See the
-[training code](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) for all hyperparameters
+[training code](https://github.com/huggingface/transformers-research-projects/tree/main/distillation) for all hyperparameters
 details.
 
 ## Evaluation results