Addedk committed on
Commit 161fb57 · 1 Parent(s): 7e66601

Add link to report in README

Files changed (1)
  1. README.md +1 -2
README.md CHANGED
@@ -5,8 +5,7 @@ license: apache-2.0
 
 # KB-BERT distilled base model (cased)
 
-This model is a distilled version of [KB-BERT](https://huggingface.co/KB/bert-base-swedish-cased). It was distilled using Swedish data, the 2010-2015 portion of the [Swedish Culturomics Gigaword Corpus](https://spraakbanken.gu.se/en/resources/gigaword). The code for the distillation process can be found [here](https://github.com/AddedK/swedish-mbert-distillation/blob/main/azureML/pretrain_distillation.py). This was done as part of my Master's Thesis: *Task-agnostic knowledge distillation of mBERT to Swedish*.
-
+This model is a distilled version of [KB-BERT](https://huggingface.co/KB/bert-base-swedish-cased). It was distilled using Swedish data, the 2010-2015 portion of the [Swedish Culturomics Gigaword Corpus](https://spraakbanken.gu.se/en/resources/gigaword). The code for the distillation process can be found [here](https://github.com/AddedK/swedish-mbert-distillation/blob/main/azureML/pretrain_distillation.py). This was done as part of my Master's Thesis: [*Task-agnostic knowledge distillation of mBERT to Swedish*](https://kth.diva-portal.org/smash/record.jsf?aq2=%5B%5B%5D%5D&c=2&af=%5B%5D&searchType=UNDERGRADUATE&sortOrder2=title_sort_asc&language=en&pid=diva2%3A1698451&aq=%5B%5B%7B%22freeText%22%3A%22added+kina%22%7D%5D%5D&sf=all&aqe=%5B%5D&sortOrder=author_sort_asc&onlyFullText=false&noOfRows=50&dswid=-6142).
 
 ## Model description
 This is a 6-layer version of KB-BERT, having been distilled using the [LightMBERT](https://arxiv.org/abs/2103.06418) distillation method, but without freezing the embedding layer.