frederic-sadrieh committed
Commit fd13d02 · verified · 1 Parent(s): 70944b1

Minor changes

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -38,5 +38,5 @@ With MosaicBERT and FlashAttention 2, we can increase the throughput from 190,00
  ## Model variations
  For the creation of BERTchen we tested different datasets and training setups. Two notable variants are:
 
- - [`BERTchen-v0.1-C4`](https://huggingface.co/frederic-sadrieh/BERTchen-v0.1-C4) Same pre-training just on the [C4](https://huggingface.co/datasets/allenai/c4) dataset.
- - [`hybrid_BERTchen-v0.1`](https://huggingface.co/frederic-sadrieh/hybrid_BERTchen-v0.1) Pre-trained on [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX) with own hybrid sequence length changing approach (For more information see model card or paper)
+ - [`BERTchen-v0.1-C4`](https://huggingface.co/frederic-sadrieh/BERTchen-v0.1-C4) Same pretraining setup and hyperparameters just on the [C4](https://huggingface.co/datasets/allenai/c4) dataset.
+ - [`hybrid_BERTchen-v0.1`](https://huggingface.co/frederic-sadrieh/hybrid_BERTchen-v0.1) Pretrained on [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX) with own hybrid sequence length changing approach (For more information see model card or paper)