# t5-small-wikitext

t5-small trained on [wikitext/wikitext-103-raw-v1](wikitext/wikitest-103-raw-v1) for 50k steps (about 2 hours of training), following the training procedure of the [T5 paper](https://arxiv.org/pdf/1910.10683.pdf).

* batch_size: 32
* max_seq_length: 128
* optim: Adafactor
* scheduler: inverse square root (10k warm-up steps)
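
The learning-rate schedule listed above can be sketched as follows. This is a minimal illustration that assumes the form given in the T5 paper, `1 / sqrt(max(step, warmup_steps))`, which holds the rate constant during warm-up and then decays it. The function name `inverse_sqrt_lr` is hypothetical, and the actual training script for this model may scale or implement the schedule differently.

```python
import math


def inverse_sqrt_lr(step: int, warmup_steps: int = 10_000) -> float:
    """Inverse square root schedule, assuming the T5 paper's formula.

    The rate stays at 1 / sqrt(warmup_steps) for the first `warmup_steps`
    steps, then decays as 1 / sqrt(step).
    """
    return 1.0 / math.sqrt(max(step, warmup_steps))


if __name__ == "__main__":
    # Print the rate at a few points across the 50k-step run.
    for step in (0, 10_000, 20_000, 50_000):
        print(step, inverse_sqrt_lr(step))
```

With 10k warm-up steps the rate is flat at 0.01 until step 10,000 and has halved by step 40,000.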
