Commit 4a4fda7
Parent(s): f56866a
Update README.md

README.md CHANGED
@@ -12,7 +12,7 @@ license: apache-2.0
 
 The model was pre-trained using T5's denoising objective on [C4](https://huggingface.co/datasets/c4), subsequently additionally pre-trained using [REALM](https://arxiv.org/pdf/2002.08909.pdf)'s salient span masking objective on [Wikipedia](https://huggingface.co/datasets/wikipedia), and finally fine-tuned on [Natural Questions (NQ)](https://huggingface.co/datasets/natural_questions).
 
-**Note**: The model was fine-tuned on 90% of the train splits of [Natural Questions (NQ)](https://huggingface.co/datasets/natural_questions) for 20k steps.
+**Note**: The model was fine-tuned on 90% of the train splits of [Natural Questions (NQ)](https://huggingface.co/datasets/natural_questions) for 20k steps and validated on the held-out 10% of the train split.
 
 Other community Checkpoints: [here](https://huggingface.co/models?search=ssm)
 
@@ -20,6 +20,17 @@ Paper: [How Much Knowledge Can You Pack
 Into the Parameters of a Language Model?](https://arxiv.org/abs/1910.10683.pdf)
 
 Authors: *Adam Roberts, Colin Raffel, Noam Shazeer*
+
+
+## Results on Natural Questions - Test Set
+
+|Id | link | Exact Match |
+|---|---|---|
+|**T5-large**|**https://huggingface.co/google/t5-large-ssm-nqo**|**29.0**|
+|T5-xxl|https://huggingface.co/google/t5-xxl-ssm-nqo|35.2|
+|T5-3b|https://huggingface.co/google/t5-3b-ssm-nqo|31.7|
+|T5-11b|https://huggingface.co/google/t5-11b-ssm-nqo|34.8|
+
 ## Usage
 
 The model can be used as follows for **closed book question answering**:
@@ -34,8 +45,6 @@ input_ids = t5_tok("When was Franklin D. Roosevelt born?", return_tensors="pt").
 gen_output = t5_qa_model.generate(input_ids)[0]
 
 print(t5_tok.decode(gen_output, skip_special_tokens=True))
-
-# should give "On February 13, 1904" => not correct sadly.
 ```
 
 ## Abstract
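The salient span masking objective mentioned in the diff masks named-entity and date spans so the model must recover them from knowledge stored in its parameters. A minimal sketch of the idea, assuming the salient spans come from an upstream entity tagger (hard-coded character offsets here) and using T5-style sentinel tokens; this is a simplified illustration, not the exact REALM preprocessing pipeline:

```python
# Sketch of salient span masking: replace each salient span with a
# T5 sentinel token and emit the masked spans as the target sequence.
def salient_span_mask(text: str, spans: list[tuple[int, int]]) -> tuple[str, str]:
    """spans are (start, end) character offsets of salient entities/dates."""
    masked_parts, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{i}>"
        masked_parts.append(text[cursor:start] + sentinel)
        targets.append(sentinel + " " + text[start:end])
        cursor = end
    masked_parts.append(text[cursor:])
    return "".join(masked_parts), " ".join(targets)

sentence = "Franklin D. Roosevelt was born in Hyde Park, New York."
inp, tgt = salient_span_mask(sentence, [(0, 21), (34, 53)])
print(inp)  # <extra_id_0> was born in <extra_id_1>.
print(tgt)  # <extra_id_0> Franklin D. Roosevelt <extra_id_1> Hyde Park, New York
```

The model is trained to generate the target from the masked input, which is what makes the later closed book QA fine-tuning effective: answering a question is just filling in a salient span.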
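The results table added in this commit reports Exact Match on the NQ test set. As a rough illustration of the metric (the normalization below — lowercasing, stripping punctuation and articles — is a common convention for QA evaluation, not necessarily the exact official NQ scoring script), a prediction scores 1 only if it matches a gold answer after normalization:

```python
import string

def normalize(text: str) -> str:
    # Lowercase, drop punctuation, remove articles, collapse whitespace.
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    words = [w for w in text.split() if w not in {"a", "an", "the"}]
    return " ".join(words)

def exact_match(prediction: str, answers: list[str]) -> bool:
    return any(normalize(prediction) == normalize(a) for a in answers)

# FDR was born January 30, 1882, so the model's "February 13, 1904"
# output noted in the old README comment would score 0 here.
print(exact_match("February 13, 1904!", ["february 13 1904"]))  # True
print(exact_match("February 13, 1904", ["January 30, 1882"]))   # False
```

This all-or-nothing scoring is why closed book EM numbers (29.0–35.2 in the table) look low compared to extractive QA benchmarks: a near-miss date or partial name earns no credit.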