Fill-Mask
Transformers
PyTorch
English
roberta
exbert
anchit committed
Commit 1950bf8 · 1 Parent(s): ed6e760

Update README.md

Files changed (1)
  1. README.md +1 -8
README.md CHANGED
@@ -43,13 +43,6 @@ Note that this model is primarily aimed at being fine-tuned on tasks that use the
 to make decisions, such as sequence classification, token classification or question answering. For tasks such as text
 generation you should look at a model like GPT2.
 
-### Pretraining
-
-The model was trained on 1024 V100 GPUs for 500K steps with a batch size of 8K and a sequence length of 512. The
-optimizer used is Adam with a learning rate of 6e-4, \\(\beta_{1} = 0.9\\), \\(\beta_{2} = 0.98\\) and
-\\(\epsilon = 1e-6\\), a weight decay of 0.01, learning rate warmup for 24,000 steps and linear decay of the learning
-rate after.
-
 ## Evaluation results
 
 When fine-tuned on downstream tasks, this model achieves the following results:
@@ -86,5 +79,5 @@ Glue test results:
 ```
 
 <a href="https://huggingface.co/facebook/muppet-roberta-base">
-\t<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
+\\t<img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
 </a>
 
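The pretraining recipe deleted in this commit is concrete enough to express in code. Below is a minimal PyTorch sketch of that optimizer and schedule, assuming `torch.optim.AdamW` as a stand-in for the paragraph's "Adam with a weight decay of 0.01" and using the `get_linear_schedule_with_warmup` helper from `transformers`; the 1024-GPU setup and 8K-sequence batching are omitted, and the single `batch` below is an illustrative stand-in.

```python
import torch
from transformers import (
    AutoTokenizer,
    RobertaForMaskedLM,
    get_linear_schedule_with_warmup,
)

# Hyperparameters quoted in the removed "Pretraining" paragraph.
TOTAL_STEPS = 500_000   # 500K updates
WARMUP_STEPS = 24_000   # linear warmup, then linear decay to 0
PEAK_LR = 6e-4

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# Adam with beta1=0.9, beta2=0.98, eps=1e-6; decoupled weight decay
# (AdamW) is an assumption, the paragraph only says "Adam".
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=PEAK_LR,
    betas=(0.9, 0.98),
    eps=1e-6,
    weight_decay=0.01,
)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=WARMUP_STEPS, num_training_steps=TOTAL_STEPS
)

# One illustrative update. Real pretraining feeds 8K-sequence batches at
# length 512 and masks ~15% of tokens (unmasked positions get label -100).
batch = tokenizer("The capital of France is <mask>.", return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
optimizer.step()
scheduler.step()
optimizer.zero_grad()
```

Under this schedule the learning rate climbs linearly to 6e-4 over the first 24,000 steps and then decays linearly to zero at step 500K.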