# LASSL roberta-ko-small

## How to use
```python
from transformers import AutoModel, AutoTokenizer

# Load the pretrained encoder and its tokenizer from the Hugging Face Hub.
model = AutoModel.from_pretrained("lassl/roberta-ko-small")
tokenizer = AutoTokenizer.from_pretrained("lassl/roberta-ko-small")
```
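Because the checkpoint was pretrained with a masked-language-modeling objective, it can also be loaded with its LM head or queried through the `fill-mask` pipeline. The Korean example sentence below is an illustration, not part of the original card:

```python
from transformers import pipeline, AutoTokenizer, AutoModelForMaskedLM

# Use a pipeline as a high-level helper.
pipe = pipeline("fill-mask", model="lassl/roberta-ko-small")
# Illustrative prompt: "The capital of South Korea is [MASK]."
print(pipe(f"대한민국의 수도는 {pipe.tokenizer.mask_token}이다."))

# Or load the masked-LM model directly.
tokenizer = AutoTokenizer.from_pretrained("lassl/roberta-ko-small")
model = AutoModelForMaskedLM.from_pretrained("lassl/roberta-ko-small")
```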
## Evaluation

roberta-ko-small was pretrained on Korean corpora with the LASSL framework. The downstream performance below was evaluated on 2021/12/15.
| nsmc | klue_nli | klue_sts | korquadv1 | klue_mrc | avg |
|---|---|---|---|---|---|
| 87.8846 | 66.3086 | 83.8353 | 83.1780 | 42.4585 | 72.7330 |
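The scores above come from fine-tuning on each benchmark, but the card does not document the fine-tuning recipe. The sketch below only illustrates the general setup for one of the benchmarks (NSMC sentiment classification, loaded via the `datasets` library); the column names follow the public NSMC dataset, and the hyperparameters are placeholders, not LASSL's configuration:

```python
# Hypothetical fine-tuning sketch for NSMC (binary sentiment classification).
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("lassl/roberta-ko-small")
model = AutoModelForSequenceClassification.from_pretrained(
    "lassl/roberta-ko-small", num_labels=2
)

nsmc = load_dataset("nsmc")  # Korean movie-review sentiment dataset

def tokenize(batch):
    # NSMC stores the review text in the "document" column.
    return tokenizer(batch["document"], truncation=True, max_length=128)

nsmc = nsmc.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="roberta-ko-small-nsmc",
        per_device_train_batch_size=32,  # placeholder hyperparameters
        num_train_epochs=3,
    ),
    train_dataset=nsmc["train"],
    eval_dataset=nsmc["test"],
    tokenizer=tokenizer,  # enables dynamic padding via the default collator
)
trainer.train()
```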
## Corpora

This model was trained on 6,860,062 examples (3,512,351,744 tokens in total), extracted from the corpora listed below. For the training configuration, see config.json; a short snippet for inspecting it follows the listing.
```
corpora/
├── [707M] kowiki_latest.txt
├── [ 26M] modu_dialogue_v1.2.txt
├── [1.3G] modu_news_v1.1.txt
├── [9.7G] modu_news_v2.0.txt
├── [ 15M] modu_np_v1.1.txt
├── [1008M] modu_spoken_v1.2.txt
├── [6.5G] modu_written_v1.0.txt
└── [413M] petition.txt
```
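As noted above, the training-related settings are recorded in config.json. One way to inspect it without downloading the full model weights:

```python
from transformers import AutoConfig

# Fetch only config.json from the Hub and print the stored hyperparameters.
config = AutoConfig.from_pretrained("lassl/roberta-ko-small")
print(config)
```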