Update README.md

#3
by EunB2 - opened
Files changed (1)
  1. README.md +8 -30
README.md CHANGED
@@ -1,37 +1,15 @@
----
-language: ko
-tags:
-- korean
-- klue
-mask_token: "[MASK]"
-widget:
-- text: 대한민국의 수도는 [MASK] 입니다.
----
-
-# KLUE RoBERTa base
+# KL-RoBERTa
 
-Pretrained RoBERTa Model on Korean Language. See [Github](https://github.com/KLUE-benchmark/KLUE) and [Paper](https://arxiv.org/abs/2105.09680) for more details.
+KL-RoBERTa is a Korean legal language model further pretrained for legal domain adaptation.
+It is built on **klue/roberta-base** and trained on a large-scale Korean legal corpus to better capture legal terminology and long-form legal context.
 
-## How to use
+For more details, see the **[GitHub](https://github.com/EunB2/KL-RoBERTa)**.
 
-_NOTE:_ Use `BertTokenizer` instead of RobertaTokenizer. (`AutoTokenizer` will load `BertTokenizer`)
+---
+## How to Use
 
 ```python
 from transformers import AutoModel, AutoTokenizer
 
-model = AutoModel.from_pretrained("klue/roberta-base")
-tokenizer = AutoTokenizer.from_pretrained("klue/roberta-base")
-```
-
-## BibTeX entry and citation info
-
-```bibtex
-@misc{park2021klue,
-      title={KLUE: Korean Language Understanding Evaluation},
-      author={Sungjoon Park and Jihyung Moon and Sungdong Kim and Won Ik Cho and Jiyoon Han and Jangwon Park and Chisung Song and Junseong Kim and Yongsook Song and Taehwan Oh and Joohong Lee and Juhyun Oh and Sungwon Lyu and Younghoon Jeong and Inkwon Lee and Sangwoo Seo and Dongjun Lee and Hyunwoo Kim and Myeonghwa Lee and Seongbo Jang and Seungwon Do and Sunkyoung Kim and Kyungtae Lim and Jongwon Lee and Kyumin Park and Jamin Shin and Seonghyun Kim and Lucy Park and Alice Oh and Jungwoo Ha and Kyunghyun Cho},
-      year={2021},
-      eprint={2105.09680},
-      archivePrefix={arXiv},
-      primaryClass={cs.CL}
-}
-```
+model = AutoModel.from_pretrained("EunB2/KL-RoBERTa")
+tokenizer = AutoTokenizer.from_pretrained("EunB2/KL-RoBERTa")
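The updated README stops after loading the model and tokenizer. A minimal sketch of what a caller might do next — extracting a sentence embedding — could look like the following; note this is illustrative, not part of the PR: it assumes the `EunB2/KL-RoBERTa` checkpoint is reachable on the Hub, and the example sentence and [CLS] pooling choice are the editor's, not the model card's.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Load the further-pretrained legal-domain checkpoint (assumes Hub access).
model = AutoModel.from_pretrained("EunB2/KL-RoBERTa")
tokenizer = AutoTokenizer.from_pretrained("EunB2/KL-RoBERTa")
model.eval()

# Illustrative Korean legal sentence:
# "A contract is formed by agreement between the parties."
inputs = tokenizer("계약은 당사자 간의 합의로 성립한다.", return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch, seq_len, hidden_size);
# take the [CLS] position as a simple sentence embedding.
cls_embedding = outputs.last_hidden_state[:, 0]
print(cls_embedding.shape)
```

Since the base model is `klue/roberta-base`, the hidden size of the resulting embedding should match that encoder's configuration.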