# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="LeverageX/scibert-wechsel-korean")
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("LeverageX/scibert-wechsel-korean")
model = AutoModelForMaskedLM.from_pretrained("LeverageX/scibert-wechsel-korean")
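
As a rough illustration of how the pipeline above might be queried, the sketch below fills in one masked token and formats the candidates. The helper names `format_predictions` and `predict_masked` and the Korean example sentence are hypothetical and not part of the model card. It assumes the standard fill-mask pipeline output, a list of dicts with `token_str` and `score` keys. No expected predictions are shown, since they depend on the model.

```python
def format_predictions(predictions):
    """Turn fill-mask pipeline output into 'token (score)' strings."""
    return [f"{p['token_str']} ({p['score']:.3f})" for p in predictions]


def predict_masked(text_template, top_k=5):
    """Fill the {mask} placeholder in text_template and return the top_k candidates.

    Hypothetical helper: downloads the model from the Hub on first use.
    """
    # Imported lazily so format_predictions can be used without transformers.
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="LeverageX/scibert-wechsel-korean")
    # Use the tokenizer's own mask token instead of hard-coding "[MASK]".
    text = text_template.format(mask=pipe.tokenizer.mask_token)
    return format_predictions(pipe(text, top_k=top_k))


# Example call (illustrative sentence; requires network access):
# for line in predict_masked("이 연구는 {mask}에 관한 것이다."):
#     print(line)
```

Reading the mask token from the tokenizer keeps the sketch independent of which mask string the Korean tokenizer actually uses.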

scibert-wechsel-korean

SciBERT (🇺🇸) converted into Korean (🇰🇷) using the WECHSEL technique.

Description

  • SciBERT is trained on papers from the semanticscholar.org corpus. The corpus contains 1.14M papers and 3.1B tokens.
  • WECHSEL converts the embedding layer's subword tokens from the source language to the target language.
  • SciBERT, which was trained on English text, is converted into Korean using the WECHSEL technique.
  • The Korean tokenizer is taken from the KLUE PLMs' tokenizers because of its similar vocabulary size (32,000) and its performance.
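
The embedding conversion described above can be sketched on toy data. This is a simplified illustration of the general idea, not the code used to build this model and not the API of the wechsel package. It assumes each target-language token already has a static vector aligned with the source-language static vectors, and it initializes the target token's model embedding as a softmax-weighted average of the model embeddings of its nearest source tokens. The function names, the toy vectors, and the values of `k` and `temperature` are all made up for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def init_target_embedding(target_vec, source_static, source_model_emb,
                          k=2, temperature=0.1):
    """Initialize one target-token embedding from its k nearest source tokens.

    target_vec:        aligned static vector of the target (Korean) token
    source_static:     {source token: aligned static vector}
    source_model_emb:  {source token: embedding from the source model}
    """
    # Rank source tokens by similarity in the shared static space.
    neighbours = sorted(
        ((cosine(target_vec, vec), tok) for tok, vec in source_static.items()),
        reverse=True,
    )[:k]

    # Softmax over similarities (shifted by the max for numerical stability).
    max_sim = neighbours[0][0]
    weights = [math.exp((sim - max_sim) / temperature) for sim, _ in neighbours]
    total = sum(weights)

    # Weighted average of the neighbours' source-model embeddings.
    dim = len(next(iter(source_model_emb.values())))
    out = [0.0] * dim
    for weight, (_, tok) in zip(weights, neighbours):
        for i, value in enumerate(source_model_emb[tok]):
            out[i] += (weight / total) * value
    return out


# Toy data (all values invented for illustration).
source_static = {"cell": [1.0, 0.0], "protein": [0.0, 1.0], "the": [-1.0, 0.0]}
source_model_emb = {
    "cell": [0.2, 0.4, 0.6],
    "protein": [1.0, 0.0, -1.0],
    "the": [0.0, 0.0, 0.0],
}
korean_cell_vec = [0.9, 0.1]  # static vector for a Korean token close to "cell"

new_embedding = init_target_embedding(korean_cell_vec, source_static, source_model_emb)
```

With a low temperature, the closest source token dominates, so `new_embedding` here ends up very near the source embedding of "cell". The remaining layers of the source model are reused unchanged in this kind of transfer, and only the embedding matrix is rebuilt for the new vocabulary.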
