# KoGPT2-small
| Model | Batch Size | Tokenizer | Vocab Size | Max Length | Parameter Size |
|---|---|---|---|---|---|
| GPT2 | 64 | BPE | 30,000 | 1024 | 108M |
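The 108M figure in the table can be sanity-checked with a back-of-the-envelope parameter count. The sketch below assumes the standard GPT-2 small dimensions (12 layers, hidden size 768), which the card does not state explicitly; only the vocab size (30,000) and max length (1024) come from the table. The token embedding is assumed tied with the LM head, as in the stock GPT-2 implementation.

```python
# Back-of-the-envelope parameter count for a GPT-2 small configuration.
# n_embd / n_layer are the standard GPT-2 small values (assumed here);
# vocab_size and n_positions come from the table above.
vocab_size = 30_000
n_positions = 1024
n_embd = 768
n_layer = 12

# Embeddings: token embedding (tied with the LM head) + learned positions
embeddings = vocab_size * n_embd + n_positions * n_embd

# Per transformer block: attention (fused QKV + output projection),
# MLP with 4x expansion, and two LayerNorms (weight + bias each)
attn = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
layer_norms = 2 * 2 * n_embd
block = attn + mlp + layer_norms

# Final LayerNorm after the last block
total = embeddings + n_layer * block + 2 * n_embd
print(f"{total / 1e6:.1f}M parameters")  # → 108.9M, matching the table's 108M
```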
## Dataset
- AIhub web-data-based Korean corpus (4.8M)
- KoWiki dump 230701 (1.4M)
## Inference Example
```python
from transformers import AutoTokenizer, GPT2LMHeadModel

text = "운동이 힘들면?"  # "What if exercising is hard?"

tokenizer = AutoTokenizer.from_pretrained('dataslab/GPT2-small')
model = GPT2LMHeadModel.from_pretrained('dataslab/GPT2-small')

inputs = tokenizer.encode_plus(text, return_tensors='pt', add_special_tokens=False)
outputs = model.generate(inputs['input_ids'],
                         max_length=128,
                         repetition_penalty=2.0,
                         pad_token_id=tokenizer.pad_token_id,
                         eos_token_id=tokenizer.eos_token_id,
                         bos_token_id=tokenizer.bos_token_id,
                         use_cache=True,
                         temperature=0.5)  # no effect unless do_sample=True
generated = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated)
```

Output (translated from Korean): "If exercising is hard, it is better not to exercise. However, putting exercise off is actually not good for your health. In particular, when fatigue builds up from long workouts and immunity drops, the fatigue often worsens to the point where it is hard to fall asleep. In such cases, one may overeat more than usual or go on an excessive diet. Therefore, attention should be paid to nutritional supplementation along with dietary control. Also, since excessive nutrition contributes to weight loss, it is important to maintain an appropriate amount of exercise."
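Note that in the call above, `temperature` is ignored under the default greedy decoding; it only takes effect once sampling is enabled with `do_sample=True`. The sketch below demonstrates this with a tiny randomly initialized GPT-2 stand-in so it runs without downloading the checkpoint; for real generations, substitute the `dataslab/GPT2-small` model and tokenizer from the example above. The dummy config values and input ids here are illustrative, not from the card.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

torch.manual_seed(0)

# Tiny randomly initialized GPT-2 so the sketch runs offline; replace with
# GPT2LMHeadModel.from_pretrained('dataslab/GPT2-small') for real output.
config = GPT2Config(vocab_size=100, n_positions=64, n_embd=32, n_layer=2, n_head=2)
model = GPT2LMHeadModel(config)

input_ids = torch.tensor([[1, 2, 3]])  # stands in for the tokenizer output
outputs = model.generate(input_ids,
                         max_length=16,
                         do_sample=True,    # required for temperature to apply
                         temperature=0.5,   # < 1.0 sharpens the distribution
                         repetition_penalty=2.0,
                         pad_token_id=0)
print(outputs.shape)  # (1, 16): prompt plus sampled continuation
```

Lowering the temperature below 1.0 concentrates probability mass on the most likely tokens, giving more conservative continuations than plain sampling.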