---
language:
- en
- ko
tags:
- generation
license: apache-2.0
model-index:
- name: task_1
  results:
  - task:
      type: natural-language-generation
    dataset:
      type: hellaswag
      name: hellaswag(10 shots)
    metrics:
    - type: acc_norm
      value: 27.7
- name: task_2
  results:
  - task:
      type: natural-language-generation
    dataset:
      type: ARC
      name: ARC(25 shots)
    metrics:
    - type: acc_norm
      value: 23.8
- name: task_3
  results:
  - task:
      type: natural-language-generation
    dataset:
      type: MMLU
      name: MMLU(5 shots)
    metrics:
    - type: acc
      value: 24.9
- name: task_4
  results:
  - task:
      type: natural-language-generation
    dataset:
      type: TruthfulQA
      name: TruthfulQA(0 shots)
    metrics:
    - type: mc2
      value: 46.5
---
A GPT-2 model pretrained in Korean, with the context length (`n_ctx`) expanded to 2048 tokens and the embedding dimension expanded to 1536.
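
As a rough illustration of what those two changes mean in configuration terms, the sketch below builds a `GPT2Config` from the Hugging Face `transformers` library with the sizes stated above. It is an assumption-laden sketch, not this model's actual configuration: the layer count, head count, and vocabulary size are not given in this card, so the library defaults are left in place, and recent `transformers` versions expose the context length as `n_positions` rather than `n_ctx`.

```python
from transformers import GPT2Config

# Only these two values come from the model card.
# Everything else (n_layer, n_head, vocab_size, ...) is the library default
# and is NOT claimed to match this model.
config = GPT2Config(
    n_positions=2048,  # context length (called n_ctx in the card)
    n_embd=1536,       # embedding / hidden dimension
)

# The hidden size must split evenly across attention heads.
head_dim = config.n_embd // config.n_head
print(f"context length: {config.n_positions}")
print(f"embedding dim:  {config.n_embd}")
print(f"per-head dim with default n_head={config.n_head}: {head_dim}")
```

Building the config does not download or instantiate any weights, so it is cheap to run when checking sizes.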
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 24.27 |
| ARC (25-shot) | 21.16 |
| HellaSwag (10-shot) | 28.11 |
| MMLU (5-shot) | 26.56 |
| TruthfulQA (0-shot) | 42.06 |
| Winogrande (5-shot) | 49.09 |
| GSM8K (5-shot) | 0.0 |
| DROP (3-shot) | 2.89 |
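
The `Avg.` row appears to be the unweighted mean of the seven benchmark scores in the table. A minimal check, assuming that definition (the variable names here are illustrative only):

```python
# Scores copied from the leaderboard table above.
scores = {
    "ARC (25-shot)": 21.16,
    "HellaSwag (10-shot)": 28.11,
    "MMLU (5-shot)": 26.56,
    "TruthfulQA (0-shot)": 42.06,
    "Winogrande (5-shot)": 49.09,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 2.89,
}

# Unweighted mean across all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.2f}")
```

Rounded to two decimals, the result agrees with the 24.27 reported in the table.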