---
language:
  - en
  - ko
tags:
  - generation
license: apache-2.0
model-index:
  - name: task_1
    results:
      - task:
          type: natural-language-generation
        dataset:
          type: hellaswag
          name: hellaswag(10 shots)
        metrics:
          - type: acc_norm
            value: 27.7
  - name: task_2
    results:
      - task:
          type: natural-language-generation
        dataset:
          type: ARC
          name: ARC(25 shots)
        metrics:
          - type: acc_norm
            value: 23.8
  - name: task_3
    results:
      - task:
          type: natural-language-generation
        dataset:
          type: MMLU
          name: MMLU(5 shots)
        metrics:
          - type: acc
            value: 24.9
  - name: task_4
    results:
      - task:
          type: natural-language-generation
        dataset:
          type: TruthfulQA
          name: TruthfulQA(0 shots)
        metrics:
          - type: mc2
            value: 46.5
---

A Korean GPT-2 model pretrained with the context window (`n_ctx`) expanded to 2048 tokens and the embedding dimension expanded to 1536.
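As a minimal sketch, the expanded configuration described above can be expressed with the `transformers` library (the exact attribute names used when training this checkpoint are an assumption):

```python
from transformers import GPT2Config

# Sketch of the expanded GPT-2 configuration described above:
# maximum sequence length widened from 1024 to 2048 tokens,
# embedding dimension widened from 768 to 1536.
config = GPT2Config(
    n_positions=2048,  # context window size
    n_embd=1536,       # embedding (hidden) dimension
)

print(config.n_positions, config.n_embd)
```

Loading the published weights would then follow the usual `from_pretrained` pattern; the repo id `psyche/kogpt` is inferred from the leaderboard details link below and may differ.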
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_psyche__kogpt)

| Metric                | Value                     |
|-----------------------|---------------------------|
| Avg.                  | 24.27   |
| ARC (25-shot)         | 21.16          |
| HellaSwag (10-shot)   | 28.11    |
| MMLU (5-shot)         | 26.56         |
| TruthfulQA (0-shot)   | 42.06   |
| Winogrande (5-shot)   | 49.09   |
| GSM8K (5-shot)        | 0.0        |
| DROP (3-shot)         | 2.89         |