Update Log

  • 2024.07.01: Released Solar-Ko-Recovery and uploaded benchmark scores
  • 2024.05.16: Released a preview of Solar-Ko-Recovery

Solar-Ko-Recovery-11B ๐ŸŒŸโค๏ธโ€๐Ÿฉน

Solar-Ko-Recovery-11B aims to recover Solar's capability in Korean by rearranging the embeddings and LM head, featuring an expanded vocabulary and training on a Korean+English corpus for enhanced representation.

Model Details

Model Developers: Junbum Lee (Beomi)

Variations: Solar-Ko-Recovery is available in a single parameter size: 11B (10.99B🤣).

Input: The model accepts only text input.

Output: The model produces text output exclusively.

Model Architecture:

Solar-Ko-Recovery is an auto-regressive language model that leverages an optimized transformer architecture derived from Llama-2.

| Model | Training Data | Parameters | Content Length | GQA | Tokens | Learning Rate |
|---|---|---|---|---|---|---|
| Solar-Ko-Recovery | A curated mix of Korean+English corpora | 11B (10.99B) | 4k | O | >100B* | 5e-5 |

NOTE: Training proceeded in two steps:

  1. Only the embedding layer and LM head are trained
  2. All parameters are trained
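Step 1 above is typically implemented by freezing every parameter outside the embedding and LM-head layers. Below is a minimal sketch of that selection logic in plain Python; the Llama-style parameter names (`model.embed_tokens`, `lm_head`) are assumptions based on the architecture, not confirmed details of this training run.

```python
# Sketch of step-1 freezing: only embedding and LM-head parameters stay trainable.
# The parameter names follow Llama-style checkpoint conventions (an assumption).

STEP1_TRAINABLE = ("embed_tokens", "lm_head")

def step1_trainable(param_name: str) -> bool:
    """Return True if the parameter should be updated in step 1."""
    return any(key in param_name for key in STEP1_TRAINABLE)

# Example parameter names as they appear in a Llama-style checkpoint:
names = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.0.mlp.gate_proj.weight",
    "model.norm.weight",
    "lm_head.weight",
]

trainable = [n for n in names if step1_trainable(n)]
print(trainable)  # ['model.embed_tokens.weight', 'lm_head.weight']
```

With PyTorch, the same predicate would drive `param.requires_grad`; step 2 then simply marks every parameter trainable again.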

Vocab Expansion

Vocabulary expansion was conducted with an edited version of upstage/solar-1-mini-tokenizer, whose vocabulary is a superset of the original Solar tokenizer's.

| Model Name | Vocabulary Size | Description |
|---|---|---|
| Original Solar | 32000 | SentencePiece BPE |
| solar-1-mini-tokenizer | 64000 | SentencePiece BPE; added Korean/Japanese vocabulary |

Tokenizing "안녕하세요, 오늘은 날씨가 좋네요." ("Hello, the weather is nice today.")

  • SOLAR-10.7B: 26 tokens
  • Solar-Ko-Recovery: 7 tokens
| Model | Tokens |
|---|---|
| SOLAR-10.7B | ['▁', '안', '<0xEB>', '<0x85>', '<0x95>', '하', '세', '요', ',', '▁', '오', '<0xEB>', '<0x8A>', '<0x98>', '은', '▁', '날', '<0xEC>', '<0x94>', '<0xA8>', '가', '▁', '좋', '네', '요', '.'] |
| Solar-Ko-Recovery | ['▁안녕하세요', ',', '▁오늘은', '▁날씨가', '▁좋', '네요', '.'] |

Tokenizing "Meet 10.7B Solar: Elevating Performance with Upstage Depth UP Scaling!"

  • SOLAR-10.7B: 22 tokens
  • Solar-Ko-Recovery: 22 tokens
| Model | Tokens |
|---|---|
| SOLAR-10.7B | ['▁Meet', '▁', '1', '0', '.', '7', 'B', '▁Solar', ':', '▁E', 'lev', 'ating', '▁Performance', '▁with', '▁Up', 'stage', '▁Dep', 'th', '▁UP', '▁Scal', 'ing', '!'] |
| Solar-Ko-Recovery | ['▁Meet', '▁', '1', '0', '.', '7', 'B', '▁Solar', ':', '▁E', 'lev', 'ating', '▁Performance', '▁with', '▁Up', 'stage', '▁Dep', 'th', '▁UP', '▁Scal', 'ing', '!'] |

Because the expanded vocabulary only adds tokens to the original, English text tokenizes identically under both tokenizers.

LICENSE

Apache 2.0

Model Benchmark

LM Eval Harness - Korean

| Tasks | Metric | Value | Stderr |
|---|---|---:|---:|
| haerae | acc_norm | 0.7874 | ± 0.0118 |
| - haerae_general_knowledge | acc | 0.5000 | ± 0.0378 |
| - haerae_history | acc | 0.8723 | ± 0.0244 |
| - haerae_loan_word | acc | 0.8402 | ± 0.0283 |
| - haerae_rare_word | acc | 0.8346 | ± 0.0185 |
| - haerae_standard_nomenclature | acc | 0.8301 | ± 0.0305 |
| kmmlu_direct | exact_match | 0.4205 | ± 0.0026 |
| - kmmlu_direct_accounting | exact_match | 0.3700 | ± 0.0485 |
| - kmmlu_direct_agricultural_sciences | exact_match | 0.3140 | ± 0.0147 |
| - kmmlu_direct_aviation_engineering_and_maintenance | exact_match | 0.3870 | ± 0.0154 |
| - kmmlu_direct_biology | exact_match | 0.3510 | ± 0.0151 |
| - kmmlu_direct_chemical_engineering | exact_match | 0.3910 | ± 0.0154 |
| - kmmlu_direct_chemistry | exact_match | 0.4000 | ± 0.0200 |
| - kmmlu_direct_civil_engineering | exact_match | 0.4010 | ± 0.0155 |
| - kmmlu_direct_computer_science | exact_match | 0.6520 | ± 0.0151 |
| - kmmlu_direct_construction | exact_match | 0.3080 | ± 0.0146 |
| - kmmlu_direct_criminal_law | exact_match | 0.3100 | ± 0.0328 |
| - kmmlu_direct_ecology | exact_match | 0.4660 | ± 0.0158 |
| - kmmlu_direct_economics | exact_match | 0.5385 | ± 0.0439 |
| - kmmlu_direct_education | exact_match | 0.6200 | ± 0.0488 |
| - kmmlu_direct_electrical_engineering | exact_match | 0.3000 | ± 0.0145 |
| - kmmlu_direct_electronics_engineering | exact_match | 0.4740 | ± 0.0158 |
| - kmmlu_direct_energy_management | exact_match | 0.3560 | ± 0.0151 |
| - kmmlu_direct_environmental_science | exact_match | 0.2980 | ± 0.0145 |
| - kmmlu_direct_fashion | exact_match | 0.4470 | ± 0.0157 |
| - kmmlu_direct_food_processing | exact_match | 0.3690 | ± 0.0153 |
| - kmmlu_direct_gas_technology_and_engineering | exact_match | 0.3000 | ± 0.0145 |
| - kmmlu_direct_geomatics | exact_match | 0.3820 | ± 0.0154 |
| - kmmlu_direct_health | exact_match | 0.5700 | ± 0.0498 |
| - kmmlu_direct_industrial_engineer | exact_match | 0.3830 | ± 0.0154 |
| - kmmlu_direct_information_technology | exact_match | 0.6090 | ± 0.0154 |
| - kmmlu_direct_interior_architecture_and_design | exact_match | 0.5440 | ± 0.0158 |
| - kmmlu_direct_korean_history | exact_match | 0.3800 | ± 0.0488 |
| - kmmlu_direct_law | exact_match | 0.4670 | ± 0.0158 |
| - kmmlu_direct_machine_design_and_manufacturing | exact_match | 0.3960 | ± 0.0155 |
| - kmmlu_direct_management | exact_match | 0.5030 | ± 0.0158 |
| - kmmlu_direct_maritime_engineering | exact_match | 0.4283 | ± 0.0202 |
| - kmmlu_direct_marketing | exact_match | 0.7460 | ± 0.0138 |
| - kmmlu_direct_materials_engineering | exact_match | 0.4020 | ± 0.0155 |
| - kmmlu_direct_math | exact_match | 0.2867 | ± 0.0262 |
| - kmmlu_direct_mechanical_engineering | exact_match | 0.3490 | ± 0.0151 |
| - kmmlu_direct_nondestructive_testing | exact_match | 0.3760 | ± 0.0153 |
| - kmmlu_direct_patent | exact_match | 0.3700 | ± 0.0485 |
| - kmmlu_direct_political_science_and_sociology | exact_match | 0.5300 | ± 0.0289 |
| - kmmlu_direct_psychology | exact_match | 0.4470 | ± 0.0157 |
| - kmmlu_direct_public_safety | exact_match | 0.3520 | ± 0.0151 |
| - kmmlu_direct_railway_and_automotive_engineering | exact_match | 0.3220 | ± 0.0148 |
| - kmmlu_direct_real_estate | exact_match | 0.4350 | ± 0.0351 |
| - kmmlu_direct_refrigerating_machinery | exact_match | 0.3240 | ± 0.0148 |
| - kmmlu_direct_social_welfare | exact_match | 0.4970 | ± 0.0158 |
| - kmmlu_direct_taxation | exact_match | 0.3800 | ± 0.0344 |
| - kmmlu_direct_telecommunications_and_wireless_technology | exact_match | 0.5480 | ± 0.0157 |
| kobest_boolq | acc | 0.9202 | ± 0.0072 |
| | f1 | 0.9202 | N/A |
| kobest_copa | acc | 0.8680 | ± 0.0107 |
| | f1 | 0.8678 | N/A |
| kobest_hellaswag | acc | 0.5560 | ± 0.0222 |
| | f1 | 0.5520 | N/A |
| | acc_norm | 0.6540 | ± 0.0213 |
| kobest_sentineg | acc | 0.9824 | ± 0.0066 |
| | f1 | 0.9824 | N/A |
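The scores above come from EleutherAI's lm-evaluation-harness; a command along these lines should reproduce them. The task names match the table, but the batch size, dtype, and few-shot settings are assumptions, since the exact evaluation configuration is not documented here.

```shell
pip install lm-eval  # EleutherAI lm-evaluation-harness

lm_eval --model hf \
    --model_args pretrained=beomi/Solar-Ko-Recovery-11B,dtype=bfloat16 \
    --tasks haerae,kmmlu_direct,kobest_boolq,kobest_copa,kobest_hellaswag,kobest_sentineg \
    --batch_size 8
```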

Citation

TBD

Acknowledgements
