Update Log
- 2024.07.01: Released Solar-Ko-Recovery and uploaded benchmark scores
- 2024.05.16: Released a preview of Solar-Ko-Recovery
Solar-Ko-Recovery-11B
Solar-Ko-Recovery-11B aims to recover Solar's capabilities in Korean by rearranging the embeddings and LM head, featuring an expanded vocabulary and a Korean+English corpus for enhanced representation.
Model Details
Model Developers: Junbum Lee (Beomi)
Variations: Solar-Ko-Recovery is available in a single parameter size: 11B (10.99B, to be precise).
Input: The model accepts only text input.
Output: The model produces text output exclusively.
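A minimal usage sketch with Hugging Face Transformers (the prompt and generation settings are illustrative, not taken from this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "beomi/Solar-Ko-Recovery-11B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# Text in, text out: this is a plain causal LM, not a chat-tuned model.
inputs = tokenizer("대한민국의 수도는", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```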
Model Architecture:
Solar-Ko-Recovery is an auto-regressive language model that leverages an optimized transformer architecture derived from Llama-2.
| | Training Data | Parameters | Content Length | GQA | Tokens | Learning Rate |
|---|---|---|---|---|---|---|
| Solar-Ko-Recovery | A curated mix of Korean+English corpora | 11B (10.99B) | 4k | Yes | >100B* | 5e-5 |
NOTE: Training proceeded in two steps (a sketch of step 1 follows below):
- Step 1: only the embedding layer and LM head are trained
- Step 2: all parameters are trained
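A minimal sketch of how step 1 could be set up in PyTorch/Transformers (the actual training code is not published here; module accessors follow the standard Llama-style layout):

```python
from transformers import AutoModelForCausalLM

# Illustrative only: start from the base model (with the expanded vocabulary).
model = AutoModelForCausalLM.from_pretrained("upstage/SOLAR-10.7B-v1.0")

# Step 1: freeze everything except the input embeddings and the LM head.
for param in model.parameters():
    param.requires_grad = False
for param in model.get_input_embeddings().parameters():
    param.requires_grad = True
for param in model.get_output_embeddings().parameters():
    param.requires_grad = True

# Step 2 would simply re-enable gradients on all parameters:
# for param in model.parameters():
#     param.requires_grad = True
```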
Vocab Expansion
Vocab expansion was conducted on an edited version of upstage/solar-1-mini-tokenizer, which is a superset of the original Solar tokenizer.
| Model Name | Vocabulary Size | Description |
|---|---|---|
| Original Solar | 32000 | SentencePiece BPE |
| solar-1-mini-tokenizer | 64000 | SentencePiece BPE, with added Korean/Japanese vocabulary |
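After swapping in the 64k tokenizer, the model's embedding matrix and LM head must be resized to match; a sketch of that step with Transformers (the initialization scheme for the new rows is an assumption, recent versions mean-initialize them by default):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# The expanded 64k tokenizer is a superset of the original 32k Solar vocabulary.
tokenizer = AutoTokenizer.from_pretrained("upstage/solar-1-mini-tokenizer")
model = AutoModelForCausalLM.from_pretrained("upstage/SOLAR-10.7B-v1.0")

# Grow the input embeddings and LM head from 32k to 64k rows.
model.resize_token_embeddings(len(tokenizer))
print(model.get_input_embeddings().weight.shape[0])  # 64000
```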
Tokenizing "์๋ ํ์ธ์, ์ค๋์ ๋ ์จ๊ฐ ์ข๋ค์."
- SOLAR-10.7B: 26 tokens
- Solar-Ko-Recovery: 7 tokens
| Model | Tokens |
|---|---|
| SOLAR-10.7B | ['▁', '안', '<0xEB>', '<0x85>', '<0x95>', '하', '세', '요', ',', '▁', '오', '<0xEB>', '<0x8A>', '<0x98>', '은', '▁', '날', '<0xEC>', '<0x94>', '<0xA8>', '가', '▁', '좋', '네', '요', '.'] |
| Solar-Ko-Recovery | ['▁안녕하세요', ',', '▁오늘은', '▁날씨가', '▁좋', '네요', '.'] |
Tokenizing "Meet 10.7B Solar: Elevating Performance with Upstage Depth UP Scaling!"
- SOLAR-10.7B: 22 tokens
- Solar-Ko-Recovery: 22 tokens
| Model | Tokens |
|---|---|
| SOLAR-10.7B | ['▁Meet', '▁', '1', '0', '.', '7', 'B', '▁Solar', ':', '▁E', 'lev', 'ating', '▁Performance', '▁with', '▁Up', 'stage', '▁Dep', 'th', '▁UP', '▁Scal', 'ing', '!'] |
| Solar-Ko-Recovery | ['▁Meet', '▁', '1', '0', '.', '7', 'B', '▁Solar', ':', '▁E', 'lev', 'ating', '▁Performance', '▁with', '▁Up', 'stage', '▁Dep', 'th', '▁UP', '▁Scal', 'ing', '!'] |
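Because the expanded vocabulary is a superset of the original, English tokenization is unchanged, while the Korean sentence drops from 26 tokens to 7. Both comparisons can be reproduced along these lines:

```python
from transformers import AutoTokenizer

base = AutoTokenizer.from_pretrained("upstage/SOLAR-10.7B-v1.0")
recovery = AutoTokenizer.from_pretrained("beomi/Solar-Ko-Recovery-11B")

for text in [
    "안녕하세요, 오늘은 날씨가 좋네요.",
    "Meet 10.7B Solar: Elevating Performance with Upstage Depth UP Scaling!",
]:
    for name, tok in [("SOLAR-10.7B", base), ("Solar-Ko-Recovery", recovery)]:
        tokens = tok.tokenize(text)
        print(f"{name}: {len(tokens)} tokens {tokens}")
```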
LICENSE
Apache 2.0
Model Benchmark
LM Eval Harness - Korean
- Used EleutherAI's lm-evaluation-harness
- 5-shot scores
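A run along these lines should reproduce the scores below (a sketch assuming a recent lm-evaluation-harness in which the Korean tasks are registered under these names):

```python
import lm_eval

# 5-shot evaluation over the Korean task groups reported below.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=beomi/Solar-Ko-Recovery-11B",
    tasks=["haerae", "kmmlu_direct", "kobest"],
    num_fewshot=5,
)
print(results["results"])
```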
| Tasks | Metric | Value | | Stderr |
|---|---|---|---|---|
| haerae | acc_norm | 0.7874 | ± | 0.0118 |
| - haerae_general_knowledge | acc | 0.5000 | ± | 0.0378 |
| - haerae_history | acc | 0.8723 | ± | 0.0244 |
| - haerae_loan_word | acc | 0.8402 | ± | 0.0283 |
| - haerae_rare_word | acc | 0.8346 | ± | 0.0185 |
| - haerae_standard_nomenclature | acc | 0.8301 | ± | 0.0305 |
| kmmlu_direct | exact_match | 0.4205 | ± | 0.0026 |
| - kmmlu_direct_accounting | exact_match | 0.3700 | ± | 0.0485 |
| - kmmlu_direct_agricultural_sciences | exact_match | 0.3140 | ± | 0.0147 |
| - kmmlu_direct_aviation_engineering_and_maintenance | exact_match | 0.3870 | ± | 0.0154 |
| - kmmlu_direct_biology | exact_match | 0.3510 | ± | 0.0151 |
| - kmmlu_direct_chemical_engineering | exact_match | 0.3910 | ± | 0.0154 |
| - kmmlu_direct_chemistry | exact_match | 0.4000 | ± | 0.0200 |
| - kmmlu_direct_civil_engineering | exact_match | 0.4010 | ± | 0.0155 |
| - kmmlu_direct_computer_science | exact_match | 0.6520 | ± | 0.0151 |
| - kmmlu_direct_construction | exact_match | 0.3080 | ± | 0.0146 |
| - kmmlu_direct_criminal_law | exact_match | 0.3100 | ± | 0.0328 |
| - kmmlu_direct_ecology | exact_match | 0.4660 | ± | 0.0158 |
| - kmmlu_direct_economics | exact_match | 0.5385 | ± | 0.0439 |
| - kmmlu_direct_education | exact_match | 0.6200 | ± | 0.0488 |
| - kmmlu_direct_electrical_engineering | exact_match | 0.3000 | ± | 0.0145 |
| - kmmlu_direct_electronics_engineering | exact_match | 0.4740 | ± | 0.0158 |
| - kmmlu_direct_energy_management | exact_match | 0.3560 | ± | 0.0151 |
| - kmmlu_direct_environmental_science | exact_match | 0.2980 | ± | 0.0145 |
| - kmmlu_direct_fashion | exact_match | 0.4470 | ± | 0.0157 |
| - kmmlu_direct_food_processing | exact_match | 0.3690 | ± | 0.0153 |
| - kmmlu_direct_gas_technology_and_engineering | exact_match | 0.3000 | ± | 0.0145 |
| - kmmlu_direct_geomatics | exact_match | 0.3820 | ± | 0.0154 |
| - kmmlu_direct_health | exact_match | 0.5700 | ± | 0.0498 |
| - kmmlu_direct_industrial_engineer | exact_match | 0.3830 | ± | 0.0154 |
| - kmmlu_direct_information_technology | exact_match | 0.6090 | ± | 0.0154 |
| - kmmlu_direct_interior_architecture_and_design | exact_match | 0.5440 | ± | 0.0158 |
| - kmmlu_direct_korean_history | exact_match | 0.3800 | ± | 0.0488 |
| - kmmlu_direct_law | exact_match | 0.4670 | ± | 0.0158 |
| - kmmlu_direct_machine_design_and_manufacturing | exact_match | 0.3960 | ± | 0.0155 |
| - kmmlu_direct_management | exact_match | 0.5030 | ± | 0.0158 |
| - kmmlu_direct_maritime_engineering | exact_match | 0.4283 | ± | 0.0202 |
| - kmmlu_direct_marketing | exact_match | 0.7460 | ± | 0.0138 |
| - kmmlu_direct_materials_engineering | exact_match | 0.4020 | ± | 0.0155 |
| - kmmlu_direct_math | exact_match | 0.2867 | ± | 0.0262 |
| - kmmlu_direct_mechanical_engineering | exact_match | 0.3490 | ± | 0.0151 |
| - kmmlu_direct_nondestructive_testing | exact_match | 0.3760 | ± | 0.0153 |
| - kmmlu_direct_patent | exact_match | 0.3700 | ± | 0.0485 |
| - kmmlu_direct_political_science_and_sociology | exact_match | 0.5300 | ± | 0.0289 |
| - kmmlu_direct_psychology | exact_match | 0.4470 | ± | 0.0157 |
| - kmmlu_direct_public_safety | exact_match | 0.3520 | ± | 0.0151 |
| - kmmlu_direct_railway_and_automotive_engineering | exact_match | 0.3220 | ± | 0.0148 |
| - kmmlu_direct_real_estate | exact_match | 0.4350 | ± | 0.0351 |
| - kmmlu_direct_refrigerating_machinery | exact_match | 0.3240 | ± | 0.0148 |
| - kmmlu_direct_social_welfare | exact_match | 0.4970 | ± | 0.0158 |
| - kmmlu_direct_taxation | exact_match | 0.3800 | ± | 0.0344 |
| - kmmlu_direct_telecommunications_and_wireless_technology | exact_match | 0.5480 | ± | 0.0157 |
| kobest_boolq | acc | 0.9202 | ± | 0.0072 |
| | f1 | 0.9202 | ± | N/A |
| kobest_copa | acc | 0.8680 | ± | 0.0107 |
| | f1 | 0.8678 | ± | N/A |
| kobest_hellaswag | acc | 0.5560 | ± | 0.0222 |
| | f1 | 0.5520 | ± | N/A |
| | acc_norm | 0.6540 | ± | 0.0213 |
| kobest_sentineg | acc | 0.9824 | ± | 0.0066 |
| | f1 | 0.9824 | ± | N/A |
Citation
TBD
Acknowledgements
- Training support was provided by the TPU Research Cloud program.