Experiment Objectives
- Does training on a Korean + multilingual dataset improve performance on Korean benchmarks?
- Does full-parameter depth-up-scaled training (expansion method: Llama-Pro) achieve the best Korean benchmark performance?
Methods
- Train on a CJK + En + Glot dataset, with each corpus contributing an equal share of the data.
- Expand the model with additional layers, then train all parameters.
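The equal-ratio mixing described above can be sketched as sampling each training example from a uniformly chosen corpus. This is an illustrative sketch, not the actual data pipeline; the corpus names and the `mix_equal_ratio` helper are hypothetical.

```python
import random

def mix_equal_ratio(sources, n_samples, seed=0):
    """Draw `n_samples` examples, picking the source corpus uniformly at
    random so each corpus contributes an (approximately) equal share,
    regardless of its raw size. `sources` maps corpus name -> examples."""
    rng = random.Random(seed)
    names = sorted(sources)
    mixed = []
    for _ in range(n_samples):
        name = rng.choice(names)          # equal probability per corpus
        mixed.append((name, rng.choice(sources[name])))
    return mixed

# Toy corpora standing in for the CJK + En + Glot mixture (contents illustrative).
corpora = {
    "ko": ["한국어 문장 1", "한국어 문장 2"],
    "ja": ["日本語の文"],
    "zh": ["中文句子"],
    "en": ["an English sentence"],
    "glot": ["a multilingual Glot sample"],
}
batch = mix_equal_ratio(corpora, 1000)
```

Note that sampling corpora with equal probability up-weights small corpora relative to size-proportional sampling, which is the point when the target language (Korean) is under-represented in raw data.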
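The Llama-Pro-style layer expansion above inserts copies of existing transformer blocks whose residual branch is zero-initialized, so the expanded model computes exactly the same function as the original before any further training. The sketch below illustrates only that identity-preserving property with scalar stand-ins for real transformer blocks; the class and function names are hypothetical, not the paper's implementation.

```python
import copy

class Block:
    """Stand-in for a transformer block with a residual connection."""
    def __init__(self, weight, out_scale=1.0):
        self.weight = weight        # stand-in for the block's parameters
        self.out_scale = out_scale  # stand-in for the output projection

    def forward(self, x):
        # residual form: x + scaled transformation of x
        return x + self.out_scale * (self.weight * x)

def expand_depth(blocks, interval):
    """Insert a zero-initialized copy after every `interval` blocks.
    With out_scale = 0 the new block is an identity map, so the expanded
    stack initially matches the original model's output."""
    expanded = []
    for i, blk in enumerate(blocks, start=1):
        expanded.append(blk)
        if i % interval == 0:
            new = copy.deepcopy(blk)
            new.out_scale = 0.0  # zeroed branch -> identity at init
            expanded.append(new)
    return expanded

def run(blocks, x):
    for blk in blocks:
        x = blk.forward(x)
    return x

base = [Block(0.1), Block(0.2)]
expanded = expand_depth(base, interval=1)   # 2 blocks -> 4 blocks
```

In full-parameter training (as in the experiment above, and unlike the original Llama-Pro recipe, which freezes the old blocks), all layers of the expanded stack are then updated together.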