Commit dc77eb0 (verified) by omitakahiro · 1 parent: 9ca9af4

Update README.md

  1. README.md +3 -1
README.md CHANGED
@@ -12,7 +12,9 @@ language:
 
 ## Model description
 
-**Stockmark-2-100B-Instruct-beta** is a 100B parameter large language model specialized in Japanese. It was built from scratch through pre-training with 1.5T tokens and post-training.
+**Stockmark-2-100B-Instruct-beta** is a 100B parameter large language model built from scratch, which is particularly specialized in Japanese. The model was pretrained on approximately 1.5 trillion tokens of data (60% English, 30% Japanese, 10% Code). After pretraining, the model underwent post-training using synthetic data to enhance its instruction-following abilities. The synthetic data was generated using Qwen2.5-32B-Instruct.
+
+As a beta release, Stockmark-2-100b-Instruct-beta is still undergoing improvements and evaluations. Feedback and insights from users will help refine future versions.
 
 See [our blog](???) for the detail.
 
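
For a rough sense of scale, the pretraining mix quoted in the added paragraph can be turned into per-source token counts. This is only a back-of-the-envelope sketch: it assumes the percentages are shares of the token count (the README does not say so explicitly) and treats "approximately 1.5 trillion" as exactly 1.5e12. The names `TOTAL_TOKENS`, `MIX_PERCENT`, and `tokens_per_source` are hypothetical and do not come from the repository.

```python
# Illustrative arithmetic only. Assumptions: the stated shares are fractions
# of the token count, and "approximately 1.5 trillion" is taken as 1.5e12.
TOTAL_TOKENS = 1_500_000_000_000

# Percent shares as quoted in the model description.
MIX_PERCENT = {"English": 60, "Japanese": 30, "Code": 10}


def tokens_per_source(total, mix_percent):
    """Return the approximate token count for each data source."""
    if sum(mix_percent.values()) != 100:
        raise ValueError("shares must sum to 100%")
    # Integer arithmetic avoids floating-point rounding on large counts.
    return {name: total * pct // 100 for name, pct in mix_percent.items()}


breakdown = tokens_per_source(TOTAL_TOKENS, MIX_PERCENT)
for name, count in breakdown.items():
    print(f"{name}: ~{count / 1e12:.2f}T tokens")
```

Under those assumptions the split works out to roughly 0.9T English, 0.45T Japanese, and 0.15T code tokens.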