complexly committed c6e332c (verified) · 1 parent: 0618a75

llm001 L04 Full: 190M, 1 epoch, new tokenizer

Files changed (5)
  1. README.md +2 -2
  2. chat_template.jinja +8 -0
  3. model.safetensors +1 -1
  4. tokenizer.json +0 -0
  5. training_args.bin +1 -1
README.md CHANGED
@@ -12,7 +12,7 @@ tags:
 
 # OLMo3-190M-zh-full
 
-The L04 Full model (190M parameters, trained for 1 full epoch) for the beginner-level AI large-model R&D training camp (llm001). After full training: training loss 3.486, eval loss 3.417
+The L04 Full model (190M parameters, trained for 1 full epoch) for the beginner-level AI large-model R&D training camp (llm001). After full training: training loss 3.521, eval loss 3.450
 
 ## Model configuration
 
@@ -21,7 +21,7 @@ tags:
 
 ## Training configuration
 
-- Data: cmz1024/llm101-olmo3-zh-demo-data (500M tokens)
+- Data: cmz1024/llm101-olmo3-zh-demo-data (500M tokens), re-converted with the tokenizer from 42ailab/OLMo3-190M-zh
 - Training: A800, max_steps=-1, bs=24×5=120, lr=5e-4, bf16
 
 ## Usage
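The README change above reports slightly higher losses after the tokenizer swap. As a rough aid for reading those numbers, the sketch below converts each loss to perplexity and recomputes the effective batch size from `bs=24×5=120`. It assumes the losses are mean per-token cross-entropy in nats, which the README does not state, and the helper name `perplexity` is illustrative. Because the two runs use different tokenizers, their per-token losses and perplexities are not directly comparable.

```python
import math

# Losses quoted in the README before and after this commit.
# Assumption: mean per-token cross-entropy in nats (not stated in the README).
losses = {
    "old tokenizer": {"train": 3.486, "eval": 3.417},
    "new tokenizer": {"train": 3.521, "eval": 3.450},
}


def perplexity(loss: float) -> float:
    """Perplexity is exp(mean cross-entropy) when the loss is in nats."""
    return math.exp(loss)


for run, vals in losses.items():
    train_ppl = perplexity(vals["train"])
    eval_ppl = perplexity(vals["eval"])
    print(f"{run}: train ppl={train_ppl:.2f}, eval ppl={eval_ppl:.2f}")

# Effective batch size from the training line: bs=24×5=120.
# The README does not say what the two factors are (for example per-device
# batch size and gradient-accumulation steps), so they stay unnamed here.
effective_batch_size = 24 * 5
print("effective batch size:", effective_batch_size)
```

A fairer cross-tokenizer comparison would normalize by characters or bytes rather than tokens, which needs token counts that this commit does not provide.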
chat_template.jinja ADDED
@@ -0,0 +1,8 @@
+{% for message in messages %}{% if message['role'] == 'system' %}<|im_start|>system
+{{ message['content'] }}<|im_end|>
+{% elif message['role'] == 'user' %}<|im_start|>user
+{{ message['content'] }}<|im_end|>
+{% elif message['role'] == 'assistant' %}{% generation %}<|im_start|>assistant
+{{ message['content'] }}<|im_end|>
+{% endgeneration %}{% endif %}{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant
+{% endif %}
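To make the template's output concrete, here is a minimal plain-Python sketch of the text it appears to produce. `render_chat` is a hypothetical helper, not part of any library. It ignores the `{% generation %}` markers, which do not change the rendered text (in Transformers they are understood to mark assistant spans for the assistant-token mask), and it assumes no extra whitespace is introduced by the template engine.

```python
from typing import Dict, List

# Roles the template has branches for; any other role renders nothing.
KNOWN_ROLES = ("system", "user", "assistant")


def render_chat(messages: List[Dict[str, str]], add_generation_prompt: bool = False) -> str:
    """Plain-Python approximation of chat_template.jinja (hypothetical helper)."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in KNOWN_ROLES:
            continue  # the template has no else branch for other roles
        parts.append(f"<|im_start|>{role}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chat(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "你好"},
    ],
    add_generation_prompt=True,
)
print(prompt)
```

In practice the template would be applied through the tokenizer's chat-template support rather than reimplemented; the sketch only illustrates the layout of each turn.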
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3147f0f3a8e497b38b0597b06335b82a9b955c098daa16ba8d5b31672ed1d2dd
+oid sha256:951c04c4c0d96c7cf9f2bdabe21a71446cb346d9af3ee0f90ec6bffab7b127b6
 size 748062408
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cd5cf790556af41f1702786492c81bc76db90675b4214b492a337b993cd72624
+oid sha256:36eb6f872a7d5799854ad64917eadb060a6692f3a02426435945b3d2a836c931
 size 4920
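The `model.safetensors` and `training_args.bin` entries are Git LFS pointer files: only the `oid` changes in this commit, while `size` stays the same. A minimal sketch of parsing such a pointer and checking downloaded bytes against it could look like the following; `LfsPointer`, `parse_lfs_pointer`, and `matches_pointer` are illustrative names, not an existing API.

```python
import hashlib
from typing import NamedTuple


class LfsPointer(NamedTuple):
    version: str
    algo: str
    digest: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return LfsPointer(fields["version"], algo, digest, int(fields["size"]))


def matches_pointer(data: bytes, pointer: LfsPointer) -> bool:
    """Check raw bytes against the pointer's size and SHA-256 digest."""
    if pointer.algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer.algo}")
    return len(data) == pointer.size and hashlib.sha256(data).hexdigest() == pointer.digest


# The new training_args.bin pointer from this commit.
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:36eb6f872a7d5799854ad64917eadb060a6692f3a02426435945b3d2a836c931\n"
    "size 4920\n"
)
print(new_pointer.size, new_pointer.digest[:8])
```

For a file as large as `model.safetensors`, hashing in chunks from disk would be preferable to loading all bytes into memory as `matches_pointer` does.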