llm001 L04 Full: 190M, 1 epoch, new tokenizer
- README.md +2 -2
- chat_template.jinja +8 -0
- model.safetensors +1 -1
- tokenizer.json +0 -0
- training_args.bin +1 -1
README.md
CHANGED
@@ -12,7 +12,7 @@ tags:
 
 # OLMo3-190M-zh-full
 
-L04 Full model (190M parameters, trained for 1 full epoch) for the beginner-level AI large-model R&D bootcamp (llm001). Full training of this model gives training loss 3.
+L04 Full model (190M parameters, trained for 1 full epoch) for the beginner-level AI large-model R&D bootcamp (llm001). Full training of this model gives training loss 3.521, eval loss 3.450.
 
 ## Model configuration
 
@@ -21,7 +21,7 @@ tags:
 
 ## Training configuration
 
-- Data: cmz1024/llm101-olmo3-zh-demo-data (500M tokens)
+- Data: cmz1024/llm101-olmo3-zh-demo-data (500M tokens), re-converted with the tokenizer from 42ailab/OLMo3-190M-zh
 - Training: A800, max_steps=-1, bs=24×5=120, lr=5e-4, bf16
 
 ## Usage
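The README lines in this diff quote `bs=24×5=120` and two loss values. As a rough sanity check, the arithmetic can be sketched as below. Two assumptions are not stated in the diff: that the second factor (5) is a gradient-accumulation or device count, and that the losses are mean per-token cross-entropy in nats, so that `exp(loss)` can be read as perplexity. The function names are illustrative only.

```python
import math

# Values quoted in the README diff.
MICRO_BATCH = 24    # first factor of "bs=24×5"
MULTIPLIER = 5      # second factor; ASSUMED to be grad-accumulation steps or device count
TRAIN_LOSS = 3.521
EVAL_LOSS = 3.450


def effective_batch_size(micro_batch: int, multiplier: int) -> int:
    """Sequences consumed per optimizer step."""
    return micro_batch * multiplier


def perplexity(loss_nats: float) -> float:
    """Perplexity for a mean per-token cross-entropy given in nats (assumed)."""
    return math.exp(loss_nats)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(MICRO_BATCH, MULTIPLIER))
    print(f"train perplexity: {perplexity(TRAIN_LOSS):.1f}")
    print(f"eval perplexity:  {perplexity(EVAL_LOSS):.1f}")
```

Perplexities are only comparable between runs that share a tokenizer, which matters here because this commit switches the tokenizer.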
chat_template.jinja
ADDED
@@ -0,0 +1,8 @@
+{% for message in messages %}{% if message['role'] == 'system' %}<|im_start|>system
+{{ message['content'] }}<|im_end|>
+{% elif message['role'] == 'user' %}<|im_start|>user
+{{ message['content'] }}<|im_end|>
+{% elif message['role'] == 'assistant' %}{% generation %}<|im_start|>assistant
+{{ message['content'] }}<|im_end|>
+{% endgeneration %}{% endif %}{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant
+{% endif %}
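To make the prompt layout produced by this template easier to see, here is a minimal pure-Python sketch of the same formatting. `render_chat` is a hypothetical helper, not part of this repository or of any library; it only mirrors the text the template emits. It does not reproduce the `{% generation %}` / `{% endgeneration %}` markers, which are a `transformers` extension used to locate assistant tokens and are not understood by plain Jinja2.

```python
from typing import Dict, List

# Roles the template has a branch for; anything else produces no output.
KNOWN_ROLES = ("system", "user", "assistant")


def render_chat(messages: List[Dict[str, str]], add_generation_prompt: bool = False) -> str:
    """Hypothetical pure-Python mirror of chat_template.jinja."""
    parts = []
    for message in messages:
        role = message["role"]
        if role in KNOWN_ROLES:  # the template has no else branch
            parts.append(f"<|im_start|>{role}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    demo = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "你好"},
    ]
    print(render_chat(demo, add_generation_prompt=True))
```

In practice the template would normally be applied through the tokenizer's `apply_chat_template` method rather than a hand-written function; the sketch is only meant to show the resulting string.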
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:951c04c4c0d96c7cf9f2bdabe21a71446cb346d9af3ee0f90ec6bffab7b127b6
 size 748062408
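The three lines above are a Git LFS pointer, not the weights themselves. A minimal sketch of how a downloaded file could be checked against such a pointer follows, assuming the pointer fields are `key value` pairs and the `oid` is the SHA-256 of the file contents. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names introduced for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict

# Pointer text as it appears on the new side of the diff.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:951c04c4c0d96c7cf9f2bdabe21a71446cb346d9af3ee0f90ec6bffab7b127b6
size 748062408
"""


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer: Dict[str, str], chunk_size: int = 1 << 20) -> bool:
    """Check a local file's size and SHA-256 against the pointer fields."""
    algo, _, expected_hex = pointer["oid"].partition(":")
    if algo != "sha256" or path.stat().st_size != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in chunks so a large file is never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex
```

For example, `matches_pointer(Path("model.safetensors"), parse_lfs_pointer(POINTER_TEXT))` would report whether a local copy matches this commit, assuming the file sits in the current directory.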
tokenizer.json
CHANGED
The diff for this file is too large to render.
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:36eb6f872a7d5799854ad64917eadb060a6692f3a02426435945b3d2a836c931
 size 4920