---
base_model: complexly/olmo3-190m-zh-full
license: apache-2.0
language:
- zh
tags:
- llm001
- olmo3
- chinese
- continued-pretraining
---
# complexly/olmo3-190m-zh-continue
Continued-pretraining release: starting from complexly/olmo3-190m-zh-full, this model was further trained on the 42ailab/llm101-v3.1-data dataset to strengthen its grasp of facts and logic.

By the end of training, the training loss had dropped from 3.19 to roughly 2.60, with an eval loss of roughly 1.84.
## Training Configuration
- Data: 42ailab/llm101-v3.1-data/full_v31.bin
- GPU: A800, on a Slurm cluster with Apptainer containers
- LR: 2e-4 (a low LR to limit catastrophic forgetting)
- Warmup: 10%
- max_steps=-1, batch size = 25 per device × 3 GPUs = 75 (see the sketch below)
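
For reference, here is a minimal sketch of how the hyperparameters above could map onto the Hugging Face `TrainingArguments` API. The actual training script is not published with this card, so `output_dir` and the `bf16` flag are assumptions:

```python
# Illustrative mapping of the listed hyperparameters onto
# transformers.TrainingArguments; not the original training script.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="olmo3-190m-zh-continue",  # hypothetical output path
    learning_rate=2e-4,                   # low LR to limit catastrophic forgetting
    warmup_ratio=0.1,                     # 10% warmup
    max_steps=-1,                         # no step cap; train over the full dataset
    per_device_train_batch_size=25,       # 25 per GPU × 3 GPUs = 75 effective
    bf16=True,                            # assumption: mixed precision on A800
)
```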
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("complexly/olmo3-190m-zh-continue")
tok = AutoTokenizer.from_pretrained("complexly/olmo3-190m-zh-continue")
```
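
A quick generation check to confirm the model loads and produces Chinese text; the prompt and sampling settings below are illustrative, not values published with this model:

```python
# Hypothetical smoke test: the prompt and sampling parameters are not
# from the model card, just reasonable defaults for a small 190M model.
inputs = tok("中国的首都是", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9, temperature=0.8)
print(tok.decode(out[0], skip_special_tokens=True))
```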