base_model: complexly/olmo3-190m-zh-full
license: apache-2.0
language:
  - zh
tags:
  - llm001
  - olmo3
  - chinese
  - continued-pretraining

complexly/olmo3-190m-zh-continue

Continued-pretraining version: starting from complexly/olmo3-190m-zh-full, the model was trained for one additional epoch on the 42ailab/llm101-v3.1-data dataset to strengthen its grasp of facts and logic. Over the run, training loss fell from 3.19 to about 2.60, and eval loss ended at about 1.84.

Training configuration

  • Data: 42ailab/llm101-v3.1-data/full_v31.bin
  • Hardware: A800 GPUs, Slurm cluster with Apptainer containers
  • LR: 2e-4 (a low LR to prevent catastrophic forgetting)
  • Warmup: 10%
  • max_steps=-1, bs=25×3=75 (see the sketch after this list)
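
A minimal sketch of how these settings map onto Hugging Face TrainingArguments. The original training script is not shown on this card, so this is an assumption: output_dir is hypothetical, and the 25×3 split is read as per-device batch 25 on 3 GPUs.

from transformers import TrainingArguments

# Hedged sketch, not the original script.
args = TrainingArguments(
    output_dir="olmo3-190m-zh-continue",  # hypothetical output path
    num_train_epochs=1,                   # one continued-pretraining epoch
    max_steps=-1,                         # defer to num_train_epochs
    per_device_train_batch_size=25,       # assumed: 25 per device × 3 GPUs = 75
    learning_rate=2e-4,                   # low LR against catastrophic forgetting
    warmup_ratio=0.1,                     # 10% warmup
)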

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the continued-pretraining checkpoint and its tokenizer
model = AutoModelForCausalLM.from_pretrained("complexly/olmo3-190m-zh-continue")
tok = AutoTokenizer.from_pretrained("complexly/olmo3-190m-zh-continue")
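
A minimal generation sketch following the load above; the prompt and decoding settings are illustrative, not from the original card.

import torch

prompt = "中国的首都是"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=50)
print(tok.decode(out[0], skip_special_tokens=True))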