LLM-OS-Models/KoHRM-Text-1.4B


gyung committed 1 day ago

Commit 5dae8f6 (verified) · 1 parent: 094caab

Add files using upload-large-folder tool

Files changed (2)

README.md +2 -2
model.safetensors +1 -1

README.md CHANGED

@@ -43,7 +43,7 @@ The main model repository is intended to expose the latest model-only artifact:
 It is not intended to keep every training checkpoint as visible model files. Intermediate FSDP2 `.distcp` checkpoints are large resume artifacts and are kept separately in `LLM-OS-Models/KoHRM-Text-1.4B-raw-checkpoints` when needed. The main repo may still have normal Hugging Face git history, but the current file tree should be treated as the latest public model export.
-Current public artifact: `stage3` local-terminal continuation checkpoint at `step_170000`, converted with EMA weights to `safetensors`. Training is still in progress; this is an intermediate checkpoint from the ongoing `stage3-local-terminal` run.
+Current public artifact: `stage3` local-terminal continuation checkpoint at `step_180000`, converted with EMA weights to `safetensors`. Training is still in progress; this is an intermediate checkpoint from the ongoing `stage3-local-terminal` run.
 ## Model Details
@@ -224,7 +224,7 @@ Current long-running stage-2 settings:
 | LR | 2.2e-4 |
 | LR warmup | 2,000 steps |
 | Checkpoint interval | 10,000 steps |
-| Current public export | `stage3 step_170000`, EMA, safetensors |
+| Current public export | `stage3 step_180000`, EMA, safetensors |
 The run uses staged continuation. The checkpoint carries model, optimizer, EMA, and recurrent carry state forward. `resume_step_offset` and `total_steps_override` are used so the learning-rate schedule follows the intended longer pretraining run rather than resetting at every data stage.
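
The staged-continuation note in the README hunk above can be illustrated with a minimal sketch. It assumes `resume_step_offset` shifts the stage-local step onto a global step axis and `total_steps_override` replaces the schedule length. The cosine decay shape, `min_lr_ratio`, `stage_steps`, the function name `lr_at`, and the 1,000,000-step total in the usage note are illustrative assumptions, not taken from the training code; only the peak LR of 2.2e-4 and the 2,000-step warmup come from the settings table.

```python
import math


def lr_at(local_step, *, resume_step_offset=0, total_steps_override=None,
          stage_steps=10_000, base_lr=2.2e-4, warmup_steps=2_000,
          min_lr_ratio=0.1):
    """Learning rate for a step counted from the start of the current stage.

    Assumed semantics (not confirmed by the source):
      - resume_step_offset: steps already completed in earlier stages.
      - total_steps_override: length of the full intended run; when None,
        the schedule only spans the current stage.
    The cosine decay and min_lr_ratio are placeholders for whatever decay
    the real run uses.
    """
    global_step = resume_step_offset + local_step
    total_steps = (total_steps_override
                   if total_steps_override is not None else stage_steps)

    # Linear warmup on the global step axis.
    if global_step < warmup_steps:
        return base_lr * (global_step + 1) / warmup_steps

    # Cosine decay from base_lr down to base_lr * min_lr_ratio.
    progress = min(1.0, (global_step - warmup_steps)
                   / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return base_lr * (min_lr_ratio + (1.0 - min_lr_ratio) * cosine)
```

Under these assumptions, `lr_at(0)` with no offset restarts at the bottom of the warmup ramp, whereas `lr_at(0, resume_step_offset=170_000, total_steps_override=1_000_000)` continues from the point on the longer schedule where the previous stage stopped.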

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9e51bccfd3416772b2b8c4b9e5dd23054863aee20b2a637f70ee914372a6a262
+oid sha256:b58990a81bc865eba09890f6f53bd2080d1e5b901e647e80b0864ee0bd6e7b2d
 size 2768259784
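
Because the file size is identical before and after this commit, only the `oid` distinguishes the two exports. A minimal sketch of checking a downloaded `model.safetensors` against a Git LFS pointer follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the three-line pointer layout shown in the diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer of the form shown in the diff above."""
    # Each line is "<key> <value>"; split only on the first space.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and hash."""
    path = Path(path)
    # Cheap check first: a size mismatch rules the file out without hashing.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Stream in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, parsing the new pointer text and calling `matches_pointer("model.safetensors", pointer)` on a local copy would indicate whether that copy is the `step_180000` export rather than the earlier one.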