Add files using upload-large-folder tool
- README.md +2 -2
- model.safetensors +1 -1
README.md
CHANGED
@@ -43,7 +43,7 @@ The main model repository is intended to expose the latest model-only artifact:

 It is not intended to keep every training checkpoint as visible model files. Intermediate FSDP2 `.distcp` checkpoints are large resume artifacts and are kept separately in `LLM-OS-Models/KoHRM-Text-1.4B-raw-checkpoints` when needed. The main repo may still have normal Hugging Face git history, but the current file tree should be treated as the latest public model export.

-Current public artifact: `stage3` local-terminal continuation checkpoint at `
+Current public artifact: `stage3` local-terminal continuation checkpoint at `step_180000`, converted with EMA weights to `safetensors`. Training is still in progress; this is an intermediate checkpoint from the ongoing `stage3-local-terminal` run.

 ## Model Details

@@ -224,7 +224,7 @@ Current long-running stage-2 settings:
 | LR | 2.2e-4 |
 | LR warmup | 2,000 steps |
 | Checkpoint interval | 10,000 steps |
-| Current public export | `stage3
+| Current public export | `stage3 step_180000`, EMA, safetensors |

 The run uses staged continuation. The checkpoint carries model, optimizer, EMA, and recurrent carry state forward. `resume_step_offset` and `total_steps_override` are used so the learning-rate schedule follows the intended longer pretraining run rather than resetting at every data stage.

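The README text in this diff says that `resume_step_offset` and `total_steps_override` keep the learning-rate schedule aligned with the longer run across data stages. The following is a minimal sketch of how such an offset could work, not the project's actual scheduler. The peak LR (2.2e-4) and warmup length (2,000 steps) come from the settings table; the cosine decay shape, the `min_ratio` floor, the `stage_steps` default, and the function name `lr_at` are assumptions made for illustration.

```python
import math


def lr_at(local_step, *, resume_step_offset=0, total_steps_override=None,
          stage_steps=10_000, base_lr=2.2e-4, warmup_steps=2_000,
          min_ratio=0.1):
    """Learning rate for a step counted from the start of the current stage.

    Hypothetical semantics: the schedule is evaluated at the global step
    (offset + local step) against the total length of the whole run, so
    resuming into a new data stage does not replay warmup.
    """
    global_step = resume_step_offset + local_step
    total = total_steps_override if total_steps_override is not None else stage_steps

    # Linear warmup over the first `warmup_steps` global steps.
    if global_step < warmup_steps:
        return base_lr * (global_step + 1) / warmup_steps

    # Cosine decay (assumed shape) from base_lr down to min_ratio * base_lr.
    progress = min(1.0, (global_step - warmup_steps) / max(1, total - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return base_lr * (min_ratio + (1.0 - min_ratio) * cosine)


# Without an offset, a new stage would start again in warmup.
fresh = lr_at(0)

# With an offset and an overridden total, the stage continues the long schedule.
# The total of 500,000 steps is an illustrative value, not taken from the README.
resumed = lr_at(0, resume_step_offset=180_000, total_steps_override=500_000)
```

Under these assumptions, a stage resumed at step 180,000 picks up exactly where an uninterrupted run would be at that step, instead of dropping back to the warmup learning rate.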
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b58990a81bc865eba09890f6f53bd2080d1e5b901e647e80b0864ee0bd6e7b2d
 size 2768259784
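The `model.safetensors` entry above is a Git LFS pointer: it records the SHA-256 digest and byte size of the real file rather than the weights themselves. As a rough sketch, a downloaded copy could be checked against the pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is whatever location the file was downloaded to.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a multi-gigabyte file.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Stream in chunks so the whole file is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:b58990a81bc865eba09890f6f53bd2080d1e5b901e647e80b0864ee0bd6e7b2d
size 2768259784
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
```

Calling `matches_pointer("model.safetensors", pointer)` on a local download would then report whether it matches the export referenced by this commit.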