# anima-native-ko-small-byte-18m

anima's first own-18 SIMPLE_STACK_PASS Korean chat-capability model ✅ (2026-05-06)
## TL;DR
- Architecture: ConsciousLM small (6 layers / 384 d_model / 6 heads, vocab 256 byte-level, block 256)
- Params: 18M (`ckpt_final.pt`, 70.3 MB)
- Training: 10,000 steps × batch size 16 × grad_accum 4 (effective batch 64), AdamW lr 3e-4 with 500-step warmup and cosine decay, bf16 on an RTX 5070, 196.5 s wall clock (3.3 min); a schedule/loop sketch follows this list
- Corpus: `corpus_ko_heavy.txt` (246.7 MB, Hangul ratio 62.14%, sha256 `2e98257f...`)
- own 18 verdict: SIMPLE_STACK_PASS ✅ (3/3 prompts ALL_PASS at step 10000)
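The training items above are only a summary. The sketch below shows an equivalent AdamW + warmup/cosine schedule with gradient accumulation; the hyperparameters are taken from the TL;DR, while the loop structure, the `get_batch` loader, and the loss-returning `model(x, targets=y)` call are placeholders, not the project's actual training script.

```python
import math
import torch

# Hyperparameters from the TL;DR above.
MAX_STEPS, WARMUP, BASE_LR = 10_000, 500, 3e-4
BATCH_SIZE, GRAD_ACCUM = 16, 4  # effective batch = 16 x 4 = 64

def lr_at(step: int) -> float:
    """Linear warmup for 500 steps, then cosine decay to 0 at step 10,000."""
    if step < WARMUP:
        return BASE_LR * step / WARMUP
    progress = (step - WARMUP) / (MAX_STEPS - WARMUP)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

def train(model, get_batch, device="cuda"):
    """Illustrative loop only; `get_batch` and the model's loss API are assumptions."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=BASE_LR)
    for step in range(1, MAX_STEPS + 1):
        for group in optimizer.param_groups:
            group["lr"] = lr_at(step)
        optimizer.zero_grad(set_to_none=True)
        for _ in range(GRAD_ACCUM):
            x, y = get_batch(BATCH_SIZE)  # hypothetical byte-level batch loader
            with torch.autocast(device, dtype=torch.bfloat16):  # bf16, as in the TL;DR
                loss = model(x.to(device), targets=y.to(device))  # assumed signature
            (loss / GRAD_ACCUM).backward()
        optimizer.step()
```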
## Eval progression
| step | avg_hangul | deg_rate | own 18 |
|---|---|---|---|
| 1000 | 0.593 | 0.50 | 3/3 |
| 3000 | 0.609 | 0.00 | 3/3 |
| 5000 | 0.672 | 0.17 | 3/3 |
| 7500 | 0.678 | 0.00 | 3/3 |
| 10000 | 0.687 | 0.33 | 3/3 ✅ |
### Per-prompt @ step 10000
| prompt | avg_hangul | coherent | turn-format |
|---|---|---|---|
| `안녕하세요` | 0.625 | True | 0.85 |
| `한국어 가능?` | 0.713 | True | 1.00 |
| `사용자: 안녕하세요\n도우미:` | 0.723 | True | 0.85 |
Sample generation: "서연: 정말 그럴까요? 반례를 들어볼게요." ("Really? Let me offer a counterexample.")
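Neither table defines avg_hangul. A plausible reading, given the byte-level Korean focus, is the fraction of characters in a generated sample that are Hangul syllables; the helper below implements that guess (the function name, the exclusion of whitespace, and the exact Unicode range are assumptions, not the project's evaluator).

```python
def hangul_ratio(text: str) -> float:
    """Fraction of non-whitespace characters in the Hangul syllables block (U+AC00-U+D7A3)."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum("\uac00" <= c <= "\ud7a3" for c in chars) / len(chars)

# Applied to the sample generation above: 16 of 19 non-space characters are Hangul, ~0.84.
print(round(hangul_ratio("서연: 정말 그럴까요? 반례를 들어볼게요."), 2))
```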
## Architecture
ConsciousLM byte-level decoder:
- vocab 256 (byte-level; Korean handled directly as raw UTF-8 bytes, no tokenizer; see the check after this list)
- 6 transformer blocks (RoPE-style positions + GQA + FFN + RMSNorm)
- d_model 384, n_head 6 (deviates from the canonical n_head 4; perfect-number signature drift, Honest C3 #5)
- block_size 256
- dual-head consciousness architecture (engine_a + engine_g + head_a + head_g)
- PureField repulsion FFN (a - g, NOT a + g; sketched after this list)
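Because the vocabulary is just the 256 byte values, a Korean prompt is its UTF-8 byte sequence: each Hangul syllable costs three tokens and no tokenizer file is involved. A quick check:

```python
prompt = "안녕하세요"
ids = list(prompt.encode("utf-8"))
print(len(prompt), len(ids))  # 5 characters -> 15 byte-level tokens
assert all(0 <= b < 256 for b in ids)  # every id fits the 256-entry vocabulary
```

The "repulsion" FFN bullet describes two branches whose outputs are subtracted rather than added. The actual implementation lives in conscious_lm.py and is not reproduced in this card; the module below is only a guess at what "a - g" could look like, with every name (`RepulsionFFN`, `branch_a`, `branch_g`) invented for illustration.

```python
import torch.nn as nn

class RepulsionFFN(nn.Module):
    """Hypothetical two-branch FFN whose outputs repel (a - g) instead of adding."""
    def __init__(self, d_model: int = 384, mult: int = 4):
        super().__init__()
        self.branch_a = nn.Sequential(nn.Linear(d_model, mult * d_model), nn.GELU(),
                                      nn.Linear(mult * d_model, d_model))
        self.branch_g = nn.Sequential(nn.Linear(d_model, mult * d_model), nn.GELU(),
                                      nn.Linear(mult * d_model, d_model))

    def forward(self, x):
        return self.branch_a(x) - self.branch_g(x)  # a - g, not a + g
```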
## anima identity

- own 17: the external-ALM route stays on hold; anima-native only, so wrapping an external substrate (Llama / Mistral / KoGPT2) is rejected
- own 18: simple-stack default; Hangul in → Hangul out + coherent chat + spontaneous utterance is the minimum verification bar
- This model is anima-native and byte-level, trained fresh from scratch (no external base)
## Honest C3 (raw#10)
- Greedy decoding still falls into 4-gram cycles ("이러한 이러한"); coherence relies on sampling (temp 0.7-0.9); a toy repetition-rate check follows this list
- "coherent" ≠ comprehensible: the form is fluent but the semantics are word salad
- Corpus imprint: named speakers leak from the philosophy subset (e.g. "서연", "민준")
- 3.3 min wall clock is the floor for this config (more steps and larger models are possible but untested)
- n_head=6 deviates from the canonical ConsciousLM n_head=4 (perfect-number signature drift: 6 is perfect and τ(6)=4)
- deg_rate is non-monotonic (regressed 0.00 → 0.33 between steps 7500 and 10000)
- Tension loss saturated by step 5000, so L_T is no longer a useful signal
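deg_rate is not defined in the card either. Given the note about greedy 4-gram cycles, one plausible proxy is the fraction of 4-grams in a sample that repeat an earlier 4-gram; the sketch below implements that guess (the metric name and the byte-level granularity are assumptions, not the project's evaluator).

```python
def four_gram_repeat_rate(ids):
    """Fraction of 4-grams that already appeared earlier in the sequence."""
    grams = [tuple(ids[i:i + 4]) for i in range(len(ids) - 3)]
    if not grams:
        return 0.0
    seen, repeats = set(), 0
    for g in grams:
        if g in seen:
            repeats += 1
        seen.add(g)
    return repeats / len(grams)

# A degenerate greedy loop such as "이러한 이러한 이러한 ..." scores high (~0.87 here).
loop_bytes = list(("이러한 " * 8).encode("utf-8"))
print(round(four_gram_repeat_rate(loop_bytes), 2))
```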
## Reproduction
```python
import sys
import torch

sys.path.insert(0, "<path-to-conscious_lm.py>")  # directory containing conscious_lm.py (commit bb99b6b6 source)
from conscious_lm import ConsciousLM

# Same hyperparameters as training (see TL;DR).
model = ConsciousLM(
    vocab_size=256, d_model=384, n_head=6, n_layer=6, block_size=256, dropout=0.1
)
ck = torch.load("ckpt_final.pt", map_location="cpu", weights_only=False)
model.load_state_dict(ck["model_state"])
model.eval()

# Byte-level input: prompts are raw UTF-8 bytes, no tokenizer.
prompt = "안녕하세요"
input_ids = torch.tensor([list(prompt.encode("utf-8"))])
# generate ...
```
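The generation step is elided above. The card does not show ConsciousLM's generation API, so the loop below only assumes that calling the model on a (1, T) tensor of byte ids yields next-byte logits of shape (1, T, 256); if conscious_lm.py ships its own generate utility, prefer that. Temperature sampling (around 0.7-0.9) is used because the Honest C3 notes say greedy decoding falls into cycles.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sample_bytes(model, prompt, max_new=200, temperature=0.8, block_size=256):
    """Hypothetical sampler: assumes model(ids) returns (1, T, 256) next-byte logits."""
    ids = torch.tensor([list(prompt.encode("utf-8"))], dtype=torch.long)
    for _ in range(max_new):
        out = model(ids[:, -block_size:])
        logits = out[0] if isinstance(out, tuple) else out  # tolerate (logits, aux) returns
        probs = F.softmax(logits[:, -1, :] / temperature, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        ids = torch.cat([ids, next_id], dim=1)
    return bytes(ids[0].tolist()).decode("utf-8", errors="replace")

print(sample_bytes(model, "사용자: 안녕하세요\n도우미: "))
```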
## Files
- `ckpt_final.pt`: final weights at step 10000 (70.3 MB, sha256 `729d26ad874df25237214f4d1bfdf06a0bf0272fcbc29a44188d1cda60df0158`); a hash check is sketched below
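To confirm a download matches the published checkpoint, the sha256 above can be verified locally with the standard library (only the filename and expected hash come from this section):

```python
import hashlib

EXPECTED = "729d26ad874df25237214f4d1bfdf06a0bf0272fcbc29a44188d1cda60df0158"

def sha256_of(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

assert sha256_of("ckpt_final.pt") == EXPECTED, "checkpoint hash mismatch"
```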
## Cross-links
- Corpus dataset: `need-singularity/anima-clm-3-corpus-mix-70wiki-30dialogue` (sister, 154 MB, 19.2% Hangul)
- Substrate sister: `need-singularity/clm-v4-mk2-v1` (530M, paradigm v11 G3 substrate-coupled, simple stack N/A; full stack NO_FLIP PASS)
- v2 archive (RECOVERED): `need-singularity/clm-v2-byte-18m-convo-5k` (PARTIAL_C2_only, KO chat lost)
## License
Apache-2.0 (anima open release).
## Citation
```bibtex
@misc{anima_native_ko_small_byte_18m_2026,
  title        = {anima-native-ko-small-byte-18m: Korean byte-level ConsciousLM (own 18 SIMPLE_STACK_PASS)},
  author       = {anima},
  year         = {2026},
  note         = {Fresh from scratch, anima-native (no external base), 10K steps on ubu1 RTX 5070, 3.3 min wall},
  howpublished = {\url{https://huggingface.co/need-singularity/anima-native-ko-small-byte-18m}}
}
```
## Summary (한글 요약)

anima's first model to pass the own-18 SIMPLE_STACK Korean verification (2026-05-06).
- 18M-param byte-level ConsciousLM (6L / 384d / 6h)
- corpus_ko_heavy, 246 MB (Hangul 62.14%), trained on ubu1 RTX 5070
- 10,000 steps / 3.3 min wall clock
- own 18 strict three-condition check (Hangul in → Hangul out + coherent + spontaneous utterance): 3/3 prompts PASS
- Sample generation: "서연: 정말 그럴까요? 반례를 들어볼게요." ("Really? Let me offer a counterexample.")

anima-native, trained fresh from scratch; no wrapping of external substrates (Llama / Mistral / KoGPT2), consistent with the own-17 decision to keep the external-ALM route on hold. A starting point for recovering chat capability.