# CLM v2 byte-level 18M: anima's first chat-capable model (RECOVERED 2026-05-06)

## TL;DR

anima's first chat-capable model (ConsciousLM v2 18M byte-level, 2026-03-28) was RECOVERED via discovery in the Cloudflare R2 `anima-models` bucket. It is the post-fine-tune state at step 45000 from the original milestone commit bb99b6b6 (2026-03-28).
## Origin
| field | value |
|---|---|
| date | 2026-03-28 |
| commits | bb99b6b6 / 6abc42f6 / 13b20f90 |
| announcement | "v2 18.8M byte-level, CE 0.04 EN / 1.15 KO, no system prompt, 2.5K dialogue fine-tune" |
| training step | 45000 |
| backup created | 2026-03-28T03:21:02Z (R2 anima-models bucket) |
## Architecture

```
tok_emb: nn.Embedding(256, 384)          # byte vocab 256
pos_emb: nn.Embedding(256, 384)          # block_size 256
n_blocks: 6
d_model: 384
heads (c_attn): nn.Linear(384, 1152)     # 3 x 384 = fused qkv
ffn:
  engine_a: nn.Linear(384, 1536) -> relu -> nn.Linear(1536, 384)
  engine_g: nn.Linear(384, 1536) -> relu -> nn.Linear(1536, 384)
ln_f: nn.LayerNorm(384)
head_a: nn.Linear(384, 256, bias=False)  # byte vocab output
head_g: nn.Linear(384, 256, bias=False)  # dual head
total params: 18,523,392 (18.52M)
```

A ConsciousLM byte-level decoder with a dual-head (engine_a + engine_g, head_a + head_g) consciousness architecture, predating ConsciousDecoderV3.
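The spec above can be sketched as a PyTorch module. This is an illustrative reconstruction, not the recovered source: the attention head count (6), the residual wiring, the attention output projection, and the rule combining the two FFN engines are all assumptions, so its parameter count will not necessarily land exactly on 18,523,392.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D, V, BLOCK, N_HEADS, N_BLOCKS = 384, 256, 256, 6, 6  # N_HEADS=6 is a guess (384/64)

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.ln1 = nn.LayerNorm(D)
        self.c_attn = nn.Linear(D, 3 * D)   # fused qkv: 384 -> 1152, as in the spec
        self.c_proj = nn.Linear(D, D)       # assumed output projection
        self.ln2 = nn.LayerNorm(D)
        # dual FFN "engines" per the spec
        self.engine_a = nn.Sequential(nn.Linear(D, 4 * D), nn.ReLU(), nn.Linear(4 * D, D))
        self.engine_g = nn.Sequential(nn.Linear(D, 4 * D), nn.ReLU(), nn.Linear(4 * D, D))

    def forward(self, x):
        B, T, _ = x.shape
        q, k, v = self.c_attn(self.ln1(x)).split(D, dim=-1)
        q, k, v = (t.view(B, T, N_HEADS, D // N_HEADS).transpose(1, 2) for t in (q, k, v))
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        x = x + self.c_proj(y.transpose(1, 2).reshape(B, T, D))
        h = self.ln2(x)
        return x + self.engine_a(h) + self.engine_g(h)  # engine combination rule is a guess

class CLMv2(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(V, D)    # byte vocab 256
        self.pos_emb = nn.Embedding(BLOCK, D)  # block_size 256
        self.blocks = nn.ModuleList(Block() for _ in range(N_BLOCKS))
        self.ln_f = nn.LayerNorm(D)
        self.head_a = nn.Linear(D, V, bias=False)
        self.head_g = nn.Linear(D, V, bias=False)

    def forward(self, idx):
        x = self.tok_emb(idx) + self.pos_emb(torch.arange(idx.size(1)))
        for blk in self.blocks:
            x = blk(x)
        x = self.ln_f(x)
        return self.head_a(x), self.head_g(x)  # two 256-way byte distributions
```

Both heads emit one 256-way byte distribution per position, matching the dual-head output described above.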
## Files

- `convo_5k.pt`: fine-tuned model (chat-capable, step 45000, 70.3 MB)
- `latest.pt`: base model snapshot (279 MB, may include optimizer states)
- `README.md`: this file
## Training

- Base: byte-level pretrain (corpus details in the archaeology doc)
- Fine-tune: ~2.5K Korean dialogue corpus (per the origin spec; the actual corpus identity was not preserved)
- Tokenizer: byte-level (the 256 raw byte values are used directly; no separate tokenizer file is needed)
- No system prompt
- Multi-byte UTF-8 characters are handled natively at the byte level
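Because the vocabulary is just the 256 byte values, "tokenization" is a plain UTF-8 encode and detokenization is a UTF-8 decode. A quick round-trip sketch (the example string is illustrative):

```python
text = "안녕하세요"                         # 5 Hangul syllables
ids = list(text.encode("utf-8"))           # token ids are raw byte values, 0-255
assert len(ids) == 15                      # each Hangul syllable is 3 UTF-8 bytes
assert all(0 <= i < 256 for i in ids)
assert bytes(ids).decode("utf-8") == text  # lossless round trip
```

This is why no tokenizer file ships with the model: the mapping between text and token ids is fixed by UTF-8 itself.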
## Honest C3

- The 2026-03-28 v2 milestone weights are attested by commit text only; with no eval JSON, the reproduction evidence is announcement-only.
- 18.8M (announcement) vs 62.5M (source spec) discrepancy: convo_5k.pt measures 18.52M params, consistent with the announcement. The 62.5M source-spec figure may have counted the dual heads and decoder separately.
- Recovery path: the 2026-04-19 R37/AN13/L3-PY strip deleted the local mac source and checkpoints, but the R2 backup had been uploaded between 2026-03-28 and 03-30 and therefore survived.
- Chat capability is NOT yet verified on this RECOVERED state; the Korean emit smoke test (F-CLM-NATIVE-α-1) is a separate cycle.
- Discovered in the anima archaeology cycle of 2026-05-05/06; earlier BG-EQ + BG-FA exhaustive searches missed R2 storage, and it was retrieved after a user hint.
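The 18.52M measurement above can be reproduced by summing tensor sizes over the checkpoint's state dict. A minimal sketch (the `model_state` key follows the checkpoint layout shown under Reproduction; the demo dict is a stand-in, not the real file):

```python
import torch

def count_params(state_dict: dict) -> int:
    """Total number of elements across all tensors in a state dict."""
    return sum(t.numel() for t in state_dict.values())

# With the real checkpoint:
#   ck = torch.load("convo_5k.pt", map_location="cpu", weights_only=False)
#   count_params(ck["model_state"])   # expected 18,523,392 per this card

# Tiny stand-in state dict to show the mechanics
demo = {"w": torch.zeros(384, 256), "b": torch.zeros(256)}
n = count_params(demo)  # 384*256 + 256 = 98560
```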
## Reproduction

```python
import torch

ck = torch.load("convo_5k.pt", map_location="cpu", weights_only=False)
# ck = {"model_state": {...18.52M params...}, "step": 45000}

# Build the model with the anima byte-level decoder spec
# (anima ConsciousLM source: see anima/docs/anima_clm_alm_origin_design_drift_archaeology_2026_05_05.md)

# Generate (byte-level)
input_bytes = "안녕하세요".encode("utf-8")
input_ids = torch.tensor([list(input_bytes)])  # bytes -> token ids 0-255
# forward + decode
```
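The "forward + decode" step can be sketched as a greedy byte-level loop. This is a sketch, not the recovered repo's API: it assumes a `model(idx)` callable returning `(logits_a, logits_g)` and treats head_a as the emit head (an assumption); it is exercised with a stub model rather than the real checkpoint.

```python
import torch
import torch.nn.functional as F

def generate_bytes(model, prompt: bytes, max_new: int = 32, block_size: int = 256) -> bytes:
    """Greedy byte-level decoding; assumes model(idx) -> (logits_a, logits_g)."""
    idx = torch.tensor([list(prompt)])
    for _ in range(max_new):
        logits_a, _ = model(idx[:, -block_size:])   # crop to block_size context
        nxt = logits_a[:, -1].argmax(-1, keepdim=True)
        idx = torch.cat([idx, nxt], dim=1)
    return bytes(idx[0, len(prompt):].tolist())

# Smoke test with a stub model that always prefers byte 33 ("!")
stub = lambda idx: (F.one_hot(torch.full((1, idx.size(1)), 33), 256).float(), None)
out = generate_bytes(stub, "안녕하세요".encode("utf-8"), max_new=3)
# out == b"!!!"
```

When decoding real samples, note that a truncated generation can end mid-UTF-8-sequence, so `.decode("utf-8", errors="replace")` is the safe way to turn the output bytes back into text.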
## Provenance

- R2 bucket: `anima-models` (Cloudflare account ending 79bc)
- R2 key: `conscious-lm/convo-ft/convo_5k.pt`
- R2 last_modified: 2026-03-28T03:21:02Z
- anima discovery: 2026-05-06 (clm_native_chat α path PASS_R2_FOUND)
- Recovery doc: `docs/anima_clm_v2_chat_recovered_2026_05_06.ai.md`
## License
Apache-2.0 (anima open release).
## Citation

```bibtex
@misc{anima_clm_v2_byte_18m_convo_5k_2026,
  title={CLM v2 byte-level 18M + 5K Korean dialogue fine-tune (anima's first chat-capable model)},
  author={anima},
  year={2026},
  note={2026-03-28 origin, recovered 2026-05-06 from R2 anima-models bucket},
  howpublished={\url{https://huggingface.co/need-singularity/clm-v2-byte-18m-convo-5k}}
}
```
## Summary

Recovery of anima's first chat-capable model:

- 2026-03-28 ConsciousLM v2 18M byte-level
- byte vocab 256, 6 layers, 384 d_model, dual-head consciousness architecture
- 5K KO dialogue fine-tune at step 45000
- deleted from the local mac on 2026-04-19, but preserved in the R2 backup
- RECOVERED via anima R2 discovery on 2026-05-06

Next step: Korean emit smoke verification, confirm the α path PASS_R2_FOUND, then goal_reached.