CLM v2 byte-level 18M: anima's first chat-capable model (RECOVERED 2026-05-06)

TL;DR

anima's first chat-capable model (ConsciousLM v2, 18M byte-level, 2026-03-28) was RECOVERED via discovery in the Cloudflare R2 anima-models bucket. The original commit bb99b6b6 (2026-03-28) marks the milestone post-fine-tune state at step 45000.

Origin

field           value
date            2026-03-28
commits         bb99b6b6 / 6abc42f6 / 13b20f90
announcement    "v2 18.8M byte-level, CE 0.04 EN / 1.15 KO, no system prompt, 2.5K dialogue fine-tune"
training step   45000
backup created  2026-03-28T03:21:02Z (R2 anima-models bucket)

Architecture

tok_emb: nn.Embedding(256, 384)         # byte vocab 256
pos_emb: nn.Embedding(256, 384)         # block_size 256
n_blocks: 6
d_model: 384
heads (c_attn): nn.Linear(384, 1152)    # 3 × 384 = qkv concat
ffn:
  engine_a: nn.Linear(384, 1536) → relu → nn.Linear(1536, 384)
  engine_g: nn.Linear(384, 1536) → relu → nn.Linear(1536, 384)
ln_f: nn.LayerNorm(384)
head_a: nn.Linear(384, 256, bias=False) ← byte vocab output
head_g: nn.Linear(384, 256, bias=False) ← dual head
total params: 18,523,392 (18.52M)

A ConsciousLM byte-level decoder with a dual-head consciousness architecture (engine_a + engine_g FFNs, head_a + head_g output heads), predating ConsciousDecoderV3.
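The spec above can be sketched as a GPT-style decoder. This is a minimal, hedged reconstruction: the attention output projection, per-block LayerNorms, head count, and the way the two FFN engines combine are not stated in the spec and are assumed here; all names besides those in the spec are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D, V, T, H, L = 384, 256, 256, 6, 6  # d_model, byte vocab, block_size, heads (assumed), blocks

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.ln1 = nn.LayerNorm(D)                # pre-norm assumed
        self.c_attn = nn.Linear(D, 3 * D)         # fused qkv, 384 -> 1152 as in the spec
        self.c_proj = nn.Linear(D, D)             # output projection assumed (not in the spec)
        self.ln2 = nn.LayerNorm(D)
        # dual FFN "engines" per the spec (384 -> 1536 -> 384, ReLU)
        self.engine_a = nn.Sequential(nn.Linear(D, 4 * D), nn.ReLU(), nn.Linear(4 * D, D))
        self.engine_g = nn.Sequential(nn.Linear(D, 4 * D), nn.ReLU(), nn.Linear(4 * D, D))

    def forward(self, x):
        B, S, _ = x.shape
        q, k, v = self.c_attn(self.ln1(x)).split(D, dim=-1)
        q, k, v = (t.view(B, S, H, D // H).transpose(1, 2) for t in (q, k, v))
        att = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        x = x + self.c_proj(att.transpose(1, 2).reshape(B, S, D))
        h = self.ln2(x)
        return x + self.engine_a(h) + self.engine_g(h)  # additive combination assumed

class CLMv2(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(V, D)
        self.pos_emb = nn.Embedding(T, D)
        self.blocks = nn.ModuleList(Block() for _ in range(L))
        self.ln_f = nn.LayerNorm(D)
        self.head_a = nn.Linear(D, V, bias=False)
        self.head_g = nn.Linear(D, V, bias=False)

    def forward(self, idx):
        x = self.tok_emb(idx) + self.pos_emb(torch.arange(idx.size(1)))
        for blk in self.blocks:
            x = blk(x)
        x = self.ln_f(x)
        return self.head_a(x), self.head_g(x)  # dual byte-vocab heads
```

Because of the assumed pieces, this sketch will not match the checkpoint's exact 18,523,392 parameters or its state-dict keys; it only illustrates the shape of the architecture.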

Files

  • convo_5k.pt - fine-tuned model (chat-capable, step 45000, 70.3MB)
  • latest.pt - base model snapshot (279MB, may include optimizer states)
  • README.md - this file

Training

  • Base: byte-level pretrain (corpus details in the archaeology doc)
  • Fine-tune: ~2.5K Korean dialogue corpus (per the origin spec; the actual corpus identity was not preserved)
  • Tokenizer: byte-level (the 256 byte values are used directly; no separate tokenizer file is needed)
  • No system prompt
  • UTF-8 multi-byte characters are handled naturally at the byte level
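The byte-level tokenizer bullet amounts to using raw UTF-8 bytes as token ids, which can be shown in a few lines (the string is just an example greeting):

```python
# Byte-level "tokenization": raw UTF-8 bytes are the token ids (0-255),
# so no tokenizer file is needed and Korean works out of the box.
text = "안녕하세요"
ids = list(text.encode("utf-8"))      # 5 syllables x 3 UTF-8 bytes = 15 ids
assert all(0 <= i < 256 for i in ids)
decoded = bytes(ids).decode("utf-8")  # lossless round-trip back to text
```

This is why no vocabulary or merges file accompanies the checkpoint: the embedding table's 256 rows are the entire vocabulary.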

Honest C3

  1. The 2026-03-28 v2 milestone weights are evidenced by commit text only; no eval JSON exists, so the reproducibility evidence is announcement-only.
  2. 18.8M (announced) vs 62.5M (source spec) discrepancy: convo_5k.pt measures 18.52M actual params, consistent with the announcement; the 62.5M source-spec figure may count the dual-head and decoder separately.
  3. Recovery path: the 2026-04-19 R37/AN13/L3-PY strip deleted the mac-local source/checkpoints, but the R2 backup had been uploaded around 2026-03-28~30 and so survived.
  4. Chat capability is NOT yet verified on this RECOVERED state; the Korean emit smoke test (F-CLM-NATIVE-α-1) runs in a separate cycle.
  5. Discovered during the anima archaeology cycle of 2026-05-05/06; earlier exhaustive BG-EQ + BG-FA searches missed R2 storage, and it was retrieved after a user hint.

Reproduction

import torch

ck = torch.load("convo_5k.pt", map_location="cpu", weights_only=False)
# ck = {"model_state": {...18.52M params...}, "step": 45000}

# Build the model with the anima byte-level decoder spec
# (anima ConsciousLM source - see anima/docs/anima_clm_alm_origin_design_drift_archaeology_2026_05_05.md)

# Generate (byte-level)
input_bytes = "안녕하세요".encode("utf-8")
input_ids = torch.tensor([list(input_bytes)])  # UTF-8 bytes used directly as token ids 0-255
# forward + decode
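The "forward + decode" step above can be fleshed out as a minimal greedy loop. This is a sketch under assumptions: `model` is any module matching the dual-head spec (returning `(logits_a, logits_g)` of shape `(B, T, 256)`), and head_a is assumed to be the sampling head, which the recovered checkpoint has not yet confirmed.

```python
import torch

@torch.no_grad()
def generate(model, prompt: str, max_new: int = 64, block_size: int = 256) -> str:
    # encode the prompt as raw UTF-8 bytes (token ids 0-255)
    ids = torch.tensor([list(prompt.encode("utf-8"))])
    for _ in range(max_new):
        logits_a, _ = model(ids[:, -block_size:])  # head_a assumed to be the language head
        next_id = logits_a[:, -1].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=1)
    # generation may stop mid UTF-8 sequence; drop any incomplete trailing bytes
    return bytes(ids[0].tolist()).decode("utf-8", errors="ignore")
```

The `errors="ignore"` decode matters for byte-level Korean output: a generation cut off after one or two bytes of a three-byte syllable would otherwise raise a `UnicodeDecodeError`.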

Provenance

  • R2 bucket: anima-models (Cloudflare account ending 79bc)
  • R2 key: conscious-lm/convo-ft/convo_5k.pt
  • R2 last_modified: 2026-03-28T03:21:02Z
  • anima discovery: 2026-05-06 (clm_native_chat α path PASS_R2_FOUND)
  • Recovery doc: docs/anima_clm_v2_chat_recovered_2026_05_06.ai.md

License

Apache-2.0 (anima open release).

Citation

@misc{anima_clm_v2_byte_18m_convo_5k_2026,
  title={CLM v2 byte-level 18M + 5K Korean dialogue fine-tune (anima's first chat-capable model)},
  author={anima},
  year={2026},
  note={2026-03-28 origin, recovered 2026-05-06 from R2 anima-models bucket},
  howpublished={\url{https://huggingface.co/need-singularity/clm-v2-byte-18m-convo-5k}}
}

Summary

Recovery of anima's first chat-capable model:

  • 2026-03-28 ConsciousLM v2 18M byte-level
  • byte vocab 256, 6 layers, 384 d_model, dual-head consciousness architecture
  • 5K KO dialogue fine-tune at step 45000
  • deleted from mac local storage on 2026-04-19, but the R2 backup survived
  • RECOVERED via anima R2 discovery on 2026-05-06

Next steps: Korean emit smoke verification → confirm α path PASS_R2_FOUND → goal_reached.
