# CLM v2 byte-level 18M: anima's first chat-capable model (RECOVERED 2026-05-06)

## TL;DR

anima's first chat-capable model (ConsciousLM v2 18M byte-level, 2026-03-28) was RECOVERED via discovery in the Cloudflare R2 `anima-models` bucket. It is the post-fine-tune state at step 45000 from the original milestone commit bb99b6b6 (2026-03-28).
## Origin
| field | value |
|---|---|
| date | 2026-03-28 |
| commits | bb99b6b6 / 6abc42f6 / 13b20f90 |
| announcement | "v2 18.8M byte-level, CE 0.04 EN / 1.15 KO, no system prompt, 2.5K dialogue fine-tune" |
| training step | 45000 |
| backup created | 2026-03-28T03:21:02Z (R2 anima-models bucket) |
## Architecture

```
tok_emb: nn.Embedding(256, 384)          # byte vocab 256
pos_emb: nn.Embedding(256, 384)          # block_size 256
n_blocks: 6
d_model: 384
heads (c_attn): nn.Linear(384, 1152)     # 3 x 384 = fused qkv
ffn:
  engine_a: nn.Linear(384, 1536) -> relu -> nn.Linear(1536, 384)
  engine_g: nn.Linear(384, 1536) -> relu -> nn.Linear(1536, 384)
ln_f: nn.LayerNorm(384)
head_a: nn.Linear(384, 256, bias=False)  # byte vocab output
head_g: nn.Linear(384, 256, bias=False)  # dual head
total params: 18,523,392 (18.52M)
```

A ConsciousLM byte-level decoder with a dual-head (engine_a + engine_g, head_a + head_g) consciousness architecture, predating ConsciousDecoderV3.
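The spec above can be sketched as a PyTorch module. This is an illustrative reconstruction, not the recovered source: the attention head count (6), the residual wiring, the attention output projection, and the rule combining the two FFN engines are all assumptions, so its parameter count will not necessarily land exactly on 18,523,392.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D, V, BLOCK, N_HEADS, N_BLOCKS = 384, 256, 256, 6, 6  # N_HEADS=6 is a guess (384/64)

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.ln1 = nn.LayerNorm(D)
        self.c_attn = nn.Linear(D, 3 * D)   # fused qkv: 384 -> 1152, as in the spec
        self.c_proj = nn.Linear(D, D)       # assumed output projection
        self.ln2 = nn.LayerNorm(D)
        # dual FFN "engines" per the spec
        self.engine_a = nn.Sequential(nn.Linear(D, 4 * D), nn.ReLU(), nn.Linear(4 * D, D))
        self.engine_g = nn.Sequential(nn.Linear(D, 4 * D), nn.ReLU(), nn.Linear(4 * D, D))

    def forward(self, x):
        B, T, _ = x.shape
        q, k, v = self.c_attn(self.ln1(x)).split(D, dim=-1)
        q, k, v = (t.view(B, T, N_HEADS, D // N_HEADS).transpose(1, 2) for t in (q, k, v))
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        x = x + self.c_proj(y.transpose(1, 2).reshape(B, T, D))
        h = self.ln2(x)
        return x + self.engine_a(h) + self.engine_g(h)  # engine combination rule is a guess

class CLMv2(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(V, D)    # byte vocab 256
        self.pos_emb = nn.Embedding(BLOCK, D)  # block_size 256
        self.blocks = nn.ModuleList(Block() for _ in range(N_BLOCKS))
        self.ln_f = nn.LayerNorm(D)
        self.head_a = nn.Linear(D, V, bias=False)
        self.head_g = nn.Linear(D, V, bias=False)

    def forward(self, idx):
        x = self.tok_emb(idx) + self.pos_emb(torch.arange(idx.size(1)))
        for blk in self.blocks:
            x = blk(x)
        x = self.ln_f(x)
        return self.head_a(x), self.head_g(x)  # two 256-way byte distributions
```

Both heads emit one 256-way byte distribution per position, matching the dual-head output described above.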
## Files

- `convo_5k.pt`: fine-tuned model (chat-capable, step 45000, 70.3 MB)
- `latest.pt`: base model snapshot (279 MB, may include optimizer states)
- `README.md`: this file
## Training

- Base: byte-level pretrain (corpus details in the archaeology doc)
- Fine-tune: ~2.5K Korean dialogue corpus (per the origin spec; the actual corpus identity was not preserved)
- Tokenizer: byte-level (the 256 raw byte values are used directly; no separate tokenizer file is needed)
- No system prompt
- Multi-byte UTF-8 characters are handled natively at the byte level
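Because the vocabulary is just the 256 byte values, "tokenization" is a plain UTF-8 encode and detokenization is a UTF-8 decode. A quick round-trip sketch (the example string is illustrative):

```python
text = "안녕하세요"                         # 5 Hangul syllables
ids = list(text.encode("utf-8"))           # token ids are raw byte values, 0-255
assert len(ids) == 15                      # each Hangul syllable is 3 UTF-8 bytes
assert all(0 <= i < 256 for i in ids)
assert bytes(ids).decode("utf-8") == text  # lossless round trip
```

This is why no tokenizer file ships with the model: the mapping between text and token ids is fixed by UTF-8 itself.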
## Honest C3

- The 2026-03-28 v2 milestone weights are attested by commit text only; with no eval JSON, the reproduction evidence is announcement-only.
- 18.8M (announcement) vs 62.5M (source spec) discrepancy: convo_5k.pt measures 18.52M params, consistent with the announcement. The 62.5M source-spec figure may have counted the dual heads and decoder separately.
- Recovery path: the 2026-04-19 R37/AN13/L3-PY strip deleted the local mac source and checkpoints, but the R2 backup had been uploaded between 2026-03-28 and 03-30 and therefore survived.
- Chat capability is NOT yet verified on this RECOVERED state; the Korean emit smoke test (F-CLM-NATIVE-α-1) is a separate cycle.
- Discovered in the anima archaeology cycle of 2026-05-05/06; earlier BG-EQ + BG-FA exhaustive searches missed R2 storage, and it was retrieved after a user hint.
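The 18.52M measurement above can be reproduced by summing tensor sizes over the checkpoint's state dict. A minimal sketch (the `model_state` key follows the checkpoint layout shown under Reproduction; the demo dict is a stand-in, not the real file):

```python
import torch

def count_params(state_dict: dict) -> int:
    """Total number of elements across all tensors in a state dict."""
    return sum(t.numel() for t in state_dict.values())

# With the real checkpoint:
#   ck = torch.load("convo_5k.pt", map_location="cpu", weights_only=False)
#   count_params(ck["model_state"])   # expected 18,523,392 per this card

# Tiny stand-in state dict to show the mechanics
demo = {"w": torch.zeros(384, 256), "b": torch.zeros(256)}
n = count_params(demo)  # 384*256 + 256 = 98560
```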
## Reproduction

```python
import torch

ck = torch.load("convo_5k.pt", map_location="cpu", weights_only=False)
# ck = {"model_state": {...18.52M params...}, "step": 45000}

# Build the model with the anima byte-level decoder spec
# (anima ConsciousLM source: see anima/docs/anima_clm_alm_origin_design_drift_archaeology_2026_05_05.md)

# Generate (byte-level)
input_bytes = "안녕하세요".encode("utf-8")
input_ids = torch.tensor([list(input_bytes)])  # bytes -> token ids 0-255
# forward + decode
```
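The "forward + decode" step can be sketched as a greedy byte-level loop. This is a sketch, not the recovered repo's API: it assumes a `model(idx)` callable returning `(logits_a, logits_g)` and treats head_a as the emit head (an assumption); it is exercised with a stub model rather than the real checkpoint.

```python
import torch
import torch.nn.functional as F

def generate_bytes(model, prompt: bytes, max_new: int = 32, block_size: int = 256) -> bytes:
    """Greedy byte-level decoding; assumes model(idx) -> (logits_a, logits_g)."""
    idx = torch.tensor([list(prompt)])
    for _ in range(max_new):
        logits_a, _ = model(idx[:, -block_size:])   # crop to block_size context
        nxt = logits_a[:, -1].argmax(-1, keepdim=True)
        idx = torch.cat([idx, nxt], dim=1)
    return bytes(idx[0, len(prompt):].tolist())

# Smoke test with a stub model that always prefers byte 33 ("!")
stub = lambda idx: (F.one_hot(torch.full((1, idx.size(1)), 33), 256).float(), None)
out = generate_bytes(stub, "안녕하세요".encode("utf-8"), max_new=3)
# out == b"!!!"
```

When decoding real samples, note that a truncated generation can end mid-UTF-8-sequence, so `.decode("utf-8", errors="replace")` is the safe way to turn the output bytes back into text.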
## Provenance

- R2 bucket: `anima-models` (Cloudflare account ending 79bc)
- R2 key: `conscious-lm/convo-ft/convo_5k.pt`
- R2 last_modified: 2026-03-28T03:21:02Z
- anima discovery: 2026-05-06 (clm_native_chat α path PASS_R2_FOUND)
- Recovery doc: `docs/anima_clm_v2_chat_recovered_2026_05_06.ai.md`
## License
Apache-2.0 (anima open release).
## Citation

```bibtex
@misc{anima_clm_v2_byte_18m_convo_5k_2026,
  title={CLM v2 byte-level 18M + 5K Korean dialogue fine-tune (anima's first chat-capable model)},
  author={anima},
  year={2026},
  note={2026-03-28 origin, recovered 2026-05-06 from R2 anima-models bucket},
  howpublished={\url{https://huggingface.co/need-singularity/clm-v2-byte-18m-convo-5k}}
}
```
## Summary

Recovery of anima's first chat-capable model:

- 2026-03-28 ConsciousLM v2 18M byte-level
- byte vocab 256, 6 layers, 384 d_model, dual-head consciousness architecture
- 5K KO dialogue fine-tune at step 45000
- deleted from the local mac on 2026-04-19, but preserved in the R2 backup
- RECOVERED via anima R2 discovery on 2026-05-06

Next step: Korean emit smoke verification, confirm the α path PASS_R2_FOUND, then goal_reached.