---
license: mit
tags:
- nGPT
- ascii-art
- character-language-model
- model-merging
- evolutionary-merge
library_name: pytorch
pipeline_tag: text-generation
---

# darwASCIInGPT: nGPT ASCII-art artists + Darwin-bred children

Char-level **nGPT** (Normalized GPT) checkpoints from the *darwASCIInGPT* experiments: small hypersphere transformers that draw ASCII art, plus the **Darwin-style bred offspring** produced by merging them with **no gradient training**.

Companion knowledge base (observations, code, Spark setup):

> **GitHub:** https://github.com/tinycrops/darwASCIInGPT-playbook

All ASCII models are `dim 256 / depth 4` (~3.18M params), char vocab ~106–109, trained on the **apehex** hand-drawn ASCII corpus on a Quadro P4000. The enwik8 text models are `dim 256–512 / depth 8`, trained on a GTX 1060.

## Special tokens (char-level)

`SOL = \x02`, `SEP = \x03`, `EOA = \x04`. Two framings:

| Framing | Prime with | Use |
|---|---|---|
| **Conditional** `SOL label SEP art EOA` | `SOL` + label + `SEP` | request a class (e.g. `Cats`, `Swords`) |
| **Unconditional** `SOL art EOA` | `SOL` | free-form draw (no label channel) |

These models are trained to **very low loss (near-memorization)**, so:

- `T≈0.6, top_k≈20` → clean complete drawings
- `top_k=1` → one fixed canonical piece per prefix
- higher `T` → more variety with occasional whitespace drift

## Contents

| Path | Type | Framing | Trained on / notes |
|---|---|---|---|
| `uncond/styleA` | artist | unconditional | apehex **creatures & nature** half. final stream_loss **0.031** (99.2% acc) |
| `uncond/styleB` | artist | unconditional | apehex **objects & tech** half. final stream_loss **0.089** (97.5% acc) |
| `apehex/styleA` | artist | conditional | GROUP_A subcategories (Cats, Dragons, Flowers, …) |
| `apehex/styleB` | artist | conditional | GROUP_B subcategories (Swords, Cars, Robots, …) |
| `apehex/breed/child_slerp` | **bred** | conditional | SLERP merge of styleA × styleB on the nGPT hypersphere |
| `apehex/breed/child_slerp_frozenattn` | **bred** | conditional | attention frozen from one parent, FFN SLERP-blended |
| `parents/domA`, `parents/domB` | artist | conditional | domain split: apehex art vs mrzjy sample |
| `parents/breed/child_slerp`, `child_discrete`, `child_slerp_frozenattn` | **bred** | conditional | recombinations of domA × domB |
| `smith-experiment/ngpt` | ablation | conditional | nGPT normalized-lerp residual. val_loss **1.0736** |
| `smith-experiment/smith` | ablation | conditional | Möbius geodesic residual (matched init/data/schedule). val_loss **1.0788**, ~1.57× slower |
| `resonance/standard_g1`, `harmonic30_g1` | sweep | conditional | resonance-geometry sweep representatives |
| `enwik8-darwin/offspring_forkL__x__forkR.pt` | **bred** | enwik8 text | the hybrid-vigor offspring: **bpc 2.4636 vs best parent 2.5047 (+0.0412)** |
| `enwik8-darwin/darwin_log.json` | log | n/a | shared-ancestor breeding → vigor |
| `enwik8-darwin/darwin_log_independent.json` | log | n/a | independent-init breeding → **no** vigor (control) |

## Headline result: genealogy decides hybrid vigor

Identical SLERP breeder, different parent *relationship* (enwik8 bpc, lower is better):

| Parents | Origin | Gen-0 child | Champion | Best parent | Vigor? |
|---|---|---|---|---|---|
| independent inits | different basins | 3.26 | 2.3064 | 2.3063 | **No** |
| shared ancestor, split data | same basin | 2.47 | **2.4633** | 2.5047 | **Yes (+0.041)** |

Crossbreeding only works between **mode-connected** parents (shared ancestor, specialized differently). See `docs/darwin-breeding.md` in the GitHub repo.
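
For readers unfamiliar with SLERP merging, the following is a minimal sketch of the idea, not the breeder used in these experiments (that code lives in the GitHub repo). It assumes each weight row is treated as a point on the unit hypersphere and interpolated along the great-circle arc between the two parents. The names `slerp` and `breed`, and the `{name: list of rows}` layout, are hypothetical and chosen only for illustration; plain Python lists stand in for tensors.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def slerp(a, b, t, eps=1e-7):
    """Spherical interpolation between two vectors; the result has unit norm.

    t=0 returns normalized a, t=1 returns normalized b.
    Antipodal inputs have no unique geodesic and are not handled here.
    """
    a, b = normalize(a), normalize(b)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)          # angle between the two parents
    so = math.sin(omega)
    if so < eps:                    # nearly parallel: normalized lerp is stable
        return normalize([(1 - t) * x + t * y for x, y in zip(a, b)])
    wa = math.sin((1 - t) * omega) / so
    wb = math.sin(t * omega) / so
    return [wa * x + wb * y for x, y in zip(a, b)]

def breed(parent_a, parent_b, t=0.5):
    """Merge two hypothetical {name: list-of-rows} weight dicts row by row."""
    return {
        name: [slerp(ra, rb, t) for ra, rb in zip(parent_a[name], parent_b[name])]
        for name in parent_a
    }

if __name__ == "__main__":
    child = breed({"w": [[1.0, 0.0]]}, {"w": [[0.0, 1.0]]}, t=0.5)
    print(child["w"][0])  # a unit vector halfway along the arc between the parents
```

Unlike plain averaging, the interpolated row stays on the sphere, which is why this kind of merge is a natural fit for normalized weights. As the table above indicates, whether the merged child is any good still depends on the parents sharing an ancestor.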

## Loading

```python
import torch
from nGPT_pytorch import nGPT
import ngpt_patch  # restore __hash__ on nGPT modules; import BEFORE constructing

# Load the checkpoint dict (config, weights, and char vocab maps)
ck = torch.load("uncond/styleA/model.pt", map_location="cuda", weights_only=False)

# Rebuild the model from its saved config, then load the weights
model = nGPT(**ck["config"]).cuda()
model.load_state_dict(ck["model"])
model.eval()

stoi, itos = ck["stoi"], ck["itos"]
# see code/sample.py in the GitHub repo for the full conditional/unconditional sampler
```

> Checkpoints with `variant == "smith"` need the `SmithResidual` swap before
> construction (see `train_compare.make_model` in the source lab).

## Example: `uncond/styleA` (unconditional, T=0.6)

```
_ ( ) \ ( ) ) \ /\) (/\ \ /` ` | dlb
```

Per-checkpoint sample galleries are in the GitHub repo under `galleries/`.

## Provenance & license

Models are derived from the apehex / mrzjy ASCII-art corpora and enwik8. Released MIT for the model weights and code; original ASCII art belongs to its respective artists (signatures like `dlb`, `jgs`, `sjw`, `ejm` are preserved in outputs).
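
The sampling settings quoted under "Special tokens" (`T≈0.6, top_k≈20`, or `top_k=1` for one fixed piece) correspond to ordinary temperature plus top-k sampling, stopping at `EOA`. The sketch below illustrates that procedure on plain Python lists; it is not the repo's sampler (`code/sample.py` is the reference). The names `sample_top_k`, `draw`, and the `next_logits` callable are hypothetical stand-ins for a real model forward pass.

```python
import math
import random

def sample_top_k(logits, temperature=0.6, top_k=20, rng=random):
    """Pick one token id from a list of logits using temperature + top-k."""
    if top_k == 1 or temperature <= 0:
        # Greedy decoding: always the single most likely token
        return max(range(len(logits)), key=logits.__getitem__)
    # Keep only the k highest-scoring token ids
    top = sorted(range(len(logits)), key=logits.__getitem__, reverse=True)[:top_k]
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]

def draw(next_logits, prime, eoa_id, max_new=512, **sample_kwargs):
    """Extend `prime` one token at a time until EOA or `max_new` tokens.

    `next_logits(ids)` is a hypothetical callable returning next-token
    logits for the sequence so far.
    """
    ids = list(prime)
    for _ in range(max_new):
        nxt = sample_top_k(next_logits(ids), **sample_kwargs)
        ids.append(nxt)
        if nxt == eoa_id:
            break
    return ids
```

With a real checkpoint, `prime` would be the encoded priming string from the framing table (via `stoi`), `eoa_id` would be `stoi["\x04"]`, and `next_logits` would wrap a call to the loaded model.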