+---
+license: mit
+language:
+  - en
+tags:
+  - complex-valued
+  - oscillating-neurons
+  - language-model
+  - autoregressive
+  - character-level
+  - linear-time
+  - pytorch
+  - from-scratch
+datasets:
+  - edeneldith/DCDM
+pipeline_tag: text-generation
+library_name: pytorch
+---
+# COLM — Complex Oscillating Language Model
+> **Paper:** [Zenodo (PDF)](https://doi.org/10.5281/zenodo.XXXXXXX) |
+> **Code:** [GitHub](https://github.com/Eden-Eldith/COLM) |
+> **Dataset:** [edeneldith/DCDM](https://huggingface.co/datasets/edeneldith/DCDM) |
+> **Predecessor:** [WiggleGPT (Zenodo)](https://doi.org/10.5281/zenodo.17919011)
+**Author:** Phillip C. O'Brien — ORCID [0009-0007-3961-1182](https://orcid.org/0009-0007-3961-1182)
+## What is COLM?
+COLM is a novel autoregressive language model that operates entirely in the complex plane using oscillatory neurons. It replaces the transformer's quadratic-cost self-attention with an O(N) causal recurrence driven by complex-valued gates, and it replaces every learned linear transformation in its core blocks with fixed unitary rotations and element-wise complex oscillatory activations.
+**Zero `nn.Linear` layers in the processing blocks:** every transformation is performed by the oscillating activation `sin(W * Z + B) * tanh(Z)`, where `W` and `B` are complex-valued, with signals routed between activations through fixed, energy-preserving (unitary) complex mixers.
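As a rough illustration of the two ingredients named above, the following sketch implements the element-wise activation `sin(W * Z + B) * tanh(Z)` and a fixed unitary mixer in PyTorch. The class and function names (`OscillatorSketch`, `make_fixed_unitary`), the initial values of `W` and `B`, and the use of a QR decomposition to obtain a unitary matrix are all assumptions made for illustration; the actual modules live in `model.py` and may differ.

```python
import torch
import torch.nn as nn


class OscillatorSketch(nn.Module):
    """Element-wise sin(W * Z + B) * tanh(Z) with complex-valued W and B."""

    def __init__(self, dim: int):
        super().__init__()
        # Assumed initialisation: W = 1 and B = 0 (not taken from the repository).
        self.W = nn.Parameter(torch.ones(dim, dtype=torch.cfloat))
        self.B = nn.Parameter(torch.zeros(dim, dtype=torch.cfloat))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # torch.sin and torch.tanh both accept complex tensors.
        return torch.sin(self.W * z + self.B) * torch.tanh(z)


def make_fixed_unitary(dim: int, seed: int = 0) -> torch.Tensor:
    """Build a fixed (non-learned) unitary matrix from a seeded random matrix.

    Assumption: QR decomposition is just one convenient way to get a unitary
    matrix; the mixers in COLM may be constructed differently.
    """
    g = torch.Generator().manual_seed(seed)
    a = torch.complex(
        torch.randn(dim, dim, generator=g),
        torch.randn(dim, dim, generator=g),
    )
    q, _ = torch.linalg.qr(a)  # q satisfies q^H q = I
    return q


# Usage: mix, then oscillate. The mixer leaves the squared magnitude unchanged.
z = torch.complex(torch.randn(4, 8), torch.randn(4, 8)) * 0.5
mixer = make_fixed_unitary(8)
activated = OscillatorSketch(8)(z @ mixer)
```

Because the mixer is unitary, `z @ mixer` has the same energy (sum of squared magnitudes) as `z`, which is what "energy-preserving" refers to here.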
+## Key Results
+| Metric | Value |
+|--------|-------|
+| **Parameters** | 498,214 |
+| **Best validation loss** | 1.1449 |
+| **Creativity score** (GPT-5.4 blind eval) | 4.83 / 10 |
+| **Age group estimate** | 84% rated age 13-16 |
+| **Training time** | 8.7 hours |
+| **Hardware** | Single RTX 5060 Ti 16GB |
+| **Tokenizer** | 499-token word+character hybrid |
+| **Domain** | Theological-philosophical prose |
+At 498k parameters — roughly half the size of TinyStories' smallest coherent model — COLM generates thematically coherent philosophical prose at temperature 1 with no spell correction.
+## Architecture
+| Component | COLM |
+|-----------|------|
+| State | Native `torch.cfloat` throughout |
+| Activation | `sin(W * Z + B) * tanh(Z)`, complex W, B |
+| Sequence routing | O(N) causal recurrence via `torch.cumsum` |
+| MLP/FFN | Fixed unitary mixer -> Oscillator -> mixer -> Oscillator |
+| Residual | Complex sinc resonance coupling |
+| Normalisation | ComplexRMSNorm (phase-preserving) |
+| Sparsity | Learnable sigmoidal gate on magnitude |
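Two rows of the table can be illustrated with a few lines of PyTorch: a causal running sum over the sequence axis via `torch.cumsum`, and an RMS normalisation that divides by a real, positive scalar so that phases are untouched. This is only a sketch under assumptions: the function names are hypothetical, the gate is passed in as a plain tensor, and the real COLM recurrence and `ComplexRMSNorm` may include learned parameters or extra terms not shown here.

```python
import torch


def causal_gated_cumsum(z: torch.Tensor, gate: torch.Tensor) -> torch.Tensor:
    """Running sum of gate * z along the sequence axis.

    z, gate: complex tensors of shape (batch, seq_len, dim).
    Output position t depends only on positions <= t, and the cost grows
    linearly with seq_len (no pairwise attention matrix is formed).
    """
    return torch.cumsum(gate * z, dim=1)


def complex_rms_norm(z: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Scale each vector to roughly unit RMS magnitude.

    The divisor is real and positive, so the phase of every element is kept.
    Assumption: no learnable gain is applied in this sketch.
    """
    rms = torch.sqrt(z.abs().pow(2).mean(dim=-1, keepdim=True) + eps)
    return z / rms


# Usage on a toy batch: 2 sequences, 5 positions, 3 complex channels.
z = torch.complex(torch.randn(2, 5, 3), torch.randn(2, 5, 3))
gate = torch.complex(torch.randn(2, 5, 3), torch.randn(2, 5, 3))
state = complex_rms_norm(causal_gated_cumsum(z, gate))
```

A prefix sum is the simplest recurrence that `torch.cumsum` can express; it is used here only to show why the routing is causal and linear in sequence length.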
+## Model Configuration
+```json
+{
+  "n_embd": 324,
+  "n_layer": 16,
+  "embed_dim": 66,
+  "block_size": 128,
+  "vocab_size": 499
+}
+```
+## Files
+| File | Description |
+|------|-------------|
+| `colm_best_Final.pt` | Best checkpoint (step 860,000, val loss 1.1449) |
+| `colm_config.json` | Full training and architecture configuration |
+| `colm_tokenizer.json` | 499-token word+character hybrid tokenizer vocabulary |
+| `model.py` | All `nn.Module` classes needed to load the model |
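To make the "word+character hybrid" idea concrete, here is a toy encoder that emits a single token for words present in the vocabulary and falls back to one token per character otherwise. The function name `encode_hybrid`, the toy vocabulary, and the whitespace handling are all hypothetical; the actual layout of `colm_tokenizer.json` and the real encoding rules are not described in this card.

```python
def encode_hybrid(text: str, vocab: dict[str, int]) -> list[int]:
    """Encode text with whole-word tokens where possible, characters otherwise.

    Assumptions: words are separated by single spaces, the space has its own
    token, and every character of an out-of-vocabulary word is in the vocab.
    """
    ids: list[int] = []
    for i, word in enumerate(text.split(" ")):
        if i > 0:
            ids.append(vocab[" "])  # re-insert the separator between words
        if word in vocab:
            ids.append(vocab[word])  # whole-word token
        else:
            ids.extend(vocab[ch] for ch in word)  # character fallback
    return ids


# Toy vocabulary for illustration only (not the real 499-token vocabulary).
toy_vocab = {" ": 0, "the": 1, "soul": 2, "s": 3, "o": 4, "l": 5}
known = encode_hybrid("the soul", toy_vocab)    # both words are in the vocab
spelled = encode_hybrid("the sol", toy_vocab)   # "sol" is spelled out per character
```

With a vocabulary this small, most words take the character path, which is why the model has to learn spelling on its own.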
+## Usage
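+```python
+import json
+
+import torch
+
+from model import COLM  # model.py from this repository
+
+# Load the architecture hyperparameters
+with open("colm_config.json") as f:
+    config = json.load(f)
+arch = config["architecture"]
+
+# Build the model with the same shape it was trained with
+model = COLM(
+    vocab_size=arch["vocab_size"],
+    n_embd=arch["n_embd"],
+    n_layer=arch["n_layer"],
+    block_size=arch["block_size"],
+    embed_dim=arch["embed_dim"],
+)
+
+# Load weights. PyTorch 2.6+ defaults to weights_only=True; if the checkpoint
+# stores non-tensor objects, pass weights_only=False for a file you trust.
+checkpoint = torch.load("colm_best_Final.pt", map_location="cpu")
+model.load_state_dict(checkpoint["model_state_dict"])
+model.eval()  # switch to evaluation mode
+```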
+See the [GitHub repository](https://github.com/Eden-Eldith/COLM) for full training, generation, and evaluation scripts.
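For orientation, a generic autoregressive sampling loop at a given temperature might look like the sketch below. It assumes that calling the model on a `(batch, time)` tensor of token ids returns real-valued logits of shape `(batch, time, vocab_size)`; that call signature, and the function name `sample`, are assumptions rather than the documented COLM interface, so check `model.py` and the repository's generation script before relying on it.

```python
import torch


@torch.no_grad()
def sample(model, idx, max_new_tokens, block_size, temperature=1.0):
    """Append max_new_tokens sampled ids to idx, one token at a time.

    model: callable mapping (batch, time) long ids to (batch, time, vocab)
           real logits (assumed interface).
    idx:   starting token ids, shape (batch, time).
    """
    for _ in range(max_new_tokens):
        context = idx[:, -block_size:]             # keep at most block_size tokens
        logits = model(context)                    # assumed shape: (B, T, V)
        next_logits = logits[:, -1, :] / temperature
        probs = torch.softmax(next_logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)
    return idx


# Usage with a stand-in model that returns uniform logits over 499 tokens.
def uniform_model(x):
    return torch.zeros(x.shape[0], x.shape[1], 499)


start = torch.zeros(1, 3, dtype=torch.long)
tokens = sample(uniform_model, start, max_new_tokens=10, block_size=128)
```

Cropping the context to `block_size` (128 in the configuration above) keeps the input within the length the model was trained on.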
+## Training Data
+Trained on the [DCDM dataset](https://huggingface.co/datasets/edeneldith/DCDM): 47 million tokens of synthetic theological-philosophical prose generated from 93 public-domain works by a locally run Gemma 3 12B pipeline.
+## Limitations
+- **Spelling:** With a 499-token vocabulary, most words are assembled from character tokens, which produces spelling variation in generated text
+- **Single domain:** Trained only on theological-philosophical text; cross-domain performance is untested
+- **Batch size:** The final run used `batch_size=4` rather than the intended 32, so the reported results should be read as a lower bound on achievable quality
+## Citation
+```bibtex
+@misc{obrien2026colm,
+  author = {O'Brien, Phillip C.},
+  title = {COLM: Complex Oscillating Language Model — Coherent Language from Sub-500k Parameter Oscillatory Models},
+  year = {2026},
+  publisher = {Zenodo},
+  url = {https://github.com/Eden-Eldith/COLM}
+}
+```
+## Licence
+MIT License. Copyright (c) 2025-2026 Phillip C. O'Brien.