	---
	license: agpl-3.0
	library_name: pytorch
	tags:
	- tiny-lm
	- goldfish
	- transformer
	- rope
	- swiglu
	pipeline_tag: text-generation
	base_model: []
	---

	# GlubLM (36M)

	> the language model that already forgot this sentence

	GlubLM is a 36-million-parameter transformer that plays the character of a goldfish with a 10-second memory. Inspired by [GuppyLM](https://github.com/arman-bd/guppylm) by Arman BD and Ted Lasso's meditation on the goldfish as "the happiest animal on earth", GlubLM has a hard 96-token context window - it physically cannot remember what was just said.

	Try it live: [browser demo](https://den-sec.github.io/glublm/) | [pixel-art desk pet](https://den-sec.github.io/glublm/desk-pet/)

	## Architecture

	- Parameters: 36,055,680 (36.1M)
	- Layers: 8 decoder-only transformer blocks
	- Hidden dim: 640
	- Attention heads: 10 (head dim 64)
	- FFN dim: 1280 (SwiGLU, effective intermediate 2560)
	- Normalization: RMSNorm
	- Position encoding: Rotary (RoPE)
	- Vocabulary: 5,120 Byte-Level BPE
	- Max context: 96 tokens (hard cap, the "10-second memory")
	- Weight-tied LM head
	- No bias terms
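
The parameter total can be sanity-checked from the figures above. The sketch below is an assumption about the layout, not the project's own code: it assumes four `d x d` attention projections per block, three `d x d_ff` SwiGLU matrices (gate, up, down), two RMSNorm gain vectors per block, one final RMSNorm, and an embedding shared with the LM head. Under that reading, "effective intermediate 2560" would be the combined gate and up width (2 x 1280). The function name `count_params` is hypothetical.

```python
# Assumed layout (not confirmed by the source): see the paragraph above.
def count_params(vocab=5120, d_model=640, n_layers=8, d_ff=1280):
    embedding = vocab * d_model          # shared with the LM head (weight tying)
    attention = 4 * d_model * d_model    # Q, K, V, O projections, no biases
    swiglu = 3 * d_model * d_ff          # gate, up, down projections, no biases
    norms = 2 * d_model                  # two RMSNorm gain vectors per block
    per_block = attention + swiglu + norms
    final_norm = d_model                 # one RMSNorm before the LM head
    # RoPE adds no learned parameters.
    return embedding + n_layers * per_block + final_norm


total = count_params()
print(f"{total:,}")  # 36,055,680
```

With these assumptions the count reproduces the stated 36,055,680 exactly, which suggests the listed dimensions are self-consistent.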

	## Intended use

	This model is a toy. It exists to:
	1. Explore the design tension between "small + simple" (GuppyLM's thesis) and "small + modern" (GlubLM's hypothesis)
	2. Demonstrate an LLM-generated dataset pipeline using a multi-agent Claude team
	3. Be a fun browser demo and a pixel-art desk pet companion

	Do not use GlubLM for anything serious. It literally forgets within a sentence.

	## Training data

	Trained on [`DenSec02/glublm-60k-ted`](https://huggingface.co/datasets/DenSec02/glublm-60k-ted), a 60,549-sample dataset of single-turn goldfish conversations generated by a team of four coordinated Claude agents (generator, critic, diversifier, persona-guardian). Composition: a v4 balanced mix (20K poetic + 15K supplement + 5K conversational + 15K forgetful), augmented with the v5.1 empathic/introspective hotfix (1K samples) and the v5.2 multi-anchor self-awareness recovery set (500 samples).

	Explicit exclusions: no references to football, soccer, coaches, teams, or any Ted Lasso show characters.

	## Training

	- Hardware: NVIDIA RTX 3060 12GB (local)
	- Framework: PyTorch 2.x, BF16 mixed precision
	- Optimizer: AdamW (b1=0.9, b2=0.95), weight decay 0.1
	- LR schedule: cosine with 5% warmup, peak 3e-4
	- Batch size: 64
	- Epochs: 15
	- Dropout: 0.1 (residual), 0.0 (attention)
	- Gradient clipping: 1.0
	- Final loss: 1.1442
	- Wall time: ~15 minutes
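
The learning-rate schedule listed above can be sketched as a plain function of the step index. This is a minimal illustration, not the project's training code: it assumes linear warmup, a cosine decay that ends at zero (the source does not state a minimum LR), and a step count derived from the full dataset with no held-out split. The names `lr_at`, `warmup_frac`, and `min_lr` are hypothetical.

```python
import math


def lr_at(step, total_steps, peak_lr=3e-4, warmup_frac=0.05, min_lr=0.0):
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Cosine decay from peak_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Illustrative step count: 60,549 samples, batch size 64, 15 epochs
# (assumes every sample is used for training).
steps_per_epoch = math.ceil(60_549 / 64)
total_steps = steps_per_epoch * 15

for step in (0, total_steps // 20, total_steps // 2, total_steps - 1):
    print(step, f"{lr_at(step, total_steps):.2e}")
```

In PyTorch, a function like this would typically be wrapped in `torch.optim.lr_scheduler.LambdaLR` (returning the ratio to the peak rate) alongside `torch.optim.AdamW(..., betas=(0.9, 0.95), weight_decay=0.1)`.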

	## Evaluation (v2 cross-model judge)

	Dual-judge evaluation using Claude Sonnet 4.6 and Opus 4.7 on a 30-prompt rubric across 4 axes (integer 1-5 scale). Each axis aggregates 30 prompts x 3 seeds x 2 passes = 180 scoring rows per judge.

	### Per-axis score (mean)

	| Axis | Sonnet 4.6 | Opus 4.7 |
	|---|---:|---:|
	| Conversational Quality | 4.01 | 4.15 |
	| Goldfish Identity | 3.89 | 3.67 |
	| Forgetful Trait | 3.80 | 3.81 |
	| Length Appropriateness | 4.77 | 4.57 |

	### Cross-judge agreement (Cohen's quadratic-weighted kappa)

	| Axis | Kappa | Interpretation |
	|---|---:|---|
	| Conversational Quality | 0.77 | substantial |
	| Goldfish Identity | 0.83 | almost perfect |
	| Forgetful Trait | 0.86 | almost perfect |
	| Length Appropriateness | 0.59 | moderate |

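	Interpretation: Sonnet and Opus reach almost-perfect agreement on 2 of 4 axes (Goldfish Identity, Forgetful Trait) and substantial agreement on Conversational Quality; only Length Appropriateness drops to moderate. This suggests the rubric is applied consistently across LLM judges. On the Identity axis, Opus scores about 0.2 lower than Sonnet on average (3.67 vs 3.89), which is attributed to stricter rubric application rather than judge bias.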
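
For readers unfamiliar with the agreement metric, here is a minimal pure-Python sketch of Cohen's quadratic-weighted kappa for two judges scoring on the 1-5 integer scale. It is not the evaluation script from the repository. The interpretation labels in the table match the Landis and Koch bands, so the helper below assumes that scale was used; the function names are hypothetical.

```python
def quadratic_weighted_kappa(a, b, min_rating=1, max_rating=5):
    """Cohen's kappa with quadratic disagreement weights for two raters."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and the same length")
    k = max_rating - min_rating + 1
    n = len(a)

    # Confusion matrix: observed[i][j] counts items rated i by A and j by B.
    observed = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[x - min_rating][y - min_rating] += 1

    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            expected = hist_a[i] * hist_b[j] / n  # count expected by chance
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        return float("nan")  # undefined: both raters used one identical score
    return 1.0 - weighted_observed / weighted_expected


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch verbal label."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy ratings, not real evaluation data.
judge_a = [5, 4, 4, 3, 5, 2, 4, 5]
judge_b = [5, 4, 3, 3, 4, 2, 4, 5]
kappa = quadratic_weighted_kappa(judge_a, judge_b)
print(round(kappa, 3), landis_koch(kappa))
```

Because the weights grow with the squared distance between scores, a 5-vs-4 disagreement costs far less than a 5-vs-1 disagreement. A constant offset between judges can therefore coexist with a high kappa, as on the Identity axis.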

	Full methodology + 108-row long-format scores: [`eval/report_crossmodel.md`](https://github.com/Den-Sec/glublm/blob/master/eval/report_crossmodel.md).

	## Limitations & biases

	- Hard context limit: 96 tokens. Inputs longer than a few short sentences will be truncated.
	- Goldfish worldview: the model genuinely does not understand human abstractions outside the bowl.
	- Dataset bias: the dataset was generated by Claude (Anthropic), so it inherits Claude's language patterns filtered through the goldfish persona.
	- Single-turn only: multi-turn memory is a non-goal.
	- English only.
	- Stochastic and occasionally incoherent: 36M params on 60K samples is small. Do not expect reliability.
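
To make the hard context limit concrete, the sketch below clips a list of token IDs so that the prompt plus the tokens to be generated fit in 96 positions. It is only an illustration under stated assumptions: it keeps the most recent tokens and reserves room for generation up front, whereas the actual `glublm` inference code may truncate differently. `clip_to_context` is a hypothetical helper, not part of the package.

```python
MAX_CONTEXT = 96  # hard cap stated in the model card


def clip_to_context(token_ids, max_new_tokens=24, max_context=MAX_CONTEXT):
    """Keep only the most recent prompt tokens that leave room for generation."""
    budget = max_context - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than the context size")
    # Negative slicing keeps the tail; shorter prompts pass through unchanged.
    return token_ids[-budget:]


long_prompt = list(range(200))       # stand-in for real token IDs
clipped = clip_to_context(long_prompt)
print(len(clipped), clipped[-1])     # 72 199
```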

	## How to use

	```python
	from huggingface_hub import hf_hub_download
	from safetensors.torch import load_model

	from glublm.config import ModelConfig
	from glublm.inference import generate
	from glublm.model import GlubLM
	from glublm.tokenizer import GlubTokenizer

	# Download the tokenizer and weights from the Hub (cached after the first call).
	tok_path = hf_hub_download("DenSec02/glublm-36m", "tokenizer.json")
	weights_path = hf_hub_download("DenSec02/glublm-36m", "model.safetensors")

	# Build the model with the tokenizer's vocabulary size, then load the weights in place.
	tok = GlubTokenizer.from_file(tok_path)
	cfg = ModelConfig(vocab_size=tok.vocab_size)
	model = GlubLM(cfg)
	load_model(model, weights_path)
	model.eval()  # disable dropout for inference

	print(generate(model=model, tokenizer=tok, prompt="hello", max_new_tokens=24))
	```

	Or try it in-browser with zero setup:
	- [Chat demo](https://den-sec.github.io/glublm/) (simple web UI)
	- [Desk pet companion](https://den-sec.github.io/glublm/desk-pet/) (pixel-art PWA)
	- [Colab notebook](https://colab.research.google.com/github/Den-Sec/glublm/blob/master/notebooks/train_colab.ipynb) (train your own goldfish)

	## License

	AGPL-3.0 - see [LICENSE](https://github.com/Den-Sec/glublm/blob/master/LICENSE).

	## Citation

	```bibtex
	@software{glublm_2026,
	author = {Sepede, Dennis},
	title = {GlubLM: a 36M goldfish language model with a 10-second memory},
	year = {2026},
	url = {https://github.com/Den-Sec/glublm}
	}
	```