Andrey Bochkov
AI & ML interests: None yet
Organizations: None yet
Progressive Growth Transformers (PGT) [pretrain]
Transformers grown layer by layer on top of frozen token embeddings. Explores how capabilities emerge as depth increases.
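The growth procedure the collection describes, training a deeper stack on top of a fixed embedding table, can be illustrated with a minimal NumPy sketch. This is not the actual PGT code; the names (`new_layer`, `GrownModel`) and the toy block (a random linear map with ReLU standing in for a transformer layer) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, D = 100, 16

# Frozen, precomputed token embeddings: never updated as the model grows.
frozen_emb = rng.normal(size=(VOCAB, D))

def new_layer(d):
    # Hypothetical stand-in for a transformer block: linear map + ReLU.
    w = rng.normal(size=(d, d)) / np.sqrt(d)
    return lambda x: np.maximum(x @ w, 0.0)

class GrownModel:
    def __init__(self, emb):
        self.emb = emb      # frozen embedding table
        self.layers = []    # layers added one growth stage at a time

    def grow(self):
        self.layers.append(new_layer(self.emb.shape[1]))

    def forward(self, token_ids):
        h = self.emb[token_ids]
        for layer in self.layers:
            h = layer(h)
        return h

model = GrownModel(frozen_emb)
for stage in range(3):      # three growth stages -> three layers
    model.grow()

out = model.forward(np.array([1, 2, 3]))
print(out.shape)        # (3, 16)
print(len(model.layers))  # 3
```

In a real run, each `grow()` call would be followed by a training phase for the enlarged stack while the embedding table stays frozen.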
Best demo models [pretrain]
Frozen-embedding LMs (en/ru/zh) and their MoE fusion. Baselines: frozen vs. unfrozen embedding ablation.
Tokenizers
This collection features frozen, precomputed token embedding tensors designed for experimentation with semantic emergence in language models.