Emergent Semantics — Model_UNFROZEN (335M) (Baseline)
This repository provides Model_UNFROZEN (335M) — a decoder-only Transformer language model trained in the standard setup with trainable input token embeddings.
It is released as the baseline for the paper:
📚 Paper: Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate (https://arxiv.org/abs/2507.07129)
Primary goal: enable a controlled comparison against the frozen-embedding variant Bochkov/emergent-semantics-model-uni-glyph-335m under identical architecture, tokenizer, and training regime.
What this model is (and is not)
Model_UNFROZEN is a conventional Transformer LM where:
- the token embedding matrix is randomly initialized and trained end-to-end
- the rest of the Transformer is trained normally
This model exists to isolate the effect of freezing / changing the embedding layer.
It is not intended to be a best-performing standalone model.
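The trainable-versus-frozen distinction can be illustrated with a minimal PyTorch sketch. The helper `make_embedding` and the toy sizes are hypothetical and are not taken from the released code. The sketch shows only the `requires_grad` toggle, not how the frozen variant's embedding values are constructed.

```python
import torch
import torch.nn as nn


def make_embedding(vocab_size: int, d_model: int, trainable: bool) -> nn.Embedding:
    """Hypothetical helper: a randomly initialized embedding table, optionally frozen."""
    emb = nn.Embedding(vocab_size, d_model)
    emb.weight.requires_grad_(trainable)  # False excludes the table from gradient updates
    return emb


# Toy sizes for illustration only; the real model uses vocab 65,536 and d_model 1024.
unfrozen = make_embedding(16, 8, trainable=True)   # the setup this baseline uses
frozen = make_embedding(16, 8, trainable=False)    # the toggle the counterpart changes

ids = torch.tensor([[1, 2, 3]])
loss = unfrozen(ids).sum() + frozen(ids).sum()
loss.backward()

print(unfrozen.weight.grad is not None)  # gradients reach the trainable table
print(frozen.weight.grad is None)        # the frozen table receives none
```

In a real training loop, only parameters with `requires_grad=True` would normally be passed to the optimizer, so the frozen table stays fixed while the rest of the network trains.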
Model summary
- Architecture: decoder-only Transformer (GPT-like)
- Hidden size (d_model): 1024
- Layers: 16
- Heads: 32
- Positional encoding: rotary embeddings
- Activation: GELU
- Input embeddings: trainable (standard nn.Embedding)
- Output head: not tied to the input embeddings (trained separately)
- Vocabulary size: 65,536
- Tokenizer: Bochkov/bvv241-2-3
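As a rough sanity check, the figures above can be turned into a back-of-the-envelope parameter count. This is a sketch under stated assumptions: the function `estimate_params` is hypothetical, the feed-forward expansion factor of 4 is assumed (the card does not state it), and biases and normalization weights are ignored.

```python
def estimate_params(d_model: int, n_layers: int, vocab_size: int,
                    mlp_ratio: int = 4, tied: bool = False) -> dict:
    """Approximate weight count, ignoring biases and norm parameters."""
    attn = 4 * d_model * d_model                 # Q, K, V and output projections
    mlp = 2 * d_model * (mlp_ratio * d_model)    # up and down projections (ASSUMED ratio)
    blocks = n_layers * (attn + mlp)
    embed = vocab_size * d_model                 # trainable input embedding table
    head = 0 if tied else vocab_size * d_model   # separate output head when untied
    return {
        "blocks": blocks,
        "input_embeddings": embed,
        "output_head": head,
        "total": blocks + embed + head,
    }


counts = estimate_params(d_model=1024, n_layers=16, vocab_size=65_536)
head_dim = 1024 // 32  # per-head dimension implied by 32 heads

for name, value in counts.items():
    print(f"{name:>16}: {value:,}")
print(f"{'head_dim':>16}: {head_dim}")
# total: 335,544,320
```

Under these assumptions the estimate lands close to the 335M in the model name, with the untied input embeddings and output head together accounting for about 134M of it. The match is consistent with the summary but does not confirm the assumed feed-forward width.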
Intended use
This model is intended for:
- baseline comparisons in research on emergent semantics
- measuring the effect of frozen vs trainable embeddings
- ablations and reproducibility checks for the associated paper
Not intended for production deployment. It is a research artifact trained under constrained compute/data to enable controlled comparisons.
How to use (Transformers)
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Bochkov/emergent-semantics-model-unfrozen-335m")
model = AutoModelForCausalLM.from_pretrained("Bochkov/emergent-semantics-model-unfrozen-335m", trust_remote_code=True).to('cuda')
inputs = torch.tensor([tokenizer.encode("Question: What is the capital of Japan?\nAnswer:")], dtype=torch.long, device='cuda')
outputs = model.generate(
    inputs,
    max_new_tokens=10,
    do_sample=False
)
print(tokenizer.decode(outputs[0].tolist()))
# Question: What is the capital of Japan?
# Answer:Tokyo Metropolitan
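For running several questions through the same prompt shape, a pair of small helpers may be convenient. This is a sketch only: `build_prompt` and `extract_answer` are hypothetical names, and the "Question: ...\nAnswer:" format is simply copied from the example above rather than documented as the model's required template.

```python
def build_prompt(question: str) -> str:
    """Format a question the same way as the example prompt above."""
    return f"Question: {question.strip()}\nAnswer:"


def extract_answer(decoded: str, prompt: str) -> str:
    """Return the first line the model produced after the prompt."""
    if decoded.startswith(prompt):
        completion = decoded[len(prompt):]
    else:
        # Fallback if decoding does not reproduce the prompt exactly:
        # keep whatever follows the last "Answer:" marker.
        completion = decoded.rpartition("Answer:")[2]
    lines = completion.strip().splitlines()
    return lines[0].strip() if lines else ""


prompt = build_prompt("What is the capital of Japan?")
# Decoded text as shown in the example output above (not regenerated here).
decoded = prompt + "Tokyo Metropolitan"
print(extract_answer(decoded, prompt))
```

With the model loaded as in the example, `prompt` would be passed through `tokenizer.encode` and the decoded result of `model.generate` handed to `extract_answer`.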
Training overview (high level)
- Training data: multilingual Wikipedia subsets + a small portion of SFT-style QA data (see paper)
- Scale: ~4B tokens (resource-constrained setting for controlled comparisons)
- Hardware: H100 80GB (reported setup)
Related repositories
- Paper model collection: https://huggingface.co/collections/Bochkov/emergent-semantics-beyond-token-embeddings
- Frozen embedding counterpart (main experimental model): https://huggingface.co/Bochkov/emergent-semantics-model-uni-glyph-335m
- Tokenizer: https://huggingface.co/Bochkov/bvv241-2-3
- Code (GitHub): https://github.com/AVBochkov/Embeddings
🧑‍🔬 Citation & Concept
If you use this model or the underlying concepts in your research, please cite our work:
@article{bochkov2025emergent,
  title={Emergent Semantics Beyond Token Embeddings: Transformer {LM}s with Frozen Visual Unicode Representations},
  author={Andrey Bochkov},
  journal={Transactions on Machine Learning Research},
  issn={2835-8856},
  year={2025},
  url={https://openreview.net/forum?id=Odh8IynO1o}
}
@misc{bochkov2025growingtransformersmodularcomposition,
  title={Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate},
  author={A. Bochkov},
  year={2025},
  eprint={2507.07129},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2507.07129}
}