Libraries

How to use Tralalabs/TralalabsLM-160M-Test with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Tralalabs/TralalabsLM-160M-Test")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Tralalabs/TralalabsLM-160M-Test")
model = AutoModelForCausalLM.from_pretrained("Tralalabs/TralalabsLM-160M-Test")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Tralalabs/TralalabsLM-160M-Test with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Tralalabs/TralalabsLM-160M-Test"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Tralalabs/TralalabsLM-160M-Test",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
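
The same request can be issued from Python. The following is a minimal sketch using only the standard library, assuming the server is reachable at the default address used in the curl call above. The helper names `build_chat_request` and `send_chat_request` are hypothetical and are not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "Tralalabs/TralalabsLM-160M-Test"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, once the server is running (not executed here):
#   reply = send_chat_request(build_chat_request("What is the capital of France?"))
#   print(reply)
```

For a server listening on a different port, pass another `base_url`, for example `http://localhost:30000`.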

Use Docker

docker model run hf.co/Tralalabs/TralalabsLM-160M-Test

SGLang

How to use Tralalabs/TralalabsLM-160M-Test with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Tralalabs/TralalabsLM-160M-Test" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Tralalabs/TralalabsLM-160M-Test",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Tralalabs/TralalabsLM-160M-Test" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Tralalabs/TralalabsLM-160M-Test",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Tralalabs/TralalabsLM-160M-Test with Docker Model Runner:
```
docker model run hf.co/Tralalabs/TralalabsLM-160M-Test
```

TralalabsLM-160M-Test

TralalabsLM-160M-Test is a small experimental causal language model trained as a smoke-test checkpoint for the TralalabsLM model family.

This is not a production model. It was trained in a very short test run to verify that the training script, Modal GPU environment, FineWeb-Edu streaming pipeline, Qwen2.5 tokenizer, and the checkpoint saving and downloading workflow all work correctly.

Model Details

| Field | Value |
| --- | --- |
| Model name | TralalabsLM-160M-Test |
| Model family | TralalabsLM |
| Model type | Causal Language Model |
| Architecture | GPT-2 style decoder-only Transformer |
| Parameters | 157,459,840 |
| Rounded size | 160M |
| Context length | 2048 tokens |
| Tokenizer | Qwen/Qwen2.5-0.5B tokenizer |
| Training dataset | HuggingFaceFW/fineweb-edu |
| Training tokens | 1,048,576 tokens |
| Optimizer updates | 4 |
| Framework | PyTorch + Transformers |
| Training platform | Modal.com |
| GPU used | NVIDIA L40S |
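
For checking a download, the parameter count gives a rough lower bound on the size of the weight file. The sketch below assumes the weights are stored as 32-bit floats (4 bytes per parameter) and ignores file metadata, so the real file is expected to be slightly larger.

```python
PARAMETERS = 157_459_840   # from the Model Details table
BYTES_PER_PARAM = 4        # assumption: float32 storage


def checkpoint_bytes(n_params, bytes_per_param=BYTES_PER_PARAM):
    """Approximate size of the raw weights, excluding file headers."""
    return n_params * bytes_per_param


size = checkpoint_bytes(PARAMETERS)
print(f"{size:,} bytes")
print(f"{size / 1e6:.1f} MB, {size / 2**20:.1f} MiB")
```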

Intended Use

This checkpoint is intended for:

verifying a training pipeline
testing model loading with transformers
checking tokenizer compatibility
debugging text generation
validating upload/download workflow
serving as a tiny TralalabsLM test artifact

It is not intended for serious assistant use, factual answering, coding help, production deployment, safety-critical use, or benchmark comparisons.

Training Data

This model was trained on a very small streamed sample from:

HuggingFaceFW/fineweb-edu

FineWeb-Edu is a web dataset of educational-quality text, filtered from FineWeb, which is itself derived from Common Crawl. The dataset is released under the Open Data Commons Attribution License (ODC-By) v1.0, and its use is also subject to Common Crawl's Terms of Use.

Because this model only saw 1,048,576 tokens, it should be treated as a technical smoke-test checkpoint, not a meaningful pretrained language model.
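
The token count can be reproduced from the batch settings listed under Training Configuration. This is a small arithmetic sketch, assuming every sequence is packed to the full context length with no padding and that the run used a single GPU.

```python
# Values taken from the Training Configuration section.
micro_batch_size = 8
gradient_accumulation_steps = 16
context_length = 2048
update_steps = 4

# Assumption: each sequence holds exactly context_length tokens.
tokens_per_update = micro_batch_size * gradient_accumulation_steps * context_length
tokens_seen = tokens_per_update * update_steps

print(tokens_per_update)  # 262144
print(tokens_seen)        # 1048576
```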

Tokenizer

This model uses the tokenizer from:

Qwen/Qwen2.5-0.5B

Qwen2.5 uses a unified vocabulary with 151,665 total tokens, including control tokens for general text, chat, tool use, vision, and coding.

How to Load

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tralalabs/TralalabsLM-160M-Test"

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
)

prompt = "The future of small language models is"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=80,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Expected Quality

This model is expected to produce low-quality, unstable, or repetitive text because it was trained on only about 1M tokens (1,048,576).

The purpose of this release is pipeline validation, not final model quality.

Expected behavior:

may repeat text
may output broken grammar
may hallucinate
may fail at instructions
may produce nonsense
may show weak world knowledge
may not follow prompts reliably
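
When debugging generation, it can help to put a number on how repetitive a sample is. The helpers below are a hypothetical sketch, not part of the model or of Transformers: they split text on whitespace and measure how many word n-grams are unique.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the list of word n-grams (as tuples) in whitespace-split text."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def distinct_ngram_ratio(text, n=3):
    """Fraction of n-grams that are unique; 1.0 means no n-gram repeats."""
    ngrams = word_ngrams(text, n)
    if not ngrams:
        return 1.0
    return len(set(ngrams)) / len(ngrams)


def most_repeated_ngram(text, n=3):
    """Return (ngram_string, count) for the most frequent n-gram, or None."""
    ngrams = word_ngrams(text, n)
    if not ngrams:
        return None
    ngram, count = Counter(ngrams).most_common(1)[0]
    return " ".join(ngram), count
```

A low ratio on generated text points to looping output. Any threshold for "too repetitive" is a judgment call and is left to the user.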

Limitations

This model should not be used for:

medical, legal, financial, or safety-critical advice
factual answering
public chatbot deployment
moderation
autonomous agents
high-trust tasks
benchmark claims

The model may reproduce biases, toxicity, or unsafe patterns present in web-scale training data.

Training Configuration

Approximate configuration used for the smoke-test run:

n_embd: 640
n_layer: 12
n_head: 10
context_length: 2048
micro_batch_size: 8
gradient_accumulation_steps: 16
learning_rate: 3e-4
weight_decay: 0.10
dataset: HuggingFaceFW/fineweb-edu
tokenizer: Qwen/Qwen2.5-0.5B
tokens_seen: 1,048,576
update_steps: 4
parameters: 157,459,840
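
The reported parameter count can be cross-checked against this configuration. The sketch below assumes a standard GPT-2 layout: learned position embeddings, biases on all linear layers, a 4x MLP expansion, two LayerNorms per block plus a final one, and an output head tied to the token embedding. It also assumes the vocabulary size is the 151,665 tokens given in the Tokenizer section. `n_head` does not affect the count.

```python
def gpt2_parameter_count(vocab_size, n_embd, n_layer, context_length):
    """Count parameters for a GPT-2 style model with tied output embeddings."""
    token_embedding = vocab_size * n_embd
    position_embedding = context_length * n_embd

    # Attention: fused QKV projection plus output projection, both with bias.
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)
    # MLP: expand to 4 * n_embd, then project back, both with bias.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Two LayerNorms per block, each with weight and bias.
    layer_norms = 2 * (2 * n_embd)

    per_block = attention + mlp + layer_norms
    final_layer_norm = 2 * n_embd
    return token_embedding + position_embedding + n_layer * per_block + final_layer_norm


total = gpt2_parameter_count(vocab_size=151_665, n_embd=640, n_layer=12, context_length=2048)
print(f"{total:,}")  # 157,459,840
```

Under these assumptions the total matches the reported count. That is consistent with a tied-embedding GPT-2 layout but does not confirm it.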

Environmental Impact

This was a short smoke-test run on a single Modal L40S GPU. No full-scale training run was performed for this checkpoint.

License

This model is released under the Apache License 2.0.

Attribution

Training data:

HuggingFaceFW/fineweb-edu

Tokenizer:

Qwen/Qwen2.5-0.5B

Citation

If you use this model, cite it as:

@misc{tralalabslm160mtest,
  title = {TralalabsLM-160M-Test},
  author = {Tralalabs},
  year = {2026},
  url = {https://huggingface.co/Tralalabs/TralalabsLM-160M-Test}
}

Status

This is an early experimental checkpoint.

Recommended next model: TralalabsLM-160M-Base, trained on a much larger token budget.

Safetensors

Model size: 0.2B params
Tensor type: F32