Instructions for using enochlev/llm-toddler-30 with libraries, inference providers, notebooks, and local apps.

Libraries

How to use enochlev/llm-toddler-30 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="enochlev/llm-toddler-30")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("enochlev/llm-toddler-30")
model = AutoModelForCausalLM.from_pretrained("enochlev/llm-toddler-30", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use enochlev/llm-toddler-30 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "enochlev/llm-toddler-30"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "enochlev/llm-toddler-30",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
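
The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server started above is listening on `localhost:8000`. The helper names `build_chat_request` and `chat` are hypothetical, and `max_tokens=40` simply mirrors the `max_new_tokens=40` used in the Transformers example.

```python
import json
import urllib.request

MODEL_ID = "enochlev/llm-toddler-30"


def build_chat_request(prompt, base_url="http://localhost:8000", max_tokens=40):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # keep replies short; the model has a 256-token context
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text (requires a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` prints the model's reply. Pass `base_url="http://localhost:30000"` to target the SGLang server described below instead.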

Use Docker

docker model run hf.co/enochlev/llm-toddler-30

SGLang

How to use enochlev/llm-toddler-30 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "enochlev/llm-toddler-30" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "enochlev/llm-toddler-30",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "enochlev/llm-toddler-30" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "enochlev/llm-toddler-30",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Unsloth Studio

How to use enochlev/llm-toddler-30 with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for enochlev/llm-toddler-30 to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for enochlev/llm-toddler-30 to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for enochlev/llm-toddler-30 to start chatting

Load model with FastModel

# Install first (in a shell): pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
    model_name="enochlev/llm-toddler-30",
    max_seq_length=2048,
)

Docker Model Runner
How to use enochlev/llm-toddler-30 with Docker Model Runner:
```
docker model run hf.co/enochlev/llm-toddler-30
```

Toddler-LLM (fully pre-trained on CHILDES)

Overview

Model name: Toddler-LLM
Type: Decoder-only small LM for toddler-like dialogue
Status: Fully pre-trained from scratch on child-directed speech, then SFT + GRPO
Primary language: English
Target behavior: Coherent, short, child-like responses (approx. 2–3 years old)
Parameter count: ~155M (see config below)
Intended domain: Parent–child conversational exchanges

Model architecture

hidden_size: 672
intermediate_size: 1809
num_hidden_layers: 31
num_attention_heads: 12
num_key_value_heads: 4
max_position_embeddings: 256
vocab_size: 8192
lower-case-only tokenizer
tie_word_embeddings: true
rope_theta: 10000.0
max input length: 256 tokens
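
As a sanity check on the ~155M figure, the parameter count can be estimated from the values above. This is a rough sketch that assumes a Llama-style decoder block (grouped-query attention, a gated MLP with three projections, two RMSNorm layers per block, no bias terms); the card does not state the block layout, so treat that as an assumption.

```python
# Config values copied from the list above.
hidden_size = 672
intermediate_size = 1809
num_hidden_layers = 31
num_attention_heads = 12
num_key_value_heads = 4
vocab_size = 8192

# Assumption: Llama-style block layout without biases.
head_dim = hidden_size // num_attention_heads   # 56
kv_dim = head_dim * num_key_value_heads         # 224 (grouped-query attention)

attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim  # q, o + k, v
mlp = 3 * hidden_size * intermediate_size                             # gate, up, down
norms = 2 * hidden_size                                               # two norms per block
per_layer = attention + mlp + norms

embeddings = vocab_size * hidden_size  # counted once because tie_word_embeddings is true
final_norm = hidden_size

total_params = num_hidden_layers * per_layer + embeddings + final_norm
print(f"{total_params / 1e6:.1f}M parameters")  # prints "155.9M parameters"
```

Under these assumptions the estimate lands at roughly 156M, consistent with the stated ~155M.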

Training data

Source: CHILDES (filtered, English-only)
Approx. 14M tokens after filtering
Pretraining exclusively on child-directed speech (no large-scale adult corpora)
Data filtering (for downstream SFT/GRPO): RM-4 scores caregiver utterance clarity and keeps the top 10% “helpful” caregiver prompts; RM-2 scores the coherence of child utterances

Training procedure

Stage 1: Pre-training
- Library: Nanotron
- Objective: next-token prediction
- Steps: 25,000 (~64 epochs)
- Peak learning rate: 0.0025
- Loss convergence: just above 1.0
Stage 2: Chat SFT
- Adapted to SmolLM2 Instruct chat template and special tokens
- Library: unsloth (response-only SFT)
- Curriculum on increasingly higher-quality subsets (by RM-2 and RM-4):
  - Top 10%: LR 9e-4, 2 epochs
  - Top 5%: LR 8e-4, 14 epochs
  - Top 2.5%: LR 7e-4, 7 epochs
  - Top 1.25%: LR 6e-4, 3 epochs
- Reached stable response coherence around loss ≈ 0.45
Stage 3: GRPO optimization
- GRPO learning rate: 1e-5
- LoRA rank doubled vs. Step 1 (rank=128); target modules: [q,k,v,o,gate,up,down]
- Steps: 1000
- Reward weights: RM-1 (1.0), RM-2 (0.2), RM-3 (0.5)
- Selected best checkpoint by manual inspection for coherence + child-likeness
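
The reward weights in Stage 3 can be read as coefficients of a combined score. The sketch below assumes the three reward-model outputs are merged as a plain weighted sum, which the card does not state explicitly; the function name `combined_reward` and the dictionary keys are hypothetical.

```python
# Weights from Stage 3 above; the weighted-sum combination itself is an assumption.
REWARD_WEIGHTS = {
    "rm1_childlike": 1.0,   # RM-1 (Toddler-BERT)
    "rm2_coherence": 0.2,   # RM-2 (Coherence-BERT)
    "rm3_length": 0.5,      # RM-3 (Length PMF)
}


def combined_reward(scores):
    """Return the weighted sum of per-model scores, each expected in [0, 1]."""
    missing = set(REWARD_WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing reward scores: {sorted(missing)}")
    return sum(weight * scores[name] for name, weight in REWARD_WEIGHTS.items())


example = {"rm1_childlike": 0.9, "rm2_coherence": 0.5, "rm3_length": 0.8}
print(round(combined_reward(example), 3))
```

With these weights the child-likeness score dominates, and coherence acts as a lighter corrective term.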

Reward models and data filters used

RM-1 (Toddler-BERT): BERT classifier for “child-like” style, available at enochlev/childish_behavior_model
RM-2 (Coherence-BERT): BERT classifier trained with soft labels for coherence, available at enochlev/child_coherence_model
- Labels: scored by Llama‑3.3‑70B on a 0.0–1.0 scale (batched for consistency)
- Training: 5 epochs, BCEWithLogitsLoss, LR 2e-5, weight decay 0.01, batch size 150, max length 96
RM-3 (Length PMF): Bayesian-based PMF over sentence lengths from CHILDES, min–max normalized to [0,1], temperature for smoothness; per-sentence score scaled by 1/max(1, number_of_punctuations) to encourage one short sentence
RM-4 (Caregiver clarity): LLM-scored question clarity; used as a filter only (not a reward) to select top 10% caregiver prompts
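
To make the RM-3 description more concrete, here is a simplified sketch of a length-based reward. It is not the authors' implementation: add-alpha smoothing stands in for the Bayesian estimate, length is measured in whitespace-separated words over the whole response, and the names `build_length_scores` and `length_reward` as well as the parameters `max_len`, `temperature`, and `alpha` are hypothetical.

```python
import re
from collections import Counter


def build_length_scores(utterances, max_len=20, temperature=1.0, alpha=1.0):
    """Return a list where index n holds a [0, 1] score for an n-word utterance."""
    counts = Counter(min(len(u.split()), max_len) for u in utterances)
    # Smoothed PMF over lengths 0..max_len (add-alpha smoothing is an assumption).
    total = sum(counts.values()) + alpha * (max_len + 1)
    pmf = [(counts.get(n, 0) + alpha) / total for n in range(max_len + 1)]
    # Temperature > 1 flattens the distribution, < 1 sharpens it.
    tempered = [p ** (1.0 / temperature) for p in pmf]
    norm = sum(tempered)
    tempered = [p / norm for p in tempered]
    # Min-max normalise to [0, 1].
    low, high = min(tempered), max(tempered)
    if high == low:
        return [1.0] * len(tempered)
    return [(p - low) / (high - low) for p in tempered]


def length_reward(response, scores):
    """Score a response by its length, divided by max(1, punctuation count)."""
    n_words = min(len(response.split()), len(scores) - 1)
    n_punct = len(re.findall(r"[.!?]", response))
    return scores[n_words] / max(1, n_punct)


# Toy corpus built from the gold child responses shown in this card.
corpus = ["a ice cream sandwich", "we put it in his cage", "look elmo", "i got it", "it broke"]
scores = build_length_scores(corpus)
print(length_reward("it broke.", scores), length_reward("it broke. it broke.", scores))
```

The division by the punctuation count means a response made of two short sentences scores lower than a single sentence of typical length, which matches the stated goal of encouraging one short sentence.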

Inference and prompt format

Chat template: compatible with SmolLM2 Instruct-style templates
Guidance:
- Input: single caregiver question or brief prompt
- Output: one short, coherent sentence with age-appropriate vocabulary
Example prompt format (generic):
- System: “You are a 2–3-year-old child speaking in short, simple sentences.”
- User: “Caregiver: What did you have for dessert for lunch?”
- Assistant (model): “i had some spaghettis.” (Note: spelling/grammar may be age-typical)
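
The prompt format above maps directly onto the `messages` list used in the Transformers example. The helper below is a minimal sketch; `build_messages` is a hypothetical name, and lower-casing the text is an assumption based on the lower-case-only tokenizer (it is harmless if the tokenizer already normalises case).

```python
SYSTEM_PROMPT = "You are a 2–3-year-old child speaking in short, simple sentences."


def build_messages(caregiver_utterance, lowercase=True):
    """Build a system + user message pair in the generic prompt format shown above."""
    # Collapse runs of whitespace so the prompt stays a single concise question.
    question = " ".join(caregiver_utterance.split())
    system_text = SYSTEM_PROMPT
    user_text = f"Caregiver: {question}"
    if lowercase:
        # Assumption: match the lower-case-only vocabulary before tokenization.
        system_text = system_text.lower()
        user_text = user_text.lower()
    return [
        {"role": "system", "content": system_text},
        {"role": "user", "content": user_text},
    ]


messages = build_messages("What did you find?")
print(messages[1]["content"])  # prints "caregiver: what did you find?"
```

The resulting list can be passed to `pipe(messages)` or `tokenizer.apply_chat_template(messages, ...)` as in the usage section.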

Example generations (from evaluation set)

Caregiver Utterance: What did you have for dessert for lunch?
- Gold Child Response: a ice cream sandwich
- llm-toddler-30: i had some spaghettis.
Caregiver Utterance: What did we do with the toy?
- Gold Child Response: we put it in his cage
- llm-toddler-30: it broke.
Caregiver Utterance: Where is your bag?
- Gold Child Response: I didn't bring it. It's in the car
- llm-toddler-30: i got it
Caregiver Utterance: What did you find?
- Gold Child Response: look. Elmo
- llm-toddler-30: i found luke's lightsaber.
Caregiver Utterance: What did you put on the pizza?
- Gold Child Response: cheese and cheese
- llm-toddler-30: i put sause on it

Evaluation

Human evaluation (19 participants; two robot platforms: Cozmo, Misty II):
- Perceived age: ~3 years for our model, closer to the 2–3-year target than the comparison models
- Coherence: comparable across models; higher with Cozmo than Misty on average
- AoA (age of acquisition) and vocabulary: humans used a broader vocabulary with higher-AoA words; the models stayed at lower AoA, as intended; some SmolLM variants occasionally produced adult-level content
- Notable: Participant expectations matched Cozmo’s child-like morphology/voice better than Misty’s

Intended use

Research on child-like conversational agents and human-robot interaction
Simulated child responses to caregiver prompts

Out-of-scope and limitations

Not for clinical, diagnostic, educational placement, or childcare decision-making
English-only; small corpus (≈14M tokens); limited world knowledge
Can produce off-context, random child-like words; may fixate on certain “baby words”
May generate age-inappropriate content in rare cases; monitor outputs
Sensitive to prompt phrasing; best with concise caregiver questions

Safety and ethical considerations

Use responsibly around minors; ensure adult supervision in interactive settings
Avoid anthropomorphizing beyond research context
Respect CHILDES data licenses and privacy norms
Models may reflect biases or artifacts from child-directed corpora

Safetensors

Model size: 0.2B params
Tensor type: BF16