Instructions to use enochlev/llm-toddler-30 with libraries, notebooks, and local apps.
- Libraries
- Transformers
How to use enochlev/llm-toddler-30 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="enochlev/llm-toddler-30")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("enochlev/llm-toddler-30")
model = AutoModelForCausalLM.from_pretrained("enochlev/llm-toddler-30")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use enochlev/llm-toddler-30 with vLLM:
Install from pip and serve model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "enochlev/llm-toddler-30"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "enochlev/llm-toddler-30",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker
```bash
docker model run hf.co/enochlev/llm-toddler-30
```
- SGLang
How to use enochlev/llm-toddler-30 with SGLang:
Install from pip and serve model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "enochlev/llm-toddler-30" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "enochlev/llm-toddler-30",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "enochlev/llm-toddler-30" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "enochlev/llm-toddler-30",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
```
- Unsloth Studio
How to use enochlev/llm-toddler-30 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```bash
curl -fsSL https://unsloth.ai/install.sh | sh

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for enochlev/llm-toddler-30 to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for enochlev/llm-toddler-30 to start chatting
```
Using HuggingFace Spaces for Unsloth
```bash
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for enochlev/llm-toddler-30 to start chatting
```
Load model with FastModel
```bash
pip install unsloth
```
```python
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="enochlev/llm-toddler-30",
    max_seq_length=2048,
)
```
- Docker Model Runner
How to use enochlev/llm-toddler-30 with Docker Model Runner:
docker model run hf.co/enochlev/llm-toddler-30
Toddler-LLM (fully pre-trained on CHILDES)
Overview
- Model name: Toddler-LLM
- Type: Decoder-only small LM for toddler-like dialogue
- Status: Fully pre-trained from scratch on child-directed speech, then SFT + GRPO
- Primary language: English
- Target behavior: Coherent, short, child-like responses (approx. 2–3 years old)
- Parameter count: ~155M (see config below)
- Intended domain: Parent–child conversational exchanges
Model architecture
- hidden_size: 672
- intermediate_size: 1809
- num_hidden_layers: 31
- num_attention_heads: 12
- num_key_value_heads: 4
- max_position_embeddings: 256
- vocab_size: 8192
- lower-case-only tokenizer
- tie_word_embeddings: true
- rope_theta: 10000.0
- max input length: 256 tokens
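The ~155M parameter figure can be sanity-checked from the configuration values above. The sketch below assumes a Llama-style decoder block (bias-free q/k/v/o projections with grouped-query attention, a gated MLP with three weight matrices, and two RMSNorm weight vectors per layer). That layout is an assumption, not something this card states, and the function name is hypothetical.

```python
def llama_style_param_count(hidden, intermediate, layers, heads, kv_heads,
                            vocab, tied=True):
    """Count parameters for an ASSUMED Llama-style decoder (no biases)."""
    head_dim = hidden // heads
    q_proj = hidden * heads * head_dim            # query projection
    kv_proj = 2 * hidden * kv_heads * head_dim    # key + value (grouped-query)
    o_proj = heads * head_dim * hidden            # output projection
    mlp = 3 * hidden * intermediate               # gate, up, down matrices
    norms = 2 * hidden                            # two norm weight vectors per layer
    per_layer = q_proj + kv_proj + o_proj + mlp + norms

    embeddings = vocab * hidden
    lm_head = 0 if tied else vocab * hidden       # tied head reuses the embeddings
    final_norm = hidden
    return layers * per_layer + embeddings + lm_head + final_norm


total = llama_style_param_count(
    hidden=672, intermediate=1809, layers=31,
    heads=12, kv_heads=4, vocab=8192, tied=True,
)
print(f"{total:,}")  # 155,933,568
```

Under these assumptions the count comes to roughly 156M, consistent with the stated ~155M. Most of the budget sits in the 31 transformer layers, and the small 8192-entry tied vocabulary contributes only about 5.5M parameters.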
Training data
- Source: CHILDES (filtered, English-only)
- Approx. 14M tokens after filtering
- Pretraining exclusively on child-directed speech (no large-scale adult corpora)
- Data filtering (for downstream SFT/GRPO): caregiver utterance clarity via RM-4 to select top 10% “helpful” caregiver prompts; coherence scoring for child utterances via RM-2
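The top-10% selection described above can be sketched as a simple rank-and-cut over per-item scores. This is a minimal illustration, not the authors' pipeline: `top_fraction` is a hypothetical helper, and the prompts and scores below are made-up placeholders standing in for RM-4 clarity scores.

```python
import math


def top_fraction(items, scores, fraction):
    """Keep the highest-scoring `fraction` of items, preserving original order."""
    if len(items) != len(scores):
        raise ValueError("items and scores must have the same length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not items:
        return []
    # Small epsilon guards against float error such as 10 * 0.3 -> 3.0000000001
    k = max(1, math.ceil(len(items) * fraction - 1e-9))
    ranked = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])
    return [items[i] for i in keep]


# Hypothetical caregiver prompts with hypothetical clarity scores in [0, 1]
prompts = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"]
clarity = [0.1, 0.9, 0.3, 0.8, 0.2, 0.5, 0.4, 0.7, 0.6, 0.0]

top_10_percent = top_fraction(prompts, clarity, 0.10)
top_20_percent = top_fraction(prompts, clarity, 0.20)
```

The same helper could be called with a smaller fraction to build a stricter subset, since each cut is taken from the same ranking.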
Training procedure
- Stage 1: Pre-training
- Library: Nanotron
- Objective: next-token prediction
- Steps: 25,000 (~64 epochs)
- Peak learning rate: 0.0025
- Final loss: converged just above 1.0
- Stage 2: Chat SFT
- Adapted to SmolLM2 Instruct chat template and special tokens
- Library: unsloth (response-only SFT)
- Curriculum on increasingly higher-quality subsets (by RM-2 and RM-4):
- Top 10%: LR 9e-4, 2 epochs
- Top 5%: LR 8e-4, 14 epochs
- Top 2.5%: LR 7e-4, 7 epochs
- Top 1.25%: LR 6e-4, 3 epochs
- Reached stable response coherence around loss ≈ 0.45
- Stage 3: GRPO optimization
- GRPO learning rate: 1e-5
- LoRA rank doubled vs. Step 1 (rank=128); target modules: [q,k,v,o,gate,up,down]
- Steps: 1000
- Reward weights: RM-1 (1.0), RM-2 (0.2), RM-3 (0.5)
- Selected best checkpoint by manual inspection for coherence + child-likeness
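The Stage 3 reward weights can be read as coefficients of a weighted sum over the three reward-model scores. The card lists only the weights, so the summation itself is an assumption, as are the dictionary keys and function names below. The second helper shows the group-relative normalization that GRPO applies to rewards of completions sampled for the same prompt. Population standard deviation is used here as a simplifying choice, and training libraries may differ.

```python
import statistics

# Weights from the card; the key names are hypothetical labels
REWARD_WEIGHTS = {"rm1_childlike": 1.0, "rm2_coherence": 0.2, "rm3_length": 0.5}


def combined_reward(scores, weights=REWARD_WEIGHTS):
    """Weighted sum of per-model scores (ASSUMED combination rule)."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


def group_advantages(rewards, eps=1e-8):
    """Normalize rewards within one prompt's group of sampled completions."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std; an illustrative choice
    return [(r - mean) / (std + eps) for r in rewards]


# Hypothetical scores for three sampled completions of one caregiver prompt
group = [
    {"rm1_childlike": 0.9, "rm2_coherence": 0.8, "rm3_length": 1.0},
    {"rm1_childlike": 0.4, "rm2_coherence": 0.9, "rm3_length": 0.5},
    {"rm1_childlike": 0.7, "rm2_coherence": 0.2, "rm3_length": 0.0},
]
rewards = [combined_reward(s) for s in group]
advantages = group_advantages(rewards)
```

With RM-1 weighted five times as heavily as RM-2, child-likeness dominates the signal, while coherence and length act as weaker corrective terms.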
Reward models and data filters used
- RM-1 (Toddler-BERT): BERT classifier for “child-like” style, available at enochlev/childish_behavior_model
- RM-2 (Coherence-BERT): BERT classifier trained with soft labels for coherence, available at enochlev/child_coherence_model
- Labels: scored by Llama‑3.3‑70B on a 0.0–1.0 scale (batched for consistency)
- Training: 5 epochs, BCEWithLogitsLoss, LR 2e-5, weight decay 0.01, batch size 150, max length 96
- RM-3 (Length PMF): Bayesian-based PMF over sentence lengths from CHILDES, min–max normalized to [0,1] and smoothed with a temperature parameter; each per-sentence score is scaled by 1/max(1, number_of_punctuations) to encourage a single short sentence
- RM-4 (Caregiver clarity): LLM-scored question clarity; used as a filter only (not a reward) to select top 10% caregiver prompts
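The RM-3 length reward can be illustrated with a small self-contained sketch. Several details are assumptions rather than facts from this card: add-alpha smoothing stands in for the "Bayesian-based" estimate, length is counted in whitespace-separated words, and only `.`, `!`, and `?` count as punctuation. The function names and the toy length sample are hypothetical.

```python
import re
from collections import Counter


def build_length_scores(lengths, max_len, temperature=1.0, alpha=1.0):
    """Map each length 1..max_len to a score in [0, 1]."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    counts = Counter(n for n in lengths if 1 <= n <= max_len)
    total = sum(counts.values()) + alpha * max_len
    # Add-alpha smoothed PMF (posterior mean under a symmetric Dirichlet prior)
    pmf = {n: (counts.get(n, 0) + alpha) / total for n in range(1, max_len + 1)}
    # Temperature > 1 flattens the distribution, < 1 sharpens it
    tempered = {n: p ** (1.0 / temperature) for n, p in pmf.items()}
    lo, hi = min(tempered.values()), max(tempered.values())
    span = hi - lo
    # Min-max normalize so the most typical length scores 1.0
    return {n: ((v - lo) / span if span > 0 else 1.0) for n, v in tempered.items()}


def length_reward(text, scores):
    """Score a response by its length, penalizing multi-sentence outputs."""
    n_words = len(text.split())
    n_punct = len(re.findall(r"[.!?]", text))
    return scores.get(n_words, 0.0) / max(1, n_punct)


# Toy sample of child utterance lengths (hypothetical, not CHILDES statistics)
toy_lengths = [1, 2, 2, 3, 3, 3, 4, 4, 5]
scores = build_length_scores(toy_lengths, max_len=8)

one_sentence = length_reward("i found it.", scores)
two_sentences = length_reward("it broke. i did.", scores)
```

In this toy setup the single three-word sentence receives the maximum score, while the two-sentence reply is divided by its punctuation count, which is the mechanism that pushes the policy toward one short sentence.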
Inference and prompt format
- Chat template: compatible with SmolLM2 Instruct-style templates
- Guidance:
- Input: single caregiver question or brief prompt
- Output: one short, coherent sentence with age-appropriate vocabulary
- Example prompt format (generic):
- System: “You are a 2–3-year-old child speaking in short, simple sentences.”
- User: “Caregiver: What did you have for dessert for lunch?”
- Assistant (model): “i had some spaghettis.” (Note: spelling/grammar may be age-typical)
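The prompt format above can be assembled programmatically. The following is a minimal sketch: `build_messages` and `generate_reply` are hypothetical helper names, and `pipe` is assumed to be a Transformers `text-generation` pipeline called with chat-style input, which returns the conversation with the assistant turn appended. Nothing is downloaded or executed until `generate_reply` is called with a real pipeline.

```python
SYSTEM_PROMPT = "You are a 2–3-year-old child speaking in short, simple sentences."


def build_messages(caregiver_utterance):
    """Wrap one caregiver question in the system/user format shown above."""
    text = caregiver_utterance.strip()
    if not text:
        raise ValueError("caregiver_utterance must not be empty")
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Caregiver: {text}"},
    ]


def generate_reply(pipe, caregiver_utterance, max_new_tokens=40):
    """Run a chat pipeline and return only the assistant's text."""
    outputs = pipe(build_messages(caregiver_utterance), max_new_tokens=max_new_tokens)
    # For chat input, generated_text holds the full message list;
    # the last entry is the newly generated assistant turn.
    return outputs[0]["generated_text"][-1]["content"]


# Example usage (requires transformers and downloads the model):
# from transformers import pipeline
# pipe = pipeline("text-generation", model="enochlev/llm-toddler-30")
# print(generate_reply(pipe, "What did you find?"))
```

Because the context window is 256 tokens, keeping the caregiver turn to a single short question leaves the most room for the reply.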
Example generations (from evaluation set)
- Caregiver Utterance: What did you have for dessert for lunch?
- Gold Child Response: a ice cream sandwich
- llm-toddler-30: i had some spaghettis.
- Caregiver Utterance: What did we do with the toy?
- Gold Child Response: we put it in his cage
- llm-toddler-30: it broke.
- Caregiver Utterance: Where is your bag?
- Gold Child Response: I didn't bring it. It's in the car
- llm-toddler-30: i got it
- Caregiver Utterance: What did you find?
- Gold Child Response: look. Elmo
- llm-toddler-30: i found luke's lightsaber.
- Caregiver Utterance: What did you put on the pizza?
- Gold Child Response: cheese and cheese
- llm-toddler-30: i put sause on it
Evaluation
- Human evaluation (19 participants; two robot platforms: Cozmo, Misty II):
- Perceived age: ~3 years (closer to target for our model)
- Coherence: comparable across models; higher with Cozmo than Misty on average
- AoA and vocabulary: humans used broader vocabulary and higher AoA; models remained lower AoA as intended; some SmolLM variants occasionally produced adult-level content
- Notable: Participant expectations matched Cozmo’s child-like morphology/voice better than Misty’s
Intended use
- Research on child-like conversational agents and human-robot interaction
- Simulated child responses to caregiver prompts
Out-of-scope and limitations
- Not for clinical, diagnostic, educational placement, or childcare decision-making
- English-only; small corpus (≈14M tokens); limited world knowledge
- Can produce off-context, random child-like words; may fixate on certain “baby words”
- May generate age-inappropriate content in rare cases; monitor outputs
- Sensitive to prompt phrasing; best with concise caregiver questions
Safety and ethical considerations
- Use responsibly around minors; ensure adult supervision in interactive settings
- Avoid anthropomorphizing beyond research context
- Respect CHILDES data licenses and privacy norms
- Models may reflect biases or artifacts from child-directed corpora