---
language:
- en
license: apache-2.0
tags:
- causal-lm
- reasoning
- thought-experiments
- chain-of-thought
- sft
- dpo
- alignment
- small-language-model
- custom-architecture
base_model: tensorfiend/DotLM-165M
datasets:
- tensorfiend/SimpleThoughts
pipeline_tag: text-generation
library_name: transformers
---
# DotLM

DotLM is a minimal 165M-parameter transformer trained from scratch entirely on the SimpleThoughts dataset. It uses explicit `<think>...</think>` chain-of-thought traces to reason through intuitive physics, logic, causal inference, and other everyday phenomena before producing an answer.
## Model Details

### Architecture
| Parameter | Value |
|---|---|
| Parameters | ~165M |
| Layers | 24 |
| Model dimension | 768 |
| FFN hidden dim | 2048 (SwiGLU) |
| Attention heads | 6 |
| KV heads (GQA) | 2 |
| Head dimension | 128 |
| Context length | 4096 tokens |
| Vocabulary size | 16,384 (BPE) |
| Positional encoding | RoPE (θ = 10,000) |
| Normalization | RMSNorm (ε = 1e-6) |
| Tied embeddings | Yes |
Key design choices: Grouped-Query Attention (GQA) with 3:1 head ratio for efficient KV memory, SwiGLU activations, pre-norm architecture, and bf16 mixed-precision training throughout.
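To make the GQA layout concrete, here is a minimal sketch (not the actual DotLM implementation) of how 2 KV heads are shared across 6 query heads, using the dimensions from the table above:

```python
import torch
import torch.nn.functional as F

# DotLM's attention shape: 6 query heads, 2 KV heads (3:1 GQA), head dim 128
n_heads, n_kv_heads, head_dim, seq_len = 6, 2, 128, 16
group = n_heads // n_kv_heads  # 3 query heads share each KV head

q = torch.randn(1, n_heads, seq_len, head_dim)
k = torch.randn(1, n_kv_heads, seq_len, head_dim)
v = torch.randn(1, n_kv_heads, seq_len, head_dim)

# Expand each KV head to serve its group of query heads; only the
# 2-head K/V tensors need to be cached, cutting KV memory by 3x.
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([1, 6, 16, 128])
```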
## Training Pipeline
The model was trained sequentially across four stages using the DotLM framework:
| Stage | Dataset | Samples | Objective |
|---|---|---|---|
| Pretraining | SimpleThoughts/pretrain | 352,214 | Next-token prediction |
| SFT | SimpleThoughts/sft | 25,788 | ChatML instruction following |
| Alignment | SimpleThoughts/alignment | 7,172 | Reference-free DPO (SimPO-style) |
| Reasoning | SimpleThoughts/reasoning | 6,300 | Chain-of-thought with `<think>` traces |
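The alignment stage uses a reference-free, SimPO-style objective. A hedged sketch of the length-normalized loss, assuming the standard SimPO formulation (per-token average log-probability as the implicit reward, a scaling factor beta, and a target margin gamma; the values below are illustrative, not the training hyperparameters):

```python
import torch
import torch.nn.functional as F

def simpo_loss(chosen_logps, rejected_logps, chosen_len, rejected_len,
               beta=2.0, gamma=0.5):
    """Reference-free preference loss: no frozen reference model needed."""
    # Length-normalized sequence log-probs act as implicit rewards
    r_chosen = beta * chosen_logps / chosen_len
    r_rejected = beta * rejected_logps / rejected_len
    # Push the chosen-vs-rejected reward gap past the margin gamma
    return -F.logsigmoid(r_chosen - r_rejected - gamma).mean()

loss = simpo_loss(torch.tensor([-10.0]), torch.tensor([-24.0]),
                  torch.tensor(20.0), torch.tensor(20.0))
print(loss.item())
```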
### Special Tokens

| Token | Purpose |
|---|---|
| `<\|im_start\|>` | Start of turn (BOS) |
| `<\|im_end\|>` | End of turn |
| `<think>` | Begin reasoning trace |
| `</think>` | End reasoning trace |
| `<endoftext>` | End of sequence (EOS) |
| `<pad>` | Padding |
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "tensorfiend/DotLM-165M"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
).to(device)

# ChatML prompt with the reasoning prefix pre-opened
user_query = "If a ball is placed inside a box and the box is sealed, where is the ball?"
prompt = f"<|im_start|>user\n{user_query}<|im_end|>\n<|im_start|>assistant\n<think>"

inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_k=50,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```
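Since the completion interleaves a reasoning trace with the final answer, a small helper (illustrative, not part of the released code) can split the decoded text on the closing `</think>` tag:

```python
def split_reasoning(text: str):
    """Return (thoughts, answer) from a '<think>...</think>answer' completion."""
    if "</think>" in text:
        thoughts, _, answer = text.partition("</think>")
        return thoughts.replace("<think>", "").strip(), answer.strip()
    # No closing tag: treat the whole completion as the answer
    return "", text.strip()

thoughts, answer = split_reasoning(
    "<think>The box is sealed, so the ball stays put.</think>Inside the box."
)
print(answer)  # Inside the box.
```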
## Prompt Format

DotLM uses the ChatML format with an explicit reasoning prefix:

```
<|im_start|>user
{your question}<|im_end|>
<|im_start|>assistant
<think>
{model reasons here}
</think>
{final answer}
```
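A tiny helper that builds this prompt string programmatically (an assumption for convenience; the repo may also ship a chat template, in which case `tokenizer.apply_chat_template` would be preferable):

```python
def build_prompt(question: str) -> str:
    """Wrap a question in ChatML and pre-open the <think> reasoning tag."""
    return (
        f"<|im_start|>user\n{question}<|im_end|>\n"
        f"<|im_start|>assistant\n<think>"
    )

print(build_prompt("Why does ice float on water?"))
```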
## Performance & Limitations
- Scale: At 165M parameters, DotLM is a research-scale model. It is not competitive with large-scale LLMs on general benchmarks.
- Domain: The model is specialized on thought experiments — intuitive physics, causal reasoning, spatial reasoning, theory of mind, and related domains. It may underperform on unrelated topics.
- Reasoning quality: The chain-of-thought traces are coherent on in-distribution thought experiments but may hallucinate or ramble on out-of-distribution inputs.
- Context: Maximum context length is 4,096 tokens.
- Safety: No RLHF safety training was applied. Not suitable for deployment in user-facing products without additional safety measures.
## Training Details

Check out the blog for training details: DotLM - An end-to-end trained 165M model (coming soon)
## Related Resources
- Dataset: SimpleThoughts
- Training code: DotLM (coming soon)
## Citation

```bibtex
@misc{dotlm2026,
  author    = {Shanmukh},
  title     = {DotLM-165M: A Minimal Reasoning Language Model Trained on Thought Experiments},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/tensorfiend/DotLM-165M}
}
```