---
language:
- en
license: apache-2.0
tags:
- causal-lm
- reasoning
- thought-experiments
- chain-of-thought
- sft
- dpo
- alignment
- small-language-model
- custom-architecture
base_model: tensorfiend/DotLM-165M
datasets:
- tensorfiend/SimpleThoughts
pipeline_tag: text-generation
library_name: transformers
---

# DotLM

DotLM is a minimal 165M-parameter transformer trained from scratch entirely on the
[SimpleThoughts](https://huggingface.co/datasets/tensorfiend/SimpleThoughts) dataset. It uses explicit `<think>...</think>`
chain-of-thought traces to reason through intuitive physics, logic, causal inference, and other everyday phenomena before producing an
answer.

## Model Details

### Architecture

| Parameter | Value |
|---|---|
| Parameters | ~165M |
| Layers | 24 |
| Model dimension | 768 |
| FFN hidden dim | 2048 (SwiGLU) |
| Attention heads | 6 |
| KV heads (GQA) | 2 |
| Head dimension | 128 |
| Context length | 4096 tokens |
| Vocabulary size | 16,384 (BPE) |
| Positional encoding | RoPE (θ = 10,000) |
| Normalization | RMSNorm (ε = 1e-6) |
| Tied embeddings | Yes |

**Key design choices:** Grouped-Query Attention (GQA) with 3:1 head ratio for efficient KV memory, SwiGLU activations, pre-norm
architecture, and bf16 mixed-precision training throughout.
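As a sanity check, the quoted ~165M parameter count can be roughly reproduced from the table above. This is a back-of-envelope sketch that ignores RMSNorm weights and assumes bias-free projections with tied embeddings, so the exact total depends on implementation details:

```python
# Rough parameter count from the architecture table (norms and biases omitted).
d_model, n_layers, ffn_dim = 768, 24, 2048
n_heads, n_kv_heads, head_dim = 6, 2, 128
vocab = 16_384

embed = vocab * d_model                        # input embedding, tied with LM head
attn = (d_model * n_heads * head_dim           # Q projection
        + 2 * d_model * n_kv_heads * head_dim  # K and V projections (GQA)
        + n_heads * head_dim * d_model)        # output projection
ffn = 3 * d_model * ffn_dim                    # SwiGLU: gate, up, and down projections
total = embed + n_layers * (attn + ffn)
print(f"{total / 1e6:.1f}M")                   # ≈ 163.6M, consistent with the quoted ~165M
```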

### Training Pipeline

The model was trained sequentially across four stages using the [DotLM framework](https://github.com/shanmukh05/DotLM):

| Stage | Dataset | Samples | Objective |
|---|---|---|---|
| Pretraining | SimpleThoughts/pretrain | 352,214 | Next-token prediction |
| SFT | SimpleThoughts/sft | 25,788 | ChatML instruction following |
| Alignment | SimpleThoughts/alignment | 7,172 | Reference-free DPO (SimPO-style) |
| Reasoning | SimpleThoughts/reasoning | 6,300 | Chain-of-thought with `<think>` traces |
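The alignment stage uses a reference-free, SimPO-style objective: instead of comparing against a frozen reference model as in standard DPO, it contrasts length-normalized policy log-probabilities of chosen and rejected responses with a target margin. A minimal sketch of such a loss (the function name, `beta`, and `gamma` values here are illustrative assumptions, not the training repo's actual hyperparameters):

```python
import torch
import torch.nn.functional as F

def simpo_loss(logp_chosen, logp_rejected, len_chosen, len_rejected,
               beta=2.0, gamma=0.5):
    """Reference-free preference loss (SimPO-style sketch).

    logp_* are the summed token log-probs of each response under the policy;
    len_* are the response lengths used for length normalization.
    """
    reward_chosen = beta * logp_chosen / len_chosen      # avg log-prob, scaled
    reward_rejected = beta * logp_rejected / len_rejected
    # Push chosen above rejected by at least the margin gamma.
    return -F.logsigmoid(reward_chosen - reward_rejected - gamma).mean()
```

Because no reference model is involved, this stage needs only a single forward pass per response, which keeps the alignment step cheap at this model scale.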

### Special Tokens

| Token | Purpose |
|---|---|
| `<\|im_start\|>` | Start of turn (BOS) |
| `<\|im_end\|>` | End of turn |
| `<think>` | Begin reasoning trace |
| `</think>` | End reasoning trace |
| `<endoftext>` | End of sequence (EOS) |
| `<pad>` | Padding |

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "tensorfiend/DotLM-165M"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16 if torch.cuda.is_available() else torch.float32,
).to(device)

user_query = "If a ball is placed inside a box and the box is sealed, where is the ball?"

prompt = f"<|im_start|>user\n{user_query}<|im_end|>\n<|im_start|>assistant\n<think>"

inputs = tokenizer(prompt, return_tensors="pt").to(device)

outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_k=50,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=False))
```

### Prompt Format

DotLM uses the ChatML format with an explicit reasoning prefix:

```
<|im_start|>user
{your question}<|im_end|>
<|im_start|>assistant
<think>
{model reasons here}
</think>
{final answer}
```
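Since the model emits its reasoning trace before the answer, downstream code typically splits the generation on the `</think>` delimiter. A hypothetical helper (not part of the DotLM repo) might look like:

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning trace, final answer) at </think>."""
    thought, sep, answer = text.partition("</think>")
    if not sep:  # no closing tag emitted: treat the whole output as the answer
        return "", text.strip()
    return thought.replace("<think>", "").strip(), answer.strip()
```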

## Performance & Limitations

- Scale: At 165M parameters, DotLM is a research-scale model. It is not competitive with large-scale LLMs on general benchmarks.
- Domain: The model is specialized on thought experiments — intuitive physics, causal reasoning, spatial reasoning, theory of mind, and
related domains. It may underperform on unrelated topics.
- Reasoning quality: The chain-of-thought traces are coherent on in-distribution thought experiments but may hallucinate or ramble on
out-of-distribution inputs.
- Context: Maximum context length is 4,096 tokens.
- Safety: No RLHF safety training was applied. Not suitable for deployment in user-facing products without additional safety measures.

## Training Details

Check out the blog post for full training details: [DotLM - An end-to-end trained 165M model](https://www.tensorwrites.com/) (coming soon)

## Related Resources

- Dataset: [SimpleThoughts](https://huggingface.co/datasets/tensorfiend/SimpleThoughts)
- Training code: [DotLM](https://github.com/shanmukh05/DotLM) (coming soon)

## Citation

```bibtex
@misc{dotlm2026,
  author    = {Shanmukh},
  title     = {DotLM-165M: A Minimal Reasoning Language Model Trained on Thought Experiments},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/tensorfiend/DotLM-165M}
}
```

## License

This model is released under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).