TACIT - Transformation-Aware Capturing of Implicit Thought
TACIT is a diffusion transformer that learns to solve mazes directly from images. It shows emergent reasoning capabilities without explicit language supervision, which makes it a useful subject for interpretability research on world models.
Model Description
TACIT learns to transform images of unsolved mazes into solved mazes using a flow-matching diffusion process. The model develops internal representations of maze structure and pathfinding without being explicitly programmed to do so.
Architecture
| Component | Specification |
|---|---|
| Type | Diffusion Transformer (DiT) |
| Hidden Dimension | 384 |
| Transformer Blocks | 8 |
| Attention Heads | 6 |
| Patch Size | 8×8 |
| Input/Output | 64×64 RGB images |
| Parameters | ~20M |
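The table implies the sequence length and per-head width the transformer works with. The arithmetic below is a quick check, assuming non-overlapping patches and an even split of the hidden dimension across heads. Both are standard for DiT-style models but are not confirmed here for TACIT.

```python
# Shape arithmetic derived from the architecture table.
# Assumes non-overlapping patches and equal-width attention heads.
image_size = 64      # pixels per side
patch_size = 8       # pixels per patch side
channels = 3         # RGB
hidden_dim = 384
num_heads = 6

patches_per_side = image_size // patch_size        # patches along one side
num_tokens = patches_per_side ** 2                 # tokens per image
patch_values = patch_size * patch_size * channels  # raw pixel values per patch
head_dim = hidden_dim // num_heads                 # width of each attention head

print(num_tokens, patch_values, head_dim)
```

Under these assumptions each maze becomes a short sequence of 64 tokens, which keeps full self-attention cheap at this resolution.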
Training
- Dataset: 1,000,000 maze problem-solution pairs
- Epochs: 100
- Final Loss: 6.25e-06 (MSE)
- Final L2 Distance: 0.0014
The model learns to predict the velocity field in a flow-matching formulation, transforming unsolved mazes into their solutions through iterative refinement.
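As a rough illustration of this objective, the sketch below computes a flow-matching loss for one batch. It assumes the common straight-line (rectified-flow) formulation, where the point at time `t` is `(1 - t) * x0 + t * x1` and the target velocity is `x1 - x0`; the interpolation path TACIT actually uses is not stated here. `flow_matching_loss` and `predict_velocity` are hypothetical names, and NumPy stands in for PyTorch to keep the example light.

```python
import numpy as np

def flow_matching_loss(predict_velocity, x0, x1, rng):
    """MSE between predicted and target velocity at a random time t.

    Assumes a straight-line path from x0 (unsolved maze) to x1 (solved maze).
    This is an illustrative assumption, not TACIT's confirmed training code.
    """
    batch = x0.shape[0]
    # One time value per sample, broadcast over channels and pixels.
    t = rng.uniform(0.0, 1.0, size=(batch, 1, 1, 1))
    x_t = (1.0 - t) * x0 + t * x1      # point on the straight path at time t
    target = x1 - x0                   # constant velocity along that path
    pred = predict_velocity(x_t, t)
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
x0 = rng.uniform(size=(4, 3, 64, 64))  # stand-in for unsolved mazes
x1 = rng.uniform(size=(4, 3, 64, 64))  # stand-in for solved mazes

# A perfect predictor reproduces the target, so its loss is zero.
oracle_loss = flow_matching_loss(lambda x_t, t: x1 - x0, x0, x1, rng)
# A predictor that always outputs zero has a strictly positive loss.
zero_loss = flow_matching_loss(lambda x_t, t: np.zeros_like(x_t), x0, x1, rng)
```

In a real training loop `predict_velocity` would be the network's forward pass, and the loss would be backpropagated rather than returned as a float.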
Usage
import torch
from safetensors.torch import load_file
# Model architecture (copy from tacit/models/dit.py or install the tacit package)
from tacit import TACITModel, sample_euler_method
# Load model
model = TACITModel()
state_dict = load_file('tacit_epoch_100.safetensors')
# Checkpoints saved from a torch.compile-wrapped model prefix every key with '_orig_mod.'
if next(iter(state_dict)).startswith('_orig_mod.'):
    state_dict = {k.removeprefix('_orig_mod.'): v for k, v in state_dict.items()}
model.load_state_dict(state_dict)
model.eval()
# Inference
# x0: input maze tensor (batch, 3, 64, 64), values in [0, 1]
x0 = torch.rand(1, 3, 64, 64)  # placeholder input; replace with a real maze tensor
with torch.no_grad():
    solution = sample_euler_method(model, x0, num_steps=10)
Inference Configuration
| Parameter | Value |
|---|---|
| Sampling Method | Euler |
| Recommended Steps | 10 |
| Output Range | [0, 1] |
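To make the Euler setting concrete, here is a minimal sketch of fixed-step Euler integration of a velocity field from t = 0 to t = 1. `euler_sample` and `velocity_fn` are hypothetical names; this is not the package's `sample_euler_method`, whose internals are not shown here. The final clip to [0, 1] is an assumption based on the documented output range.

```python
import numpy as np

def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed steps."""
    x = np.array(x0, dtype=np.float64)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        x = x + dt * velocity_fn(x, t)   # one explicit Euler update
    # Assumed post-processing: keep values in the documented [0, 1] range.
    return np.clip(x, 0.0, 1.0)

rng = np.random.default_rng(0)
start = rng.uniform(size=(1, 3, 64, 64))
goal = rng.uniform(size=(1, 3, 64, 64))

# Toy velocity field: the constant straight-line velocity toward `goal`.
result = euler_sample(lambda x, t: goal - start, start, num_steps=10)
```

With a constant velocity field the number of steps does not affect the end point. A learned field varies with `x` and `t`, so the step count trades accuracy against compute.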
Maze Format
- Resolution: 64×64 pixels, RGB
- Color Scheme:
  - White (255, 255, 255): Paths
  - Black (0, 0, 0): Walls
  - Green (0, 255, 0): Entry/Exit points
  - Red (255, 0, 0): Solution path (in solved mazes)
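The format above can be turned into a model input along these lines. This is a sketch only: `render_maze`, the `COLORS` keys, and the choice to upscale each grid cell to an equal block of pixels are all illustrative assumptions, and the dataset's actual rendering may differ.

```python
import numpy as np

# Colors from the scheme above (RGB, 0-255). The key names are hypothetical.
COLORS = {
    "path": (255, 255, 255),
    "wall": (0, 0, 0),
    "endpoint": (0, 255, 0),
    "solution": (255, 0, 0),
}

def render_maze(cells, image_size=64):
    """Render a square grid of cell labels into a (3, H, W) float array in [0, 1].

    `cells` is a list of rows, each a list of keys from COLORS. The grid side
    must divide `image_size`; this cell-to-pixel scaling is an assumption.
    """
    grid = np.array([[COLORS[label] for label in row] for row in cells], dtype=np.uint8)
    scale = image_size // grid.shape[0]
    if grid.shape[0] != grid.shape[1] or scale * grid.shape[0] != image_size:
        raise ValueError("grid must be square and divide image_size evenly")
    # Upscale each cell to a scale x scale block of pixels.
    image = np.repeat(np.repeat(grid, scale, axis=0), scale, axis=1)
    # Channels-first layout and [0, 1] range, as the usage example expects.
    return image.transpose(2, 0, 1).astype(np.float32) / 255.0

cells = [
    ["wall", "endpoint", "wall", "wall"],
    ["wall", "path", "path", "wall"],
    ["wall", "wall", "path", "wall"],
    ["wall", "wall", "endpoint", "wall"],
]
maze = render_maze(cells)
x0 = maze[None]  # add a batch dimension: (1, 3, 64, 64)
```

The resulting `x0` has the shape and value range given for the input tensor in the Usage section.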
Research Applications
This model is designed for interpretability research, particularly:
- Mechanistic Interpretability: Understanding how the model represents maze structure internally
- World Models: Studying emergent spatial reasoning without language
- Diffusion Interpretability: Analyzing intermediate denoising steps
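One simple way to start on the last item is to track how much of the image looks like solution path at each intermediate step. The sketch below assumes you have already collected the intermediate states of a sampling run (for example by recording `x` after every step of your own sampling loop). `solution_fraction`, `track_solution`, and the 0.5 threshold are hypothetical choices for illustration, not part of the TACIT package.

```python
import numpy as np

def solution_fraction(image, threshold=0.5):
    """Fraction of pixels that look red (solution path) in a (3, H, W) image in [0, 1]."""
    r, g, b = image
    red_mask = (r > threshold) & (g < threshold) & (b < threshold)
    return float(red_mask.mean())

def track_solution(states):
    """Apply solution_fraction to each intermediate state of a sampling run."""
    return [solution_fraction(state) for state in states]

# Toy trajectory: a white image whose top half turns red over five steps.
white = np.ones((3, 64, 64))
red_top = white.copy()
red_top[1:, :32, :] = 0.0  # zero the green and blue channels in the top half
states = [(1 - a) * white + a * red_top for a in np.linspace(0.0, 1.0, 5)]
fractions = track_solution(states)
```

Plotting `fractions` against the step index for real sampling runs would show whether the solution path appears gradually or in a few abrupt steps.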
Repository
Full source code, training scripts, and additional checkpoints are available in the GitHub repository.
Citation
@software{tacit2024,
title={TACIT: Transformation-Aware Capturing of Implicit Thought},
author={Daniel},
year={2024},
url={https://huggingface.co/tylerxdurden/tacit}
}
License
Apache License 2.0
