TACIT - Transformation-Aware Capturing of Implicit Thought

TACIT is a diffusion-based transformer model that learns to solve mazes. The model demonstrates emergent reasoning capabilities without explicit language supervision, making it an ideal subject for interpretability research on world models.

Model Description

TACIT learns to transform images of unsolved mazes into solved mazes using a flow-matching diffusion process. The model develops internal representations of maze structure and pathfinding without being explicitly programmed to do so.

Architecture

| Component          | Specification             |
|--------------------|---------------------------|
| Type               | Diffusion Transformer (DiT) |
| Hidden Dimension   | 384                       |
| Transformer Blocks | 8                         |
| Attention Heads    | 6                         |
| Patch Size         | 8×8                       |
| Input/Output       | 64×64 RGB images          |
| Parameters         | ~20M                      |
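
The table's hyperparameters can be sanity-checked with some quick arithmetic. The sketch below assumes a standard pre-norm transformer block (self-attention plus a 4× MLP); the exact total depends on DiT-specific components (AdaLN modulation, timestep embedding) not counted here:

```python
# Rough parameter estimate for the configuration above.
# Assumes a standard transformer block: self-attention + 4x-expansion MLP.
# DiT-specific parts (AdaLN, timestep/position embeddings) are not counted.
hidden = 384
blocks = 8
patch = 8
image = 64
channels = 3

tokens = (image // patch) ** 2          # 8x8 grid of patches
attn_params = 4 * hidden * hidden       # q, k, v, and output projections
mlp_params = 2 * hidden * (4 * hidden)  # two linear layers, 4x expansion
per_block = attn_params + mlp_params
core = blocks * per_block

patch_embed = (patch * patch * channels) * hidden  # patchify projection

print(tokens)       # 64
print(core / 1e6)   # 14.155776 -- trunk only; embeddings, AdaLN, etc.
                    # make up the rest of the ~20M total
```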

Training

  • Dataset: 1,000,000 maze problem-solution pairs
  • Epochs: 100
  • Final Loss: 6.25e-06 (MSE)
  • Final L2 Distance: 0.0014

The model learns to predict the velocity field in a flow-matching formulation, transforming unsolved mazes into their solutions through iterative refinement.
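
The linear flow-matching setup can be illustrated in a few lines: the network is trained to predict the constant velocity `x1 - x0` along the straight path between the unsolved and solved images. This is a minimal numpy sketch of that formulation, not the actual TACIT training code; all names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins for image batches of shape (batch, 3, 64, 64):
x0 = rng.random((2, 3, 64, 64))  # unsolved maze
x1 = rng.random((2, 3, 64, 64))  # solved maze

# Linear interpolation path: x_t = (1 - t) * x0 + t * x1
t = rng.random((2, 1, 1, 1))     # one timestep per batch element
x_t = (1.0 - t) * x0 + t * x1

# The flow-matching regression target is the constant velocity of the path:
v_target = x1 - x0

# Training would minimize MSE between the predicted and target velocity:
# loss = mean((model(x_t, t) - v_target) ** 2)
print(x_t.shape, v_target.shape)  # (2, 3, 64, 64) (2, 3, 64, 64)
```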

Usage

import torch
from safetensors.torch import load_file

# Model architecture (copy from tacit/models/dit.py or install tacit package)
from tacit import TACITModel, sample_euler_method

# Load model
model = TACITModel()
state_dict = load_file('tacit_epoch_100.safetensors')

# Handle checkpoints saved from a torch.compile()-wrapped model,
# whose keys all carry an '_orig_mod.' prefix
if any(k.startswith('_orig_mod.') for k in state_dict):
    state_dict = {k.removeprefix('_orig_mod.'): v for k, v in state_dict.items()}

model.load_state_dict(state_dict)
model.eval()

# Inference
# x0: input maze tensor (batch, 3, 64, 64), values in [0, 1]
with torch.no_grad():
    solution = sample_euler_method(model, x0, num_steps=10)
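
For reference, a fixed-step Euler sampler under this flow-matching formulation plausibly looks like the sketch below. This is an illustrative reimplementation, not the packaged `sample_euler_method`, which may differ in signature and details:

```python
import torch

@torch.no_grad()
def euler_sample(model, x0, num_steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler.

    Illustrative sketch; the packaged sample_euler_method may differ.
    """
    x = x0.clone()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        # Broadcast the current time to the batch dimension
        t = torch.full((x.shape[0],), i * dt, device=x.device)
        v = model(x, t)            # predicted velocity field
        x = x + dt * v             # Euler step toward the solved image
    return x.clamp(0.0, 1.0)       # outputs documented to lie in [0, 1]
```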

Inference Configuration

| Parameter         | Value  |
|-------------------|--------|
| Sampling Method   | Euler  |
| Recommended Steps | 10     |
| Output Range      | [0, 1] |

Maze Format

  • Resolution: 64×64 pixels, RGB
  • Color Scheme:
    • White (255, 255, 255): Paths
    • Black (0, 0, 0): Walls
    • Green (0, 255, 0): Entry/Exit points
    • Red (255, 0, 0): Solution path (in solved mazes)
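
Given this color scheme, the solution path can be recovered from a model output by thresholding for red pixels. A sketch assuming a `(3, H, W)` float array with values in [0, 1]; the threshold value is an illustrative tolerance:

```python
import numpy as np

def solution_mask(img, thresh=0.5):
    """Boolean mask of solution-path pixels in a (3, H, W) image in [0, 1].

    A pixel counts as 'red' when its red channel is high and its green
    and blue channels are low.
    """
    r, g, b = img[0], img[1], img[2]
    return (r > thresh) & (g < thresh) & (b < thresh)

# Example: a tiny synthetic image with a single red solution pixel
img = np.ones((3, 4, 4))          # all white (paths)
img[:, 1, 2] = [1.0, 0.0, 0.0]    # one red pixel at row 1, col 2
mask = solution_mask(img)
print(mask.sum())  # 1
```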

Research Applications

This model is designed for interpretability research, particularly:

  1. Mechanistic Interpretability: Understanding how the model represents maze structure internally
  2. World Models: Studying emergent spatial reasoning without language
  3. Diffusion Interpretability: Analyzing intermediate denoising steps

Repository

Full source code, training scripts, and additional checkpoints are available in the project's GitHub repository.

Citation

@software{tacit2024,
  title={TACIT: Transformation-Aware Capturing of Implicit Thought},
  author={Daniel},
  year={2024},
  url={https://huggingface.co/tylerxdurden/tacit}
}

License

Apache License 2.0
