---
language: en
license: apache-2.0
tags:
- world-model
- rssm
- tutoring
- predictive-model
- pytorch
- kat
library_name: pytorch
pipeline_tag: reinforcement-learning
model-index:
- name: kat-2-RSSM
  results:
  - task:
      type: world-modeling
      name: Tutoring State Prediction
    metrics:
    - name: Eval Loss (best)
      type: loss
      value: 0.3124
    - name: Reconstruction Loss
      type: loss
      value: 0.1389
    - name: KL Divergence
      type: loss
      value: 0.0104
    - name: Reward Loss
      type: loss
      value: 0.082
    - name: Done Loss
      type: loss
      value: 0.064
---

# KAT-2-RSSM

A **Recurrent State-Space Model** trained for tutoring state prediction, part of the **KAT** system by [Progga AI](https://progga.ai).

## Model Description

This is a complete world model for predicting tutoring session dynamics — student state transitions, reward signals, and session termination. It uses a DreamerV3-inspired RSSM architecture with VL-JEPA-style EMA target encoding.

### Architecture

```
TutoringRSSM (2,802,838 params)
├── ObservationEncoder: obs_dim(20) → encoder_hidden(256) → latent_dim(128)
├── ActionEmbedding: action_dim(8) → embed_dim(32)
├── DeterministicTransition: GRU(hidden_dim=512)
├── StochasticLatent: Diagonal Gaussian prior/posterior (latent_dim=128)
├── ObservationDecoder: feature_dim(640) → decoder_hidden(256) → obs_dim(20)
├── RewardPredictor: feature_dim(640) → 1
├── DonePredictor: feature_dim(640) → 1
└── EMATargetEncoder: momentum=0.996 (VL-JEPA heritage)
```

**Feature dimension**: `hidden_dim + latent_dim = 512 + 128 = 640`

### Observation Space (20-dim)

The 20-dimensional observation vector encodes tutoring session state:

| Dims | Signal |
|------|--------|
| 0-3 | Mastery estimates (per-topic confidence) |
| 4-7 | Engagement signals (attention, participation) |
| 8-11 | Response quality (accuracy, depth, speed) |
| 12-15 | Emotional state (frustration, confidence, curiosity) |
| 16-19 | Session context (time, hint level, attempt count) |

### Action Space (8 discrete actions)

| Index | Strategy |
|-------|----------|
| 0 | SOCRATIC — Guided questioning |
| 1 | SCAFFOLDED — Structured support |
| 2 | DIRECT — Direct instruction |
| 3 | EXPLORATORY — Open exploration |
| 4 | REMEDIAL — Error correction |
| 5 | ASSESSMENT — Knowledge check |
| 6 | MOTIVATIONAL — Encouragement |
| 7 | METACOGNITIVE — Reflection |

## Training Details

- **Data**: 100,901 synthetic tutoring trajectories (95,856 train / 5,045 eval)
- **Epochs**: 100 (best at epoch 93)
- **Hardware**: NVIDIA A100-SXM4-40GB
- **Optimizer**: Adam (lr=3e-4)
- **Training time**: ~45 minutes
- **Framework**: PyTorch 2.x

### Training Metrics (Best Checkpoint — Epoch 93)

| Metric | Value |
|--------|-------|
| **Total Loss** | 0.3124 |
| Reconstruction Loss | 0.1389 |
| KL Divergence | 0.0104 |
| Reward Loss | 0.0820 |
| Done Loss | 0.0640 |
| Rollout Loss | 0.3294 |

### Training Curve

Training converged smoothly over 100 epochs with consistent eval loss improvement. No catastrophic forgetting or training instability was observed.

## Files

| File | Description | Size |
|------|-------------|------|
| `tutoring_rssm_best.pt` | Best checkpoint (epoch 93, eval loss 0.3124) | 11 MB |
| `tutoring_rssm_final.pt` | Final checkpoint (epoch 100) | 11 MB |
| `tutoring_rssm_epoch{N}.pt` | Snapshots every 10 epochs | 11 MB each |
| `v1-backup/` | RSSM v1 checkpoints (smaller model) | ~800 KB each |
| `training_log.txt` | Full training log | ~8 KB |
| `config.json` | Model configuration | <1 KB |
| `architecture.py` | Standalone model definition | ~20 KB |

## Usage

```python
import torch

from architecture import TutoringRSSM, TutoringWorldModelConfig

# Load model
config = TutoringWorldModelConfig(
    obs_dim=20,
    action_dim=8,
    latent_dim=128,
    hidden_dim=512,
    encoder_hidden=256,
    decoder_hidden=256,
)
model = TutoringRSSM(config).cuda()
ckpt = torch.load("tutoring_rssm_best.pt", map_location="cuda")
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

# Initialize state
h, z = model.initial_state(batch_size=1)

# Observe a tutoring step
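# Illustrative aside (not part of the original card): the decoder and the
# reward/done predictors operate on the concatenated model feature,
# hidden_dim + latent_dim = 512 + 128 = 640 dims, as noted above.
feat = torch.cat([h, z], dim=-1)  # shape: (1, 640)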
obs = torch.randn(1, 20).cuda()    # Student observation
action = torch.tensor([0]).cuda()  # SOCRATIC strategy

result = model.observe_step(h, z, action, obs)
h_new, z_new = result["h"], result["z"]
pred_obs = result["pred_obs"]        # Predicted next observation
pred_reward = result["pred_reward"]  # Predicted reward
pred_done = result["pred_done"]      # Predicted session end

# Imagination (planning without observation)
imagined = model.imagine_step(h_new, z_new, torch.tensor([3]).cuda())
# Returns predicted state without requiring real observation
```

## Evaluation Results (94/94 tests pass)

| Component | Tests | Status |
|-----------|-------|--------|
| Predictive Student Model | 44/44 | ALL PASS |
| Cognition World Model Eval | 2/2 | ALL ACCEPTANCE MET |
| Core PyTorch RSSM | 10/10 | ALL PASS |
| Physics/Causality Micro-Modules | 23/23 | ALL PASS |
| Trained Checkpoint Inference | 7/7 | ALL PASS |
| Advanced Planners (MCTS/Beam) | 8/8 | ALL PASS |

### Acceptance Criteria

- **Prediction accuracy**: 12.08% error at horizon (target <20%) ✓
- **Planning improvement**: +14.5% vs reactive baseline (target >+10%) ✓

## Heritage

This model inherits from the **Abigail3 cognitive architecture**, specifically:

- RSSM design from `abigail/core/world_model.py`
- VL-JEPA EMA target encoding from Meta AI's Joint-Embedding Predictive Architecture
- DreamerV3-inspired training with KL balancing and rollout losses
- Governance-first design: generation separated from governance

## Ecosystem

This world model is part of the broader KAT system:

- **23 physics/causality micro-modules** (67M params total) — intuitive physics simulation
- **MCTS Planner** — Monte Carlo Tree Search for action planning
- **Beam Search Planner** — Anytime approximate planning
- **Causal World Model** — Structural causal model with do-calculus
- **Predictive Student Model** — VL-JEPA/RSSM adapted for tutoring personalization

## License

Apache 2.0

## Author

**Preston Mills** — Progga AI

- Built for the KAT-2 framework
- Designed by Progga AI
- February 2026
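
## Appendix: EMA Target Update Sketch

The architecture lists an `EMATargetEncoder` with momentum 0.996, in the VL-JEPA style. As a reference for readers, here is a minimal sketch of the standard exponential-moving-average update that this style of target encoder uses. The helper name `ema_update` and the toy `nn.Linear` encoders are illustrative assumptions, not part of `architecture.py`:

```python
import torch
import torch.nn as nn


@torch.no_grad()
def ema_update(online: nn.Module, target: nn.Module, momentum: float = 0.996) -> None:
    """Move each target parameter toward its online counterpart:
    target <- momentum * target + (1 - momentum) * online."""
    for p_online, p_target in zip(online.parameters(), target.parameters()):
        p_target.mul_(momentum).add_(p_online, alpha=1.0 - momentum)


# Toy usage: the target encoder lags behind the online encoder.
online_enc = nn.Linear(20, 128)
target_enc = nn.Linear(20, 128)
target_enc.load_state_dict(online_enc.state_dict())  # start in sync

# ... an optimizer step would update online_enc here ...
ema_update(online_enc, target_enc)
```

With momentum 0.996 the target effectively averages over roughly the last `1 / (1 - 0.996) = 250` updates, which keeps the prediction targets slow-moving and stable during training.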