---
language: en
license: apache-2.0
tags:
- world-model
- rssm
- tutoring
- predictive-model
- pytorch
- kat
library_name: pytorch
pipeline_tag: reinforcement-learning
model-index:
- name: kat-2-RSSM
  results:
  - task:
      type: world-modeling
      name: Tutoring State Prediction
    metrics:
    - name: Eval Loss (best)
      type: loss
      value: 0.3124
    - name: Reconstruction Loss
      type: loss
      value: 0.1389
    - name: KL Divergence
      type: loss
      value: 0.0104
    - name: Reward Loss
      type: loss
      value: 0.082
    - name: Done Loss
      type: loss
      value: 0.064
---

# KAT-2-RSSM

A **Recurrent State-Space Model** trained for tutoring state prediction, part of the **KAT** system by [Progga AI](https://progga.ai).

## Model Description

This is a complete world model for predicting tutoring session dynamics — student state transitions, reward signals, and session termination. It uses a DreamerV3-inspired RSSM architecture with VL-JEPA-style EMA target encoding.

### Architecture

```
TutoringRSSM (2,802,838 params)
├── ObservationEncoder: obs_dim(20) → encoder_hidden(256) → latent_dim(128)
├── ActionEmbedding: action_dim(8) → embed_dim(32)
├── DeterministicTransition: GRU(hidden_dim=512)
├── StochasticLatent: Diagonal Gaussian prior/posterior (latent_dim=128)
├── ObservationDecoder: feature_dim(640) → decoder_hidden(256) → obs_dim(20)
├── RewardPredictor: feature_dim(640) → 1
├── DonePredictor: feature_dim(640) → 1
└── EMATargetEncoder: momentum=0.996 (VL-JEPA heritage)
```

**Feature dimension**: `hidden_dim + latent_dim = 512 + 128 = 640`

### Observation Space (20-dim)

The 20-dimensional observation vector encodes tutoring session state:

| Dims | Signal |
|------|--------|
| 0-3 | Mastery estimates (per-topic confidence) |
| 4-7 | Engagement signals (attention, participation) |
| 8-11 | Response quality (accuracy, depth, speed) |
| 12-15 | Emotional state (frustration, confidence, curiosity) |
| 16-19 | Session context (time, hint level, attempt count) |

### Action Space (8 discrete actions)

| Index | Strategy |
|-------|----------|
| 0 | SOCRATIC — Guided questioning |
| 1 | SCAFFOLDED — Structured support |
| 2 | DIRECT — Direct instruction |
| 3 | EXPLORATORY — Open exploration |
| 4 | REMEDIAL — Error correction |
| 5 | ASSESSMENT — Knowledge check |
| 6 | MOTIVATIONAL — Encouragement |
| 7 | METACOGNITIVE — Reflection |

## Training Details

- **Data**: 100,901 synthetic tutoring trajectories (95,856 train / 5,045 eval)
- **Epochs**: 100 (best at epoch 93)
- **Hardware**: NVIDIA A100-SXM4-40GB
- **Optimizer**: Adam (lr=3e-4)
- **Training time**: ~45 minutes
- **Framework**: PyTorch 2.x

### Training Metrics (Best Checkpoint — Epoch 93)

| Metric | Value |
|--------|-------|
| **Total Loss** | 0.3124 |
| Reconstruction Loss | 0.1389 |
| KL Divergence | 0.0104 |
| Reward Loss | 0.0820 |
| Done Loss | 0.0640 |
| Rollout Loss | 0.3294 |

### Training Curve

Training converged smoothly over 100 epochs with consistent eval loss improvement. No catastrophic forgetting or training instability was observed.

## Files

| File | Description | Size |
|------|-------------|------|
| `tutoring_rssm_best.pt` | Best checkpoint (epoch 93, eval loss 0.3124) | 11 MB |
| `tutoring_rssm_final.pt` | Final checkpoint (epoch 100) | 11 MB |
| `tutoring_rssm_epoch{N}.pt` | Snapshots every 10 epochs | 11 MB each |
| `v1-backup/` | RSSM v1 checkpoints (smaller model) | ~800 KB each |
| `training_log.txt` | Full training log | ~8 KB |
| `config.json` | Model configuration | <1 KB |
| `architecture.py` | Standalone model definition | ~20 KB |

## Usage

```python
import torch

from architecture import TutoringRSSM, TutoringWorldModelConfig

# Load model
config = TutoringWorldModelConfig(
    obs_dim=20,
    action_dim=8,
    latent_dim=128,
    hidden_dim=512,
    encoder_hidden=256,
    decoder_hidden=256,
)
model = TutoringRSSM(config).cuda()
ckpt = torch.load("tutoring_rssm_best.pt", map_location="cuda")
model.load_state_dict(ckpt["model_state_dict"])
model.eval()

# Initialize state
h, z = model.initial_state(batch_size=1)

# Observe a tutoring step
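# Illustrative aside (not part of the original card): the decoder and the
# reward/done predictors operate on the concatenated model feature,
# hidden_dim + latent_dim = 512 + 128 = 640 dims, as noted above.
feat = torch.cat([h, z], dim=-1)  # shape: (1, 640)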
obs = torch.randn(1, 20).cuda()    # Student observation
action = torch.tensor([0]).cuda()  # SOCRATIC strategy

result = model.observe_step(h, z, action, obs)
h_new, z_new = result["h"], result["z"]
pred_obs = result["pred_obs"]        # Predicted next observation
pred_reward = result["pred_reward"]  # Predicted reward
pred_done = result["pred_done"]      # Predicted session end

# Imagination (planning without observation)
imagined = model.imagine_step(h_new, z_new, torch.tensor([3]).cuda())
# Returns predicted state without requiring real observation
```

## Evaluation Results (94/94 tests pass)

| Component | Tests | Status |
|-----------|-------|--------|
| Predictive Student Model | 44/44 | ALL PASS |
| Cognition World Model Eval | 2/2 | ALL ACCEPTANCE MET |
| Core PyTorch RSSM | 10/10 | ALL PASS |
| Physics/Causality Micro-Modules | 23/23 | ALL PASS |
| Trained Checkpoint Inference | 7/7 | ALL PASS |
| Advanced Planners (MCTS/Beam) | 8/8 | ALL PASS |

### Acceptance Criteria

- **Prediction accuracy**: 12.08% error at horizon (target <20%) ✓
- **Planning improvement**: +14.5% vs reactive baseline (target >+10%) ✓

## Heritage

This model inherits from the **Abigail3 cognitive architecture**, specifically:

- RSSM design from `abigail/core/world_model.py`
- VL-JEPA EMA target encoding from Meta AI's Joint-Embedding Predictive Architecture
- DreamerV3-inspired training with KL balancing and rollout losses
- Governance-first design: generation separated from governance

## Ecosystem

This world model is part of the broader KAT system:

- **23 physics/causality micro-modules** (67M params total) — intuitive physics simulation
- **MCTS Planner** — Monte Carlo Tree Search for action planning
- **Beam Search Planner** — Anytime approximate planning
- **Causal World Model** — Structural causal model with do-calculus
- **Predictive Student Model** — VL-JEPA/RSSM adapted for tutoring personalization

## License

Apache 2.0

## Author

**Preston Mills** — Progga AI

- Built for the KAT-2 framework
- Designed by Progga AI
- February 2026
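
## Appendix: EMA Target Update Sketch

The architecture lists an `EMATargetEncoder` with momentum 0.996, in the VL-JEPA style. As a reference for readers, here is a minimal sketch of the standard exponential-moving-average update that this style of target encoder uses. The helper name `ema_update` and the toy `nn.Linear` encoders are illustrative assumptions, not part of `architecture.py`:

```python
import torch
import torch.nn as nn


@torch.no_grad()
def ema_update(online: nn.Module, target: nn.Module, momentum: float = 0.996) -> None:
    """Move each target parameter toward its online counterpart:
    target <- momentum * target + (1 - momentum) * online."""
    for p_online, p_target in zip(online.parameters(), target.parameters()):
        p_target.mul_(momentum).add_(p_online, alpha=1.0 - momentum)


# Toy usage: the target encoder lags behind the online encoder.
online_enc = nn.Linear(20, 128)
target_enc = nn.Linear(20, 128)
target_enc.load_state_dict(online_enc.state_dict())  # start in sync

# ... an optimizer step would update online_enc here ...
ema_update(online_enc, target_enc)
```

With momentum 0.996 the target effectively averages over roughly the last `1 / (1 - 0.996) = 250` updates, which keeps the prediction targets slow-moving and stable during training.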