Model card updated after epoch 0
README.md ADDED
---
base_model: t5-small
tags: [trm, act, recursive, text-generation, wikitext]
metrics: [loss, lm_loss, ponder_loss, perplexity_lm]
---

# TRM-Text1 (ACT)

**TRM-Text1 (ACT)** is a causal language model based on a **Tiny Recursive Reasoning Model (TRM)** with **Adaptive Computation Time (ACT)** for per-token variable depth.

- **Architecture:** TRM (causal) + ACT halting (sketched below)
- **Training Data:** wikitext-103-raw-v1
- **Tokenizer:** t5-small (SentencePiece)
- **Vocab Size:** 32100
- **Objective:** Causal Language Modeling (next-token)
- **Seq Len:** 1024

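Since the card only names "ACT halting", the following is a generic per-token halting loop in the style of Graves (2016), included to show where the ponder loss reported below comes from. It is a minimal sketch under assumptions, not this repository's implementation: the class name `ACTRecurrentBlock`, the stand-in `step_fn`, and the `max_steps` / `eps` hyperparameters are all illustrative.

```python
import torch
import torch.nn as nn

class ACTRecurrentBlock(nn.Module):
    """Illustrative per-token ACT halting (Graves, 2016); not this repo's code."""

    def __init__(self, d_model: int, max_steps: int = 8, eps: float = 0.01):
        super().__init__()
        # Stand-in for the recursive TRM block that gets applied repeatedly.
        self.step_fn = nn.Sequential(
            nn.Linear(d_model, d_model), nn.GELU(), nn.Linear(d_model, d_model)
        )
        self.halt = nn.Linear(d_model, 1)  # per-token halting unit
        self.max_steps, self.eps = max_steps, eps

    def forward(self, h: torch.Tensor):
        # h: (batch, seq, d_model); every token halts independently.
        batch, seq, _ = h.shape
        halt_sum = h.new_zeros(batch, seq)    # cumulative halting probability
        remainders = h.new_zeros(batch, seq)  # probability mass left at halt time
        n_steps = h.new_zeros(batch, seq)     # recursive steps each token took
        out = torch.zeros_like(h)             # halting-weighted average of states
        active = torch.ones(batch, seq, dtype=torch.bool, device=h.device)

        for step in range(self.max_steps):
            h = self.step_fn(h)
            p = torch.sigmoid(self.halt(h)).squeeze(-1)
            # A token halts once its cumulative halting probability would
            # cross 1 - eps, or when the step budget runs out.
            if step == self.max_steps - 1:
                halting = active.clone()
            else:
                halting = active & (halt_sum + p > 1 - self.eps)
            remainder = 1.0 - halt_sum
            w = torch.where(halting, remainder, p) * active
            out = out + w.unsqueeze(-1) * h
            remainders = torch.where(halting, remainder, remainders)
            halt_sum = halt_sum + p * active
            n_steps = n_steps + active
            active = active & ~halting
            if not active.any():
                break

        # Ponder cost: steps taken plus the final remainder, averaged over
        # tokens; added to the LM loss with a small weight during training.
        ponder_cost = (n_steps + remainders).mean()
        return out, ponder_cost
```

In a causal LM, `out` would feed the output head and `ponder_cost` would be added to the cross-entropy with a small coefficient, which presumably corresponds to the ponder loss metric reported below.
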
Note: this model uses the T5 SentencePiece tokenizer, so the WikiText-103 perplexities reported here are not directly comparable to perplexities computed over GPT-2 BPE tokens.

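To make the non-comparability concrete, here is a small check using the `transformers` library: the same text is segmented into a different number of tokens by the two tokenizers, so mean per-token loss (and hence perplexity) is averaged over different denominators. The example sentence is arbitrary.

```python
from transformers import AutoTokenizer

text = "The quick brown fox jumps over the lazy dog."

t5_tok = AutoTokenizer.from_pretrained("t5-small")  # SentencePiece, vocab 32100
gpt2_tok = AutoTokenizer.from_pretrained("gpt2")    # byte-level BPE, vocab 50257

# Different segmentations => different token counts for the same text,
# so per-token perplexities live on different scales.
print(len(t5_tok(text).input_ids))
print(len(gpt2_tok(text).input_ids))
```
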
### Latest Performance (Epoch 0)

- **Validation Loss:** 4.8829
- **Validation LM Loss:** 4.8728
- **Validation Ponder Loss:** 1.0091
- **Validation Perplexity (LM-only):** 130.69 (see the check below)
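
As a sanity check, the LM-only perplexity is just the exponential of the validation LM loss, assuming the loss is mean next-token cross-entropy in nats. The headline loss is also consistent with a ponder weight of roughly 0.01, but that coefficient is inferred from the reported numbers, not stated in the card.

```python
import math

val_lm_loss = 4.8728
val_ponder_loss = 1.0091

# Perplexity over T5-SentencePiece tokens (see the tokenizer note above).
print(math.exp(val_lm_loss))  # -> ~130.69, the reported value

# The headline loss matches lm_loss + 0.01 * ponder_loss; the 0.01 weight
# is an inference from the reported numbers, not documented in the card.
print(val_lm_loss + 0.01 * val_ponder_loss)  # -> ~4.8829
```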