+---
+base_model: t5-small
+tags: [hrm, act, wikitext]
+metrics: [loss, perplexity]
+---
+# HRM-Text1 (WikiText-103)
+This repository contains weights for an experimental HRM Causal LM trained on the [WikiText-103 dataset](https://huggingface.co/datasets/wikitext/viewer/wikitext-103-raw-v1/train).
+## Model Description
+- **Architecture:** Hierarchical Recurrent Memory (HRM)
+- **Training Data:** [wikitext/wikitext-103-raw-v1](https://huggingface.co/datasets/wikitext)
+- **Tokenizer:** `t5-small` (slow T5 SentencePiece)
+- **Vocab Size:** 32100
+- **Objective:** Causal Language Modeling
+### Latest Performance (Epoch 2)
+- **Validation Loss:** `6.7866`
+- **Validation Perplexity:** `885.93`
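
The two validation numbers above look like the same measurement in two forms. Perplexity is conventionally the exponential of the mean per-token cross-entropy loss. The sketch below checks that relationship, assuming the reported loss is a mean negative log-likelihood in nats (natural log), which the card does not state. The helper names `perplexity_from_loss` and `loss_from_perplexity` are illustrative and are not part of this repository.

```python
import math


def perplexity_from_loss(mean_nll_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_nll_nats)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse conversion: perplexity back to mean cross-entropy in nats."""
    return math.log(perplexity)


# Values copied from the "Latest Performance (Epoch 2)" section.
reported_loss = 6.7866
reported_ppl = 885.93

# Assumption: the loss is averaged per token and uses the natural log.
computed_ppl = perplexity_from_loss(reported_loss)

print(f"perplexity computed from loss: {computed_ppl:.2f}")
print(f"perplexity reported in card:   {reported_ppl:.2f}")
print(f"absolute difference:           {abs(computed_ppl - reported_ppl):.4f}")
```

Any small gap between the computed and reported perplexity is expected, because the loss is rounded to four decimal places before exponentiation.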