---
base_model: t5-small
tags: [hrm, act, wikitext]
metrics: [loss, perplexity]
---
# HRM-Text1 (WikiText-103)

This repository contains weights for an experimental HRM causal language model trained on the [WikiText-103 dataset](https://huggingface.co/datasets/wikitext/viewer/wikitext-103-raw-v1/train).
## Model Description

- **Architecture:** Hierarchical Recurrent Memory (HRM)
- **Training Data:** [wikitext/wikitext-103-raw-v1](https://huggingface.co/datasets/wikitext)
- **Tokenizer:** `t5-small` (slow T5 SentencePiece)
- **Vocab Size:** 32100
- **Objective:** Causal Language Modeling
### Latest Performance (Epoch 30)

- **Validation Loss:** `4.5848`
- **Validation Perplexity:** `97.98`
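The reported perplexity is simply the exponential of the mean validation cross-entropy loss. A minimal sketch verifying the two numbers above are consistent:

```python
import math

# Perplexity = exp(mean cross-entropy loss), using the reported validation loss.
val_loss = 4.5848
perplexity = math.exp(val_loss)
print(f"{perplexity:.2f}")  # ≈ 97.98, matching the reported value
```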