---
base_model: t5-small
tags: [hrm, act, wikitext]
metrics: [loss, perplexity]
---
# HRM-Text1 (WikiText-103)
This repository contains weights for an experimental Hierarchical Recurrent Memory (HRM) causal language model trained on the [WikiText-103 dataset](https://huggingface.co/datasets/wikitext/viewer/wikitext-103-raw-v1/train).
## Model Description
- **Architecture:** Hierarchical Recurrent Memory (HRM)
- **Training Data:** [wikitext/wikitext-103-raw-v1](https://huggingface.co/datasets/wikitext)
- **Tokenizer:** `t5-small` (slow T5 SentencePiece)
- **Vocab Size:** 32100
- **Objective:** Causal Language Modeling
### Latest Performance (Epoch 30)
- **Validation Loss**: `4.5848`
- **Validation Perplexity**: `97.98`
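Perplexity for a causal language model is the exponential of the mean cross-entropy loss, so the two reported figures can be cross-checked directly:

```python
import math

# Reported validation loss at epoch 30
val_loss = 4.5848

# perplexity = exp(cross-entropy loss)
perplexity = math.exp(val_loss)
print(f"{perplexity:.2f}")  # 97.98
```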