---
base_model: t5-small
tags: [hrm, act, wikitext]
metrics: [loss, perplexity]
---
# HRM-Text1 (WikiText-103)
This repository contains weights for an experimental Hierarchical Recurrent Memory (HRM) causal language model trained on the [WikiText-103 dataset](https://huggingface.co/datasets/wikitext/viewer/wikitext-103-raw-v1/train).
## Model Description
- **Architecture:** Hierarchical Recurrent Memory (HRM)
- **Training Data:** [wikitext/wikitext-103-raw-v1](https://huggingface.co/datasets/wikitext)
- **Tokenizer:** `t5-small` (slow T5 SentencePiece)
- **Vocab Size:** 32,100
- **Objective:** Causal Language Modeling
### Latest Performance (Epoch 25)
- **Validation Loss:** `4.6005`
- **Validation Perplexity:** `99.54`
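
The reported perplexity is simply the exponential of the validation cross-entropy loss. A minimal sketch of that relation (the loss value is taken from the table above; no model or dataset access is needed):

```python
import math

# Validation cross-entropy loss reported for epoch 25.
val_loss = 4.6005

# Perplexity for a causal LM is exp(mean per-token cross-entropy).
perplexity = math.exp(val_loss)

print(f"Perplexity: {perplexity:.2f}")  # ~99.53, matching the reported 99.54 up to rounding
```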