---
base_model: t5-small
tags:
- act
- wikitext
metrics:
- loss
- perplexity
---
# HRM-Text1 (WikiText-103)

This repository contains weights for an experimental model trained on the WikiText-103 dataset.
## Model Description

- **Architecture:** CMBA
- **Training Data:** `wikitext/wikitext-103-raw-v1`
- **Tokenizer:** `t5-small` (slow T5 SentencePiece)
- **Vocab Size:** 32100
- **Objective:** Causal Language Modeling
## Latest Performance (Epoch 0)

- **Validation Loss:** 29.7877
- **Validation Perplexity:** 8642211872768.00
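
For causal language modeling, perplexity is the exponential of the mean cross-entropy loss, and the reported figures are consistent with that relation. A quick sanity check:

```python
import math

# Perplexity for causal LM evaluation is exp(mean cross-entropy loss).
val_loss = 29.7877  # reported validation loss (epoch 0)
perplexity = math.exp(val_loss)

# Agrees with the reported 8642211872768.00 up to rounding of the loss.
print(f"{perplexity:.2f}")
```

Note that a loss near 29.8 (perplexity on the order of 10^12) indicates the model has not yet learned the data distribution; these are epoch-0 numbers.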