---
base_model: t5-small
tags: [hrm, act, wikitext]
metrics: [loss, perplexity]
---
# HRM-Text1 (WikiText-103)
This repository contains weights for an experimental Hierarchical Recurrent Memory (HRM) causal language model trained on the [WikiText-103 dataset](https://huggingface.co/datasets/wikitext/viewer/wikitext-103-raw-v1/train).
## Model Description
- **Architecture:** Hierarchical Recurrent Memory (HRM)
- **Training Data:** [wikitext/wikitext-103-raw-v1](https://huggingface.co/datasets/wikitext)
- **Tokenizer:** `t5-small` (slow T5 SentencePiece)
- **Vocab Size:** 32,100
- **Objective:** Causal Language Modeling
### Latest Performance (Epoch 25)
- **Validation Loss:** `4.6005`
- **Validation Perplexity:** `99.54`
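
The reported perplexity is simply the exponential of the validation cross-entropy loss. A minimal sketch of that relation (the loss value is taken from the table above; no model or dataset access is needed):

```python
import math

# Validation cross-entropy loss reported for epoch 25.
val_loss = 4.6005

# Perplexity for a causal LM is exp(mean per-token cross-entropy).
perplexity = math.exp(val_loss)

print(f"Perplexity: {perplexity:.2f}")  # ~99.53, matching the reported 99.54 up to rounding
```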