---
base_model: t5-small
tags: [hrm, act, wikitext]
metrics: [loss, perplexity]
---
# HRM-Text1 (WikiText-103)
This repository contains weights for an experimental Hierarchical Recurrent Memory (HRM) causal language model trained on the [WikiText-103 dataset](https://huggingface.co/datasets/wikitext/viewer/wikitext-103-raw-v1/train).
## Model Description
- **Architecture:** Hierarchical Recurrent Memory (HRM)
- **Training Data:** [wikitext/wikitext-103-raw-v1](https://huggingface.co/datasets/wikitext)
- **Tokenizer:** `t5-small` (slow T5 SentencePiece)
- **Vocab Size:** 32100
- **Objective:** Causal Language Modeling
### Latest Performance (Epoch 30)
- **Validation Loss**: `4.5848`
- **Validation Perplexity**: `97.98`
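Perplexity for a causal language model is the exponential of the mean cross-entropy loss, so the two reported figures can be cross-checked directly:

```python
import math

# Reported validation loss at epoch 30
val_loss = 4.5848

# perplexity = exp(cross-entropy loss)
perplexity = math.exp(val_loss)
print(f"{perplexity:.2f}")  # 97.98
```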