HRM-Text1-UltraChat / README.md
base_model: t5-small
tags:
  - hrm
  - act
  - dolly-15k
metrics:
  - loss
  - perplexity

HRM-Text1

HRM-Text1 is an experimental instruction-following text generation model based on the Hierarchical Recurrent Memory (HRM) architecture. It is trained on the databricks/databricks-dolly-15k dataset, which consists of instruction–response pairs across multiple task types.
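As an illustration of what an instruction–response pair looks like once flattened into training text, here is a minimal sketch. The field names `instruction`, `context`, and `response` are those of the dolly-15k records; the prompt template and the `format_example` helper are hypothetical and not taken from this model's training code.

```python
def format_example(record):
    """Flatten one dolly-15k style record into a single training string.

    The "Instruction/Context/Response" template is an assumption made for
    illustration; the template actually used for HRM-Text1 is not documented here.
    """
    instruction = record["instruction"].strip()
    context = record.get("context", "").strip()
    response = record["response"].strip()

    parts = [f"Instruction: {instruction}"]
    if context:  # many records have an empty context field
        parts.append(f"Context: {context}")
    parts.append(f"Response: {response}")
    return "\n".join(parts)


# Made-up sample record in the dataset's schema.
sample = {
    "instruction": "Name a primary color.",
    "context": "",
    "response": "Red.",
    "category": "open_qa",
}
print(format_example(sample))
```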

The model uses the HRM structure, which pairs a "Specialist" module for low-level processing with a "Manager" module for high-level abstraction and planning. By summarizing information at different temporal scales, the architecture aims to handle long-range dependencies more effectively.
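The two-level idea can be sketched as a toy recurrence in which the low-level state updates at every step and the high-level state updates only once per window, from a summary of the recent low-level states. This is an illustrative sketch only: the function names, the fixed scalar weights (stand-ins for learned parameters), the mean-pooled summary, and the `period` parameter are all assumptions, not the model's actual implementation.

```python
import math


def specialist_step(low, x, high):
    # Fast update: combines the previous low-level state, the current input,
    # and the slowly changing high-level context. Weights are placeholders.
    return [math.tanh(0.5 * l + 0.4 * xi + 0.1 * h) for l, xi, h in zip(low, x, high)]


def manager_step(high, summary):
    # Slow update: folds a summary of recent low-level states into the
    # high-level state. Weights are placeholders.
    return [math.tanh(0.8 * h + 0.2 * s) for h, s in zip(high, summary)]


def run_hierarchy(inputs, dim, period=4):
    """Run the toy two-timescale recurrence over a sequence of input vectors."""
    low = [0.0] * dim
    high = [0.0] * dim
    window = []           # low-level states since the last manager update
    manager_updates = 0

    for t, x in enumerate(inputs, start=1):
        low = specialist_step(low, x, high)
        window.append(low)
        if t % period == 0:
            # Summarize the window by averaging each dimension.
            summary = [sum(col) / len(window) for col in zip(*window)]
            high = manager_step(high, summary)
            window = []
            manager_updates += 1

    return low, high, manager_updates


inputs = [[1.0, -1.0] for _ in range(8)]
low, high, n_updates = run_hierarchy(inputs, dim=2, period=4)
print(n_updates)  # the manager updates once per 4 specialist steps
```

With 8 input steps and `period=4`, the high-level state is updated twice while the low-level state is updated eight times, which is the sense in which the two modules operate at different temporal scales.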

Model Description

Latest Performance (Epoch 19)

  • Validation Loss: 3.6484
  • Validation Perplexity: 38.41
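
The two numbers above are consistent with each other if the reported loss is the mean per-token cross-entropy in nats, which is an assumption here; in that case perplexity is simply the exponential of the loss. A quick check:

```python
import math

validation_loss = 3.6484  # reported value at epoch 19

# Perplexity = exp(mean cross-entropy), assuming the loss is in nats per token.
perplexity = math.exp(validation_loss)
print(f"{perplexity:.2f}")
```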