---
base_model: t5-small
tags: [hrm, act, dolly-15k]
metrics: [loss, perplexity]
---
# HRM-Text1

**HRM-Text1** is an experimental instruction-following text generation model based on the **Hierarchical Recurrent Memory (HRM)** architecture. It is trained on the `databricks/databricks-dolly-15k` dataset, which consists of instruction–response pairs across multiple task types.

The model uses the HRM structure, which consists of a "Specialist" module for low-level processing and a "Manager" module for high-level abstraction and planning. This architecture aims to handle long-range dependencies more effectively by summarizing information at different temporal scales.
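The two-timescale idea can be illustrated with a toy sketch. This is not the model's actual implementation: the scalar states, the update rules, the coefficients, and the `period` parameter are all hypothetical, chosen only to show a fast "Specialist" state that updates on every step and a slow "Manager" state that updates once every few steps from the Specialist's current state.

```python
import math


def specialist_step(low: float, high: float, x: float) -> float:
    """Toy low-level update: mixes the previous low state, the Manager's context, and the input."""
    return math.tanh(0.5 * low + 0.3 * high + x)


def manager_step(high: float, low_summary: float) -> float:
    """Toy high-level update: folds the Specialist's current state into the slow state."""
    return math.tanh(0.8 * high + 0.5 * low_summary)


def run_hierarchy(inputs, period: int = 4):
    """Run the Specialist on every step and the Manager once every `period` steps.

    Returns the final low state, the final high state, and the number of Manager updates.
    """
    low, high = 0.0, 0.0
    manager_updates = 0
    for t, x in enumerate(inputs, start=1):
        low = specialist_step(low, high, x)
        if t % period == 0:
            # The slow module only sees a summary (here, simply the latest low state).
            high = manager_step(high, low)
            manager_updates += 1
    return low, high, manager_updates


if __name__ == "__main__":
    low, high, updates = run_hierarchy([0.1] * 8, period=4)
    print(f"low={low:.4f} high={high:.4f} manager_updates={updates}")
```

In this sketch the Manager state changes only on steps 4 and 8 of an 8-step input, so it evolves at a coarser temporal scale than the Specialist state.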

## Model Description

- **Architecture:** Hierarchical Recurrent Memory (HRM)
- **Training Data:** [databricks/databricks-dolly-15k](https://hf.co/datasets/databricks/databricks-dolly-15k)
- **Original Paper:** [Hierarchical Reasoning Model](https://arxiv.org/abs/2506.21734)
- **Tokenizer:** `t5-small` (slow T5 SentencePiece)
- **Vocab Size:** 32100
- **Objective:** Causal Language Modeling

### Latest Performance (Epoch 20)
- **Validation Loss:** `3.6668`
- **Validation Perplexity:** `39.13`
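
The two numbers above are related: assuming the validation loss is the mean per-token cross-entropy in nats (an assumption, though it is the usual convention for causal language modeling), perplexity is its exponential. A minimal check, where `perplexity` is an illustrative helper rather than part of this repository:

```python
import math


def perplexity(mean_nll: float) -> float:
    """Convert a mean per-token negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll)


if __name__ == "__main__":
    val_loss = 3.6668  # reported validation loss at epoch 20
    print(f"{perplexity(val_loss):.2f}")  # prints 39.13
```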