Commit 2b9de3b by Viharikvs (verified) · 1 parent: ec259cd

Model card updated after epoch 0

Files changed (1)
  1. README.md +23 -0
README.md ADDED
@@ -0,0 +1,23 @@
+ ---
+ base_model: t5-small
+ tags: [hrm, act, dolly-15k]
+ metrics: [loss, perplexity]
+ ---
+ # HRM-Text1
+
+ **HRM-Text1** is an experimental instruction-following text generation model based on the **Hierarchical Recurrent Memory (HRM)** architecture. It is trained on the `databricks/databricks-dolly-15k` dataset, which consists of instruction–response pairs across multiple task types.
+
+ The HRM structure pairs a "Specialist" module for low-level processing with a "Manager" module for high-level abstraction and planning. By summarizing information at different temporal scales, the architecture aims to handle long-range dependencies more effectively.
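
The two-timescale idea described above can be illustrated with a toy sketch. This is not the model's actual code: the function name `two_timescale_pass`, the scalar states, the fixed `0.5` weights, and the `period` parameter are all assumptions made for illustration. It only shows a fast state that updates on every step and a slow state that summarizes it once per period.

```python
import math

def two_timescale_pass(inputs, period=4):
    """Toy scalar illustration (hypothetical, not the HRM-Text1 implementation).

    The fast state updates on every step; the slow state updates once
    every `period` steps by summarizing the current fast state.
    """
    fast, slow = 0.0, 0.0
    slow_updates = 0
    for step, x in enumerate(inputs, start=1):
        # Fast ("Specialist"-like) update: sees the input and the slow state.
        fast = math.tanh(0.5 * fast + 0.5 * x + 0.5 * slow)
        if step % period == 0:
            # Slow ("Manager"-like) update: runs at the coarser timescale.
            slow = math.tanh(0.5 * slow + 0.5 * fast)
            slow_updates += 1
    return fast, slow, slow_updates

fast, slow, slow_updates = two_timescale_pass([1.0] * 8, period=4)
print(slow_updates)  # the slow state was updated twice over 8 steps
```

In a real model the states would be vectors and the updates learned layers; the fixed weights here only keep the example self-contained.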
+
+ ## Model Description
+
+ - **Architecture:** Hierarchical Recurrent Memory (HRM)
+ - **Training Data:** [databricks/databricks-dolly-15k](https://hf.co/datasets/databricks/databricks-dolly-15k)
+ - **Original Paper:** [Hierarchical Reasoning Model](https://arxiv.org/abs/2506.21734)
+ - **Tokenizer:** `t5-small` (slow T5 SentencePiece)
+ - **Vocab Size:** 32100
+ - **Objective:** Causal Language Modeling
+
+ ### Latest Performance (Epoch 0)
+ - **Validation Loss:** `4.5187`
+ - **Validation Perplexity:** `91.72`
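
The two validation figures above are related: assuming the reported loss is the mean per-token cross-entropy in nats (the model card does not state this explicitly), perplexity is its exponential. A minimal sketch, with the helper name `perplexity` chosen here for illustration:

```python
import math

def perplexity(mean_loss: float) -> float:
    """Perplexity from a mean per-token cross-entropy loss.

    Assumes the loss is measured in nats (natural logarithm).
    """
    return math.exp(mean_loss)

# Reproduces the reported value from the reported validation loss.
print(f"{perplexity(4.5187):.2f}")  # 91.72
```

If the loss were averaged per sequence rather than per token, or computed in bits, this conversion would not apply as written.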