Commit 6fc70f4 by Pavantej · verified · 1 parent: e62425b

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +71 -4
README.md CHANGED
@@ -1,4 +1,71 @@
- torch
- transformers
- gradio==4.36.1
- numpy
+ ---
+ title: Titans Miras Demo
+ emoji: 🔬
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: 4.36.1
+ app_file: app.py
+ pinned: false
+ ---
+
+ # Titans + MIRAS: A Brain That Changes Itself While Thinking
+
+ A minimal but faithful reimplementation of **Titans** (test-time learning) and **MIRAS** (associative memory framework) using open-source models on Hugging Face.
+
+ ## What is this?
+
+ This demo showcases a neural architecture that can **learn and update its memory while generating responses**: in effect, a brain that changes itself while thinking.
+
+ ### Key Features
+
+ - 🔄 **Test-time learning**: Memory updates during inference, not just during training
+ - 🎯 **Retention gate**: Surprising or novel inputs are remembered more strongly (inspired by human memory)
+ - 💾 **Persistent memory**: State is saved across sessions
+ - 🤖 **Fully OSS**: Uses distilgpt2 and runs entirely on Hugging Face
+
+ ## Architecture
+
+ ```
+ User Input
+
+ [Base LM: distilgpt2] → Hidden States (768-dim)
+
+ [Key/Value Projections] → Memory Space (256-dim)
+
+ [MIRAS Memory Module] ← Test-time Gradient Updates
+
+ [Text Generation] → Response + Memory Stats
+ ```
+
+ ### Components
+
+ 1. **Base Language Model**: distilgpt2 (frozen, no training)
+ 2. **Projection Layers**: Map hidden states to memory space
+ 3. **MIRAS Memory**: Associative memory with learnable key→value mapping
+ 4. **Retention Gate**: Adjusts learning rate based on surprise (loss magnitude)
+ 5. **Memory Store**: Persists memory state to disk
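
The Memory Store component can be pictured with a small sketch. This is not the code in `app.py`: the function names `save_memory` and `load_memory`, the file name `memory.json`, and the JSON layout are all assumptions made for illustration, and the weights are plain nested lists rather than tensors.

```python
import json
import os
import tempfile


def save_memory(weights, path):
    """Write the memory weights to `path` (hypothetical helper).

    The data goes to a temporary file first and is then renamed, so an
    interrupted save does not leave a half-written state file behind.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"weights": weights}, f)
    os.replace(tmp_path, path)


def load_memory(path, rows, cols):
    """Load saved weights, or start from zeros if nothing was saved yet."""
    if not os.path.exists(path):
        return [[0.0] * cols for _ in range(rows)]
    with open(path) as f:
        return json.load(f)["weights"]
```

Calling `load_memory("memory.json", rows, cols)` at startup and `save_memory(weights, "memory.json")` after each update would give the "saved across sessions" behavior described above, under these assumptions.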
+
+ ## How It Works
+
+ 1. Input text is processed through distilgpt2
+ 2. The last hidden state is projected to key/value pairs
+ 3. The memory predicts the value from the key
+ 4. The loss (prediction error) indicates surprise
+ 5. Higher surprise → higher retention → faster learning
+ 6. The memory is updated via gradient descent (base learning rate 1e-3)
+ 7. The response is generated and the memory is saved
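
Steps 3 to 6 can be sketched in plain Python. This is a minimal illustration, not the demo's implementation: it assumes the memory is a single linear map `W`, that the loss is mean squared error, and that the retention gate is `1 + tanh(loss)`. The gate formula and the names `surprise_gate` and `memory_step` are hypothetical; only the 1e-3 base learning rate comes from the steps above.

```python
import math

BASE_LR = 1e-3  # base learning rate quoted in the steps above


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def surprise_gate(loss):
    """Hypothetical retention gate: multiplier in [1, 2) that grows with loss."""
    return 1.0 + math.tanh(loss)


def memory_step(W, key, value, base_lr=BASE_LR):
    """One test-time update of a linear memory W; returns the pre-update loss."""
    pred = matvec(W, key)                       # step 3: predict value from key
    err = [p - v for p, v in zip(pred, value)]
    loss = sum(e * e for e in err) / len(err)   # step 4: MSE as surprise
    lr = base_lr * surprise_gate(loss)          # step 5: more surprise, larger step
    scale = 2.0 / len(err)                      # d(MSE)/d(pred_i) = scale * err_i
    for i, e in enumerate(err):                 # step 6: gradient descent on W
        for j, k in enumerate(key):
            W[i][j] -= lr * scale * e * k
    return loss
```

Repeating `memory_step` on the same key/value pair lowers the loss a little each time, so an input the memory has already seen becomes less surprising and triggers smaller updates.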
+
+ ## References
+
+ - **Titans**: [Titans: Learning to Memorize at Test Time](https://arxiv.org/abs/2501.00663)
+ - **MIRAS**: [It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization](https://arxiv.org/abs/2504.13173)
+
+ ## Running Locally
+
+ ```bash
+ pip install -r requirements.txt
+ python app.py
+ ```
+
+ Built with ❤️ exploring the future of adaptive AI systems.