Pavantej committed on
Commit ecf2c47 · verified · 1 Parent(s): 32a6409

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +4 -72
README.md CHANGED
@@ -1,72 +1,4 @@
- ---
- title: Titans Miras Demo
- emoji: 🧠
- colorFrom: blue
- colorTo: purple
- sdk: gradio
- sdk_version: 4.44.0
- app_file: app.py
- pinned: false
- ---
-
- # 🧠 Titans + MIRAS: A Brain That Changes Itself While Thinking
-
- A minimal but faithful reimplementation of **Titans** (test-time learning) and **MIRAS** (associative memory framework) using open-source models on Hugging Face.
-
- ## What is this?
-
- This demo showcases a neural architecture that can **learn and update its memory while generating responses** - a brain that literally changes itself while thinking!
-
- ### Key Features
-
- - 🔄 **Test-time learning**: Memory updates during inference (not just training)
- - 🎯 **Retention gate**: Surprising/novel inputs are more memorable (inspired by human memory)
- - 💾 **Persistent memory**: State is saved across sessions
- - 🤖 **Fully OSS**: Uses distilgpt2 and runs entirely on Hugging Face
-
- ## Architecture
-
- ```
- User Input
-
- [Base LM: distilgpt2] → Hidden States (768-dim)
-
- [Key/Value Projections] → Memory Space (256-dim)
-
- [MIRAS Memory Module] ← Test-time Gradient Updates
-
- [Text Generation] → Response + Memory Stats
- ```
-
- ### Components
-
- 1. **Base Language Model**: distilgpt2 (frozen, no training)
- 2. **Projection Layers**: Map hidden states to memory space
- 3. **MIRAS Memory**: Associative memory with learnable key→value mapping
- 4. **Retention Gate**: Adjusts learning rate based on surprise (loss magnitude)
- 5. **Memory Store**: Persists memory state to disk
-
- ## How It Works
-
- 1. Input text is processed through distilgpt2
- 2. Last hidden state is projected to key/value pairs
- 3. Memory predicts value from key
- 4. Loss (prediction error) indicates surprise
- 5. Higher surprise → higher retention → faster learning
- 6. Memory updated via gradient descent (1e-3 base LR)
- 7. Response generated and memory saved
-
- ## References
-
- - **Titans**: [Learning to Memorize at Test Time](https://arxiv.org/abs/2501.00663)
- - **MIRAS**: [Framework for Associative Memory with Attentional Bias](https://arxiv.org/abs/2504.13173)
-
- ## Running Locally
-
- ```bash
- pip install -r requirements.txt
- python app.py
- ```
-
- Built with ❤️ exploring the future of adaptive AI systems.
-
+ torch
+ transformers
+ gradio==3.50.2
+ numpy
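
Note on the removed "How It Works" steps: steps 3 to 6 of the old README (predict value from key, treat the loss as surprise, scale the learning rate, take a gradient step) can be sketched roughly as follows. This is an illustrative sketch only, not the code in `app.py`. It assumes the memory is a single linear map `W`, and the function names and the `1 + tanh(loss)` gate are hypothetical choices made for this example. Only the 1e-3 base learning rate comes from the README text.

```python
import math


def predict(W, key):
    """Memory read: value estimate W @ key (W is a list of rows)."""
    return [sum(w * k for w, k in zip(row, key)) for row in W]


def surprise_gated_update(W, key, value, base_lr=1e-3):
    """One test-time update of a linear associative memory.

    Assumptions (not taken from app.py): the memory is a plain matrix W,
    surprise is the mean squared prediction error, and the retention gate
    is 1 + tanh(loss), so larger surprise gives a larger step.
    """
    pred = predict(W, key)
    err = [p - v for p, v in zip(pred, value)]
    loss = sum(e * e for e in err) / len(err)  # surprise signal

    gate = 1.0 + math.tanh(loss)  # hypothetical gate, always in [1, 2)
    lr = base_lr * gate           # higher surprise -> faster learning

    # Gradient of the mean squared error w.r.t. W[i][j] is (2/n) * err[i] * key[j].
    scale = 2.0 / len(err)
    new_W = [
        [w - lr * scale * e * k for w, k in zip(row, key)]
        for row, e in zip(W, err)
    ]
    return new_W, loss, gate


if __name__ == "__main__":
    # Tiny stand-ins for the projected key/value (the README mentions 256 dims).
    W = [[0.0] * 4 for _ in range(3)]
    key = [1.0, 0.5, -0.5, 0.25]
    value = [0.2, -0.1, 0.4]
    for step in range(3):
        W, loss, gate = surprise_gated_update(W, key, value)
        print(f"step={step} loss={loss:.6f} gate={gate:.4f}")
```

Repeating the update on the same key/value pair should shrink the loss a little each step, which is the "memory changes while thinking" behaviour the old README described.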