Upload app.py with huggingface_hub
app.py
CHANGED
|
@@ -145,23 +145,116 @@ demo = gr.ChatInterface(
|
|
| 145 |
fn=chat,
|
| 146 |
title="🧠 Titans + MIRAS: A Brain That Changes Itself While Thinking",
|
| 147 |
description="""
|
| 148 |
-
|
| 149 |
|
| 150 |
-
**
|
| 151 |
-
|
| 152 |
-
2. Memory predicts what it should remember
|
| 153 |
-
3. Prediction error (loss) indicates surprise
|
| 154 |
-
4. Higher surprise → stronger memory formation
|
| 155 |
-
5. Memory weights update via gradient descent
|
| 156 |
-
6. Response generated and memory saved to disk
|
| 157 |
|
| 158 |
-
|
| 159 |
""",
|
| 160 |
examples=[
|
| 161 |
-
"
|
| 162 |
"Tell me about test-time learning",
|
| 163 |
"What is 2+2?",
|
| 164 |
-
"
|
| 165 |
],
|
| 166 |
cache_examples=False,
|
| 167 |
theme="soft",
|
| 145 |
fn=chat,
|
| 146 |
title="🧠 Titans + MIRAS: A Brain That Changes Itself While Thinking",
|
| 147 |
description="""
|
| 148 |
+
## The Revolutionary Difference
|
| 149 |
|
| 150 |
+
**Standard LLMs (ChatGPT, Claude, etc.)**: Think → Predict → **Forget**
|
| 151 |
+
**Titans + MIRAS**: Think → Predict → **Update** → **Remember** → Think Differently
|
| 152 |
|
| 153 |
+
---
|
| 154 |
+
|
| 155 |
+
### 💡 What Makes This Different?
|
| 156 |
+
|
| 157 |
+
| Feature | ChatGPT/Claude/GPT-4 | This Demo (Titans+MIRAS) |
|
| 158 |
+
|---------|---------------------|--------------------------|
|
| 159 |
+
| **Weights during chat** | Frozen forever | ✅ Update with every message |
|
| 160 |
+
| **Learning** | ❌ Simulated (in-context only) | ✅ Real (gradient descent) |
|
| 161 |
+
| **Memory** | Token context only | 🧠 Neural parameters |
|
| 162 |
+
| **Persistence** | ❌ Forgets when context ends | ✅ Saves to disk |
|
| 163 |
+
| **Adaptation** | 🎭 Acts like it learned | 🔬 Actually learns |
|
| 164 |
+
|
| 165 |
+
---
|
| 166 |
+
|
| 167 |
+
### 🎯 What You're Witnessing
|
| 168 |
+
|
| 169 |
+
**This is NOT a better chatbot** - it's a **learning demonstrator**.
|
| 170 |
+
|
| 171 |
+
1. **The text responses are random** - that's expected! We're using a small, frozen model (distilgpt2)
|
| 172 |
+
2. **The MAGIC is in the numbers below** - watch the "Loss" decrease when you repeat inputs!
|
| 173 |
+
3. **Every message physically changes the brain** - the memory weights update via gradient descent
|
| 174 |
+
4. **Refresh the page** - the update count continues (memory persists!)
|
| 175 |
+
|
| 176 |
+
---
|
| 177 |
+
|
| 178 |
+
### 🧪 How It Works (The Technical Truth)
|
| 179 |
+
|
| 180 |
+
```
|
| 181 |
+
Your Message
|
| 182 |
+
↓
|
| 183 |
+
[distilgpt2: FROZEN] ← Not learning, just generating
|
| 184 |
+
↓
|
| 185 |
+
Hidden States (768-dim)
|
| 186 |
+
↓
|
| 187 |
+
[Projections] → Memory Space (256-dim)
|
| 188 |
+
↓
|
| 189 |
+
[MIRAS Memory: LEARNING!] ← This is what updates!
|
| 190 |
+
↓
|
| 191 |
+
Loss = How surprised the memory is
|
| 192 |
+
↓
|
| 193 |
+
Gradient Descent → Memory weights change
|
| 194 |
+
↓
|
| 195 |
+
Saved to disk → Persists forever
|
| 196 |
+
```
|
| 197 |
+
|
| 198 |
+
**Key Insight**: We're training the **memory**, not the text generator!
|
| 199 |
+
|
| 200 |
+
---
|
| 201 |
+
|
| 202 |
+
### 🔬 The Science: Why This Matters
|
| 203 |
+
|
| 204 |
+
**Standard LLMs**:
|
| 205 |
+
- Weights frozen after training (costs millions)
|
| 206 |
+
- "Learning" is just pattern matching in context
|
| 207 |
+
- Forget everything when context ends
|
| 208 |
+
- Same model for everyone
|
| 209 |
+
|
| 210 |
+
**Titans + MIRAS**:
|
| 211 |
+
- Weights update during inference (free!)
|
| 212 |
+
- Real optimization via gradient descent
|
| 213 |
+
- Memory persists across sessions
|
| 214 |
+
- Personalizes to each user
|
| 215 |
+
|
| 216 |
+
**This is test-time learning** - the future of adaptive AI.
|
| 217 |
+
|
| 218 |
+
---
|
| 219 |
+
|
| 220 |
+
### What the Stats Mean
|
| 221 |
+
|
| 222 |
+
- **Loss**: How surprised the memory is (lower = more familiar)
|
| 223 |
+
- **Retention**: Learning rate multiplier (2.0x = very surprising, 0.5x = familiar)
|
| 224 |
+
- **Updates**: Total number of memory updates (persists across sessions!)
|
| 225 |
+
- **Avg Loss**: Overall learning progress
|
| 226 |
+
|
| 227 |
+
---
|
| 228 |
+
|
| 229 |
+
### 🎮 Try This Experiment
|
| 230 |
+
|
| 231 |
+
1. **Send "hello world" 5 times** → Watch loss decrease!
|
| 232 |
+
2. **Send something completely different** → Loss spikes!
|
| 233 |
+
3. **Refresh the page and send another message** → Update count continues!
|
| 234 |
+
|
| 235 |
+
**That decreasing loss is proof the neural weights are changing!**
|
| 236 |
+
|
| 237 |
+
---
|
| 238 |
+
|
| 239 |
+
### The Bottom Line
|
| 240 |
+
|
| 241 |
+
**ChatGPT**: A frozen calculator that *simulates* adaptation
|
| 242 |
+
**This Demo**: A living system that *performs* adaptation
|
| 243 |
+
|
| 244 |
+
You're not chatting with a model.
|
| 245 |
+
**You're watching a brain rewire itself in real-time.** 🧠⚡
|
| 246 |
+
|
| 247 |
+
---
|
| 248 |
+
|
| 249 |
+
*Built with Titans (test-time training) + MIRAS (associative memory)*
|
| 250 |
+
*Papers: [Titans](https://arxiv.org/abs/2501.00663) | [MIRAS](https://arxiv.org/abs/2504.13173)*
|
| 251 |
""",
|
| 252 |
examples=[
|
| 253 |
+
"hello world",
|
| 254 |
+
"hello world", # Repeat to show learning!
|
| 255 |
"Tell me about test-time learning",
|
| 256 |
"What is 2+2?",
|
| 257 |
+
"my name is [your name]",
|
| 258 |
],
|
| 259 |
cache_examples=False,
|
| 260 |
theme="soft",
|
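The description added in this commit claims that repeating an input lowers the memory's loss and that a 0.5x to 2.0x multiplier scales each update. A minimal sketch of that behavior follows, under loud assumptions: it uses a toy linear associative memory in plain Python rather than the app's actual MIRAS module, and `DIM`, `matvec`, `memory_step`, `run_repeated`, and the clamped loss-ratio rule for the multiplier are all hypothetical names and choices that do not appear in `app.py`.

```python
import random

DIM = 4  # toy size (assumption); the description mentions a 256-dim memory space


def matvec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def memory_step(memory, key, value, avg_loss, base_lr=0.1):
    """One test-time update of a linear associative memory (updates in place).

    Returns the loss measured before the update and the multiplier used.
    """
    pred = matvec(memory, key)
    error = [p - v for p, v in zip(pred, value)]
    loss = sum(e * e for e in error)
    # Hypothetical surprise scaling: ratio of this loss to the running
    # average, clamped to the 0.5x-2.0x range named in the description.
    ratio = loss / avg_loss if avg_loss > 0 else 1.0
    multiplier = max(0.5, min(2.0, ratio))
    step = base_lr * multiplier
    # Gradient of the squared error w.r.t. memory[i][j] is 2 * error[i] * key[j].
    for i in range(len(memory)):
        for j in range(len(key)):
            memory[i][j] -= step * 2.0 * error[i] * key[j]
    return loss, multiplier


def run_repeated(n_repeats=5, seed=0):
    """Feed the same (key, value) pair repeatedly and collect the losses."""
    rng = random.Random(seed)
    memory = [[0.0] * DIM for _ in range(DIM)]
    key = [rng.uniform(-1, 1) for _ in range(DIM)]
    norm = sum(k * k for k in key) ** 0.5
    key = [k / norm for k in key]  # unit-norm key keeps the step size stable
    value = [rng.uniform(-1, 1) for _ in range(DIM)]
    losses, avg_loss = [], 0.0
    for t in range(n_repeats):
        loss, _ = memory_step(memory, key, value, avg_loss)
        losses.append(loss)
        avg_loss = (avg_loss * t + loss) / (t + 1)  # running mean of losses
    return losses
```

Calling `run_repeated()` returns five losses that shrink at every step: with a unit-norm key, each update multiplies the prediction error by `1 - 2 * step`. This mirrors the "send hello world 5 times" experiment only in spirit, since the real app feeds projected distilgpt2 hidden states into its memory rather than random vectors.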