mindchain committed on
Commit
e1eac63
·
1 Parent(s): a15826a

Add Gemma Scope 2 + Neuronpedia Discovery/Steering/Freezing Skills

Files changed (1)
  1. index.html +35 -4
index.html CHANGED
@@ -158,13 +158,44 @@ Plus in the gateway: GitHub, Sentry, Z-Image, Web-Search, Browser Automation
158
 
159
  <strong>The combination:</strong> Ralph provides the loop, Beads the memory, HF Skills the learning.
160
 
161
- Plus: Gemma Scope 2 + Neuronpedia for mechanistic interpretability - see WHAT the agent learns!
162
-
163
- Links:
164
  <a href="https://github.com/anthropics/claude-code/tree/main/plugins/ralph-wiggum" class="link">Ralph Wiggum GitHub</a>
165
  <a href="https://github.com/steveyegge/beads" class="link">Beads GitHub</a>
166
  <a href="https://github.com/huggingface/skills" class="link">HF Skills GitHub</a>
167
- <a href="https://huggingface.co/blog/hf-skills-training" class="link">HF Skills Blog</a></div>
168
  </div>
169
 
170
  <div class="post">
 
158
 
159
  <strong>The combination:</strong> Ralph provides the loop, Beads the memory, HF Skills the learning.
160
 
161
+ <strong>5. Gemma Scope 2 + Neuronpedia (Interpretability + Steering)</strong>
162
+ Agent training becomes transparent and steerable.
163
+
164
+ <span style="color: #667eea;">Discovery Skills</span> - WHAT does the agent learn?
165
+ • Find the SAE features that determine behavior
166
+ • Identify circuits (causal chains in the network)
167
+ • Neuronpedia: 4 TB+ of activations, explanations, and metadata
168
+ • <a href="https://www.neuronpedia.org/gemma-scope-2" class="link">neuronpedia.org/gemma-scope-2</a>
169
+
170
+ <span style="color: #667eea;">Steering Skills</span> - influence behavior
171
+ • Increase/decrease feature strength (↑/↓)
172
+ • API: POST /api/steer with strength_multiplier
173
+ • "Golden Gate Claude", but for any feature
174
+ • <a href="https://docs.neuronpedia.org/steering" class="link">Neuronpedia Steering Docs</a>
175
+
176
+ <span style="color: #667eea;">Freezing Skills</span> - lock in what was learned
177
+ • Identify and save important circuits
178
+ • Export and reuse feature vectors
179
+ • Keep agent behavior consistent
180
+ • <a href="https://github.com/hijohnnylin/neuronpedia-python" class="link">neuronpedia-python GitHub</a>
181
+
182
+ <strong>The extended loop:</strong>
183
+ 1. Ralph starts → agent executes the task
184
+ 2. Beads tracks → graph stores progress
185
+ 3. Gemma Scope 2 → activations are analyzed
186
+ 4. Neuronpedia → Discovery: find important features
187
+ 5. Steering → actively correct agent behavior
188
+ 6. HF Skills → train what was learned into the model
189
+ 7. Freezing → lock in successful patterns
190
+ 8. Loop repeats → improved agent
191
+
192
+ <strong>Links:</strong>
193
  <a href="https://github.com/anthropics/claude-code/tree/main/plugins/ralph-wiggum" class="link">Ralph Wiggum GitHub</a>
194
  <a href="https://github.com/steveyegge/beads" class="link">Beads GitHub</a>
195
  <a href="https://github.com/huggingface/skills" class="link">HF Skills GitHub</a>
196
+ <a href="https://huggingface.co/blog/hf-skills-training" class="link">HF Skills Blog</a>
197
+ <a href="https://www.neuronpedia.org/api-doc" class="link">Neuronpedia API</a>
198
+ <a href="https://deepmind.google/blog/gemma-scope-2" class="link">Gemma Scope 2 DeepMind</a></div>
199
  </div>
200
 
201
  <div class="post">
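
Note on the Steering and Freezing bullets added in this commit: the idea behind scaling a feature by a multiplier, then exporting the vector for reuse, can be illustrated with a toy sketch. This is not the Neuronpedia API and does not call `/api/steer`. The function names `steer`, `freeze`, and `thaw`, the JSON layout, and the three-element vectors are all hypothetical and chosen only for illustration. The sketch assumes steering means adding a feature direction, scaled by `strength_multiplier`, to an activation vector.

```python
import json


def steer(activations, feature_vector, strength_multiplier):
    """Shift activations along feature_vector, scaled by strength_multiplier.

    Hypothetical helper: a positive multiplier amplifies the feature,
    a negative one suppresses it.
    """
    if len(activations) != len(feature_vector):
        raise ValueError("activations and feature_vector must have the same length")
    return [a + strength_multiplier * f for a, f in zip(activations, feature_vector)]


def freeze(feature_vector):
    """Serialize a feature vector to JSON so it can be stored and reused."""
    return json.dumps({"feature_vector": feature_vector})


def thaw(serialized):
    """Load a feature vector previously produced by freeze()."""
    return json.loads(serialized)["feature_vector"]


# Made-up toy values; real activations and SAE feature directions are much larger.
activations = [0.5, -1.0, 2.0]
feature = [1.0, 0.0, -1.0]

boosted = steer(activations, feature, 2.0)      # push toward the feature
suppressed = steer(activations, feature, -1.0)  # push away from the feature
restored = thaw(freeze(feature))                # round-trip through JSON
```

Keeping `freeze` and `thaw` separate from `steer` mirrors the split in the commit between Steering Skills and Freezing Skills: a vector exported once can be reapplied with different multipliers later.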