SEO: add multilingual tags, model-index metrics, thumbnail, expand search coverage
README.md
CHANGED
@@ -1,25 +1,56 @@
 ---
 language:
 - en
+- zh
+- ko
+- es
 tags:
 - emotion-vectors
 - interpretability
 - mechanistic-interpretability
 - replication
 - gemma4
+- google
 - anthropic
 - valence-arousal
 - PCA
 - logit-lens
+- linear-probe
+- probing
 - emotion
 - functional-emotions
 - AI-safety
+- neuroscience
+- circumplex-model
+- activation-extraction
+- residual-stream
 license: mit
 library_name: transformers
 pipeline_tag: text-generation
 base_model: google/gemma-4-E4B-it
-
--
+thumbnail: results/fig1_pca_scatter.png
+model-index:
+- name: emotion-vector-replication
+  results:
+  - task:
+      type: text-generation
+      name: Emotion Vector Extraction
+    metrics:
+    - name: PC1 Variance (Valence)
+      type: variance_explained
+      value: 0.422
+    - name: PC2 Variance (Arousal)
+      type: variance_explained
+      value: 0.183
+    - name: Total Variance (PC1+PC2)
+      type: variance_explained
+      value: 0.605
+    - name: Emotions Tested
+      type: count
+      value: 9
+    - name: Stories Generated
+      type: count
+      value: 1002
 ---
 
 # Replicating Anthropic's Emotion Vectors on an Open-Source 4B Model
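
The three `variance_explained` values added in this change should be mutually consistent: the reported total is meant to be the sum of the PC1 and PC2 shares. A minimal sketch of that sanity check follows, using only the numbers from the `model-index` block. The dictionary layout and the helper names `check_total` and `unexplained` are illustrative assumptions, not part of the repository.

```python
import math

# Values copied from the model-index metrics in the diff above.
# The dictionary layout is an assumption made for this sketch.
metrics = {
    "PC1 Variance (Valence)": 0.422,
    "PC2 Variance (Arousal)": 0.183,
    "Total Variance (PC1+PC2)": 0.605,
}


def check_total(m, tol=1e-9):
    """Return True if PC1 + PC2 matches the reported total within tol."""
    pc1 = m["PC1 Variance (Valence)"]
    pc2 = m["PC2 Variance (Arousal)"]
    total = m["Total Variance (PC1+PC2)"]
    return math.isclose(pc1 + pc2, total, abs_tol=tol)


def unexplained(m):
    """Share of variance not captured by the first two components."""
    return 1.0 - m["Total Variance (PC1+PC2)"]


if __name__ == "__main__":
    print("total consistent:", check_total(metrics))
    print("unexplained share:", round(unexplained(metrics), 3))
```

The comparison uses a tolerance instead of `==` because the sum of two binary floats may differ from the decimal literal in the last bit.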