Update README.md
README.md
CHANGED
@@ -13,7 +13,7 @@ This is a **simulator model** used to score candidate natural-language explanati
 - an input text sequence `x` (tokenized),
 - a candidate explanation `E` (e.g., “encodes city names”),

-the simulator predicts **where the described feature should activate** in the sequence (token-level activation scores). These simulated activations can then be compared to a target feature’s *true* activations, enabling
+the simulator predicts **where the described feature should activate** in the sequence (token-level activation scores). These simulated activations can then be compared to a target feature’s *true* activations, enabling scoring of the explanations by computing correlation (the "simulator score" / correlation objective described in [the paper](https://arxiv.org/abs/2511.08579)).

 ---
 ## Usage
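As a rough illustration of the correlation-based scoring mentioned in the changed line, the sketch below compares simulated token-level activations with true activations using Pearson correlation. The function name `simulator_score`, the example tokens, and all activation values are hypothetical, and the choice of Pearson correlation is an assumption; the paper may define the score differently.

```python
import math


def simulator_score(simulated, true):
    """Pearson correlation between simulated and true per-token activations.

    Assumption: the "simulator score" is treated here as plain Pearson
    correlation; the exact definition in the paper may differ.
    """
    if len(simulated) != len(true):
        raise ValueError("activation sequences must have the same length")
    n = len(simulated)
    if n < 2:
        raise ValueError("need at least two tokens to compute a correlation")

    mean_s = sum(simulated) / n
    mean_t = sum(true) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(simulated, true))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_t = sum((t - mean_t) ** 2 for t in true)

    # Assumption: a constant sequence carries no signal, so score it as 0.
    if var_s == 0 or var_t == 0:
        return 0.0
    return cov / math.sqrt(var_s * var_t)


# Hypothetical activations for the explanation "encodes city names".
tokens = ["I", "flew", "to", "Paris", "yesterday"]
simulated = [0.0, 0.1, 0.0, 0.9, 0.1]  # what the simulator predicts
true = [0.0, 0.0, 0.1, 1.0, 0.0]       # what the target feature actually did

score = simulator_score(simulated, true)
print(f"simulator score: {score:.3f}")
```

A score near 1 would suggest the explanation predicts where the feature fires; a score near 0 would suggest it does not.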