Improved next steps
app/src/content/article.mdx
CHANGED
@@ -504,7 +504,7 @@ Overall, our results seem less discouraging than those of AxBench, and show that
 
 <Note variant="info" title="Possible next steps">
 - **Failure analysis** on the cases where steering fails (about 20% have at least one zero metric). Is there a pattern?
-- **Why steering multiple features achieves only marginal improvement
+- **Why steering multiple features achieves only marginal improvement ?** Check complementary vs redundancy of multiple features by monitoring activation changes in subsequent layers' features.
 - Check other layers for 1D optimization, see if some layers are better than others. Or results that are qualitatively different.
 - Try to include earlier (3) and later (27) layers, see if it helps
 - Try other concepts, see if results are similar