Hai Pham committed on
Commit ·
2660d5a
Parent(s): 55b7f90
updated README
README.md
CHANGED
@@ -227,3 +227,25 @@ uv run python -m part_6.clc_experiment \
 ```
 
 > **Note:** The LID experiment is memory-heavy. On a Tesla V100, `--train-n 100` can cause OOM; start with the default `--train-n 20` and scale up carefully.
+
+## Related: Extended Analysis - Ablation, Clustering & Synergy
+
+**[VSmague/NLP](https://github.com/VSmague/NLP)**: Extended experiments by Valentin Smague covering ablation studies, feature clustering, and cross-language synergy analysis built on top of the v-scores from this repo.
+
+That repo covers four additional directions:
+
+| Analysis | Script / Notebook | What it does |
+|---|---|---|
+| Feature ablation | `ablation.py`, `SNLP_ablation_clean.ipynb` | Ablates top language-specific SAE features and measures the effect on model behavior; produces per-language specificity plots |
+| Language clustering | `compute_clusters.py`, `compute_matrix.py` | Clusters languages by their v-score feature overlap using MDS and similarity matrices |
+| Cross-language synergy | `cross_language_synergy.py` | Measures how much top features for one language also activate on other languages (feature sharing / synergy) |
+| Visualization | `visualisation.py`, `reprod.py` | Reproduces v-score bar charts (Figure 1 style) and generates additional plots |
+
+Key outputs stored in the repo:
+
+- `v_scores.png`: reproduced v-score figure
+- `ablation_fr.png`, `ablation_specificity.png`: ablation results for French
+- `clustering_best.png`, `clustering_comparison.png`, `clustering_mds.png`: language clustering visualizations
+- `plots/`, `plots_interaction/`, `plots_synergy/`: full plot collections
+- `sae_features/`: saved SAE feature data
+- `figures_section5/`: figures for section 5 of the report