## Language Specific Neuron SLA
This project targets the Qwen2.5 family of models specifically.
## Guide
1. Run `load_data.py` to fetch data from https://huggingface.co/datasets/wikimedia/wikipedia/viewer/20231101
2. Compute neuron activations on the fetched data with `activation.py`
3. Identify language-specific neurons with `identify.py`
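
Step 3 could look roughly like the sketch below. It assumes `identify.py` follows the LAPE (language activation probability entropy) idea from the RUCAIBox Language-Specific-Neurons repository listed under Ref; the script itself is not shown here. The function names, the input layout (one row of per-neuron activation probabilities per language), and both thresholds are hypothetical and chosen only for illustration.

```python
import math


def lape_scores(act_prob):
    """Entropy of each neuron's activation probability across languages.

    act_prob[l][n] is assumed to be the fraction of tokens in language l
    on which neuron n was active. Low entropy means the neuron fires
    mostly for a few languages.
    """
    num_langs = len(act_prob)
    num_neurons = len(act_prob[0])
    scores = []
    for n in range(num_neurons):
        column = [act_prob[l][n] for l in range(num_langs)]
        total = sum(column)
        if total == 0:
            # A neuron that never fires is treated as non-specific (max entropy).
            scores.append(math.log(num_langs))
            continue
        entropy = 0.0
        for p in column:
            q = p / total  # normalize into a distribution over languages
            if q > 0:
                entropy -= q * math.log(q)
        scores.append(entropy)
    return scores


def select_language_specific(act_prob, keep_fraction=0.01, prob_threshold=0.5):
    """Keep the lowest-entropy neurons, then assign each to the languages
    where its activation probability reaches prob_threshold.

    Both thresholds are illustrative assumptions, not values from this repo.
    """
    scores = lape_scores(act_prob)
    keep = max(1, int(len(scores) * keep_fraction))
    candidates = sorted(range(len(scores)), key=scores.__getitem__)[:keep]
    return {
        lang: sorted(n for n in candidates if row[n] >= prob_threshold)
        for lang, row in enumerate(act_prob)
    }


# Toy input: 2 languages, 4 neurons. Neuron 0 fires almost only for language 0.
toy = [
    [0.90, 0.5, 0.5, 0.5],
    [0.01, 0.5, 0.5, 0.5],
]
print(select_language_specific(toy, keep_fraction=0.25))  # {0: [0], 1: []}
```

In practice the activation probabilities would come from the output of `activation.py`, one row per language in the `--languages` list.
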
## Ref
- https://github.com/ReML-AI/DCL-CoT
- https://github.com/RUCAIBox/Language-Specific-Neurons
## Note taking
`python3 load_data_oscar.py --languages en,zh,eu,ga --model-id qwen2.5 --tokenizer Qwen/Qwen2.5-0.5B`