Language Specific Neuron SLA

This analysis is written specifically for the Qwen2.5 family of models.

Guide

  1. Run load_data.py to fetch data from https://huggingface.co/datasets/wikimedia/wikipedia/viewer/20231101
  2. Compute neuron activations on the fetched data with activation.py
  3. Identify language-specific neurons with identify.py
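
This README does not say which criterion identify.py applies, so the following is only a minimal sketch of one common approach, not the repository's actual code. It assumes you already have, for each language, the fraction of tokens on which each neuron fires, and it selects neurons whose firing is concentrated in a single language by ranking them on the entropy of that distribution. All function names, parameters, and thresholds here (`activation_probability`, `language_entropy`, `select_language_specific`, `top_fraction`, `min_prob`) are hypothetical.

```python
import numpy as np


def activation_probability(activations):
    """Fraction of tokens on which each neuron is positive.

    activations: array of shape (tokens, neurons) for ONE language.
    (Assumption: a neuron counts as "active" when its value is > 0.)
    """
    return (np.asarray(activations) > 0).mean(axis=0)


def language_entropy(probs, eps=1e-12):
    """Entropy of each neuron's activation distribution across languages.

    probs: array of shape (languages, neurons) of activation probabilities.
    Low entropy means the neuron fires mostly for one language.
    """
    probs = np.asarray(probs, dtype=float)
    dist = probs / (probs.sum(axis=0, keepdims=True) + eps)
    return -(dist * np.log(dist + eps)).sum(axis=0)


def select_language_specific(probs, languages, top_fraction=0.01, min_prob=0.1):
    """Assign the lowest-entropy neurons to their most active language.

    top_fraction and min_prob are illustrative defaults, not tuned values.
    """
    probs = np.asarray(probs, dtype=float)
    entropy = language_entropy(probs)
    # Ignore neurons that rarely fire in every language.
    entropy = np.where(probs.max(axis=0) >= min_prob, entropy, np.inf)
    k = max(1, int(round(top_fraction * probs.shape[1])))
    candidates = np.argsort(entropy)[:k]
    candidates = candidates[np.isfinite(entropy[candidates])]
    result = {lang: [] for lang in languages}
    for n in candidates:
        best = int(np.argmax(probs[:, n]))
        result[languages[best]].append(int(n))
    return result
```

In practice, `probs` would be built by stacking the output of `activation_probability` for each language, one row per language, separately for every layer of the model.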

Note taking

python3 load_data_oscar.py --languages en,zh,eu,ga --model-id qwen2.5 --tokenizer Qwen/Qwen2.5-0.5B