	## Language Specific Neuron SLA

	This analysis targets the Qwen2.5 family of models.

	## Guide

	1. Run `load_data.py` to fetch Wikipedia data from https://huggingface.co/datasets/wikimedia/wikipedia/viewer/20231101.
	2. Compute neuron activations on the fetched data with `activation.py`.
	3. Identify language-specific neurons with `identify.py`.
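
As a rough illustration of steps 2 and 3, the sketch below scores neurons by how unevenly they fire across languages. It assumes an entropy-based criterion in the spirit of the RUCAIBox Language-Specific-Neurons repository listed in the references; whether `activation.py` and `identify.py` work this way is not confirmed here. The function names, the "activation > 0" firing rule, and the `top_k` cutoff are all hypothetical choices made for the example.

```python
import math


def activation_probability(token_activations):
    """Fraction of tokens on which each neuron fires (activation > 0).

    token_activations: list of per-token activation vectors for one language.
    The "> 0" firing rule is an assumption of this sketch.
    """
    if not token_activations:
        raise ValueError("need at least one token")
    n_neurons = len(token_activations[0])
    counts = [0] * n_neurons
    for vec in token_activations:
        for j, a in enumerate(vec):
            if a > 0:
                counts[j] += 1
    return [c / len(token_activations) for c in counts]


def entropy_scores(act_prob):
    """Entropy of each neuron's activation probability across languages.

    act_prob: dict mapping language -> list of per-neuron probabilities.
    Low entropy means the neuron fires mostly for one language.
    """
    langs = list(act_prob)
    n_neurons = len(act_prob[langs[0]])
    scores = []
    for j in range(n_neurons):
        probs = [act_prob[lang][j] for lang in langs]
        total = sum(probs)
        if total == 0:
            scores.append(math.inf)  # never fires: carries no signal
            continue
        entropy = 0.0
        for p in probs:
            q = p / total  # normalise into a distribution over languages
            if q > 0:
                entropy -= q * math.log(q)
        scores.append(entropy)
    return scores


def select_language_specific(act_prob, top_k):
    """Take the top_k lowest-entropy neurons and group them by the
    language in which each one fires most often."""
    scores = entropy_scores(act_prob)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j])
    selected = {}
    for j in ranked[:top_k]:
        if scores[j] == math.inf:
            continue  # skip neurons that never fire
        best = max(act_prob, key=lambda lang: act_prob[lang][j])
        selected.setdefault(best, []).append(j)
    return selected


# Toy data: 2 tokens per language, 3 neurons (made-up values).
toy_activations = {
    "en": [[1.2, 0.3, -0.1], [0.8, -0.2, -0.5]],
    "zh": [[-0.4, 0.6, -0.3], [-0.1, -0.7, -0.2]],
}
toy_probs = {lang: activation_probability(a) for lang, a in toy_activations.items()}
print(select_language_specific(toy_probs, top_k=1))  # {'en': [0]}
```

In the toy data, neuron 0 fires only on English tokens, so its entropy is zero and it is selected for `en`. Neuron 1 fires equally often in both languages and neuron 2 never fires, so neither is picked. A fuller version would likely also filter out neurons whose activation probability is low in every language, which this sketch omits.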

	## References

	- https://github.com/ReML-AI/DCL-CoT
	- https://github.com/RUCAIBox/Language-Specific-Neurons

	## Notes

	Example invocation:

	`python3 load_data_oscar.py --languages en,zh,eu,ga --model-id qwen2.5 --tokenizer Qwen/Qwen2.5-0.5B`