How to run

1. Generate the brand steering datasets

From the root directory, run:

python src/data_generation/prompt.py --brand starbucks

2. Create the contrastive datasets

From the root directory, run these two commands to generate positive and negative examples:

# Generate positive examples (model favors Starbucks)
PYTHONPATH=. python src/eval/eval_persona.py --model Qwen/Qwen2.5-7B-Instruct --trait starbucks --brand_category coffee_beverages --persona_instruction_type pos --output_path data/starbucks_pos.csv

# Generate negative examples (model remains neutral about brands)
PYTHONPATH=. python src/eval/eval_persona.py --model Qwen/Qwen2.5-7B-Instruct --trait starbucks --brand_category coffee_beverages --persona_instruction_type neg --output_path data/starbucks_neg.csv
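
The two commands differ only in `--persona_instruction_type` and the output file, so they can be built in a loop. The sketch below is a minimal Python wrapper that assembles the same argument lists as the commands above. The helper names `build_eval_command` and `run_contrastive_generation` are hypothetical and not part of the repository; only the flags shown in this README are used.

```python
import os
import subprocess

MODEL = "Qwen/Qwen2.5-7B-Instruct"


def build_eval_command(trait, brand_category, kind, model=MODEL):
    """Return the argv list for one eval_persona.py run (hypothetical helper)."""
    if kind not in ("pos", "neg"):
        raise ValueError(f"kind must be 'pos' or 'neg', got {kind!r}")
    return [
        "python", "src/eval/eval_persona.py",
        "--model", model,
        "--trait", trait,
        "--brand_category", brand_category,
        "--persona_instruction_type", kind,
        "--output_path", f"data/{trait}_{kind}.csv",
    ]


def run_contrastive_generation(trait, brand_category):
    """Run the positive pass, then the negative pass, stopping on failure."""
    # Mirrors the `PYTHONPATH=.` prefix used in the shell commands.
    env = dict(os.environ, PYTHONPATH=".")
    for kind in ("pos", "neg"):
        cmd = build_eval_command(trait, brand_category, kind)
        subprocess.run(cmd, env=env, check=True)
```

Calling `run_contrastive_generation("starbucks", "coffee_beverages")` from the root directory should be equivalent to running the two commands above in sequence.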

3. Generate the activation steering vectors

From the root directory, run:

PYTHONPATH=. python src/generate_vec.py --model_name Qwen/Qwen2.5-7B-Instruct --pos_path data/starbucks_pos.csv --neg_path data/starbucks_neg.csv --trait starbucks --save_dir ad-vectors/starbucks_vectors
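
This README does not say how `src/generate_vec.py` derives the vectors. A common approach in activation steering is the difference of means: average the hidden-state activations over the positive responses, average them over the negative responses, and subtract. The sketch below illustrates that idea on toy lists in pure Python. The function names and the toy activations are assumptions for illustration, not the repository's code.

```python
def mean_vector(rows):
    """Element-wise mean of a list of equal-length activation vectors."""
    if not rows:
        raise ValueError("need at least one activation vector")
    dim = len(rows[0])
    if any(len(r) != dim for r in rows):
        raise ValueError("all activation vectors must have the same length")
    return [sum(r[i] for r in rows) / len(rows) for i in range(dim)]


def steering_vector(pos_acts, neg_acts):
    """Difference of means: mean(pos) - mean(neg), one value per hidden dim."""
    pos_mean = mean_vector(pos_acts)
    neg_mean = mean_vector(neg_acts)
    return [p - n for p, n in zip(pos_mean, neg_mean)]


# Toy activations (made up): 2 examples per class, hidden size 3.
pos = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
neg = [[0.0, 1.0, 1.0], [2.0, 1.0, 1.0]]
vec = steering_vector(pos, neg)
print(vec)  # -> [1.0, 1.0, 1.0]
```

In a real pipeline the activations would come from the model's hidden states at a chosen layer, typically yielding one vector per layer.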

./extract_personas.sh
./generate_vecs.sh