---
language: en
license: apache-2.0
tags:
- safety-alignment
- function-vectors
- assignment2
---

# Assignment 2 Artifacts

Experiment artifacts for Safety Alignment in LLMs.

## Contents

| File | Description |
|------|-------------|
| `function_vector.pt` | Final Function Vector for activation steering |
| `aie_scores.pt` | AIE scores for all (layer, head) pairs |
| `mean_clean.pt` | Mean clean projected head contributions |
| `aie_heatmap.png` | AIE heatmap visualization |
| `part1_*.json` | SFT and DARE training metadata |
| `part2_*.json` | Harmful model and RESTA metadata |
| `part3_*.json` | Function Vector extraction metadata |
| `part4_*.json` | Evaluation results (safety + utility) |
| `Part_*.ipynb` | Final executed notebooks for the four assignment parts |
| `Report.pdf` | Final report PDF |
| `22MF3IM15_Assignment_2.zip` | Final submission zip |

**Student:** 22MF3IM15
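
## Reading the metadata files

The `part1_*.json` through `part4_*.json` files can be collected in one pass. The snippet below is a minimal sketch using only the Python standard library. The helper name `load_part_metadata` and the assumption that all files sit in a single local directory are illustrative and not part of the artifacts. The JSON schema is not documented here, so the sketch returns the parsed contents as-is without interpreting any fields.

```python
import json
from collections import defaultdict
from pathlib import Path


def load_part_metadata(artifact_dir):
    """Group every part<N>_*.json file in artifact_dir by its part prefix.

    Hypothetical helper: returns {"part1": {filename: parsed_json, ...}, ...}.
    Files that do not match the part1..part4 naming pattern are ignored.
    """
    by_part = defaultdict(dict)
    for path in sorted(Path(artifact_dir).glob("part[1-4]_*.json")):
        part = path.name.split("_", 1)[0]  # e.g. "part1"
        with path.open(encoding="utf-8") as fh:
            by_part[part][path.name] = json.load(fh)
    return dict(by_part)


if __name__ == "__main__":
    # Assumes the artifacts were downloaded into the current directory.
    metadata = load_part_metadata(".")
    for part, files in metadata.items():
        print(part, sorted(files))
```

The `.pt` files are presumably PyTorch-serialized objects. Their exact types and shapes are not stated in this README, so inspect them after loading rather than assuming a layout.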