---
language: en
license: apache-2.0
tags:
- safety-alignment
- function-vectors
- assignment2
---
# Assignment 2 Artifacts
Experiment artifacts for Safety Alignment in LLMs.
## Contents
| File | Description |
|------|-------------|
| `function_vector.pt` | Final Function Vector for activation steering |
| `aie_scores.pt` | AIE scores for all (layer, head) pairs |
| `mean_clean.pt` | Mean clean projected head contributions |
| `aie_heatmap.png` | AIE heatmap visualization |
| `part1_*.json` | SFT and DARE training metadata |
| `part2_*.json` | Harmful model and RESTA metadata |
| `part3_*.json` | Function Vector extraction metadata |
| `part4_*.json` | Evaluation results (safety + utility) |
| `Part_*.ipynb` | Final executed notebooks for the four assignment parts |
| `Report.pdf` | Final report PDF |
| `22MF3IM15_Assignment_2.zip` | Final submission zip |

**Student:** 22MF3IM15
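
A quick way to confirm that a local copy contains everything in the Contents table is to check each file name or pattern against the directory. The snippet below is a minimal sketch using only the Python standard library; the helper name `missing_artifacts` and the assumption that all files sit in one flat directory are illustrative and not part of the assignment code.

```python
from pathlib import Path

# File names and glob patterns copied from the Contents table.
EXPECTED = [
    "function_vector.pt",
    "aie_scores.pt",
    "mean_clean.pt",
    "aie_heatmap.png",
    "part1_*.json",
    "part2_*.json",
    "part3_*.json",
    "part4_*.json",
    "Part_*.ipynb",
    "Report.pdf",
    "22MF3IM15_Assignment_2.zip",
]


def missing_artifacts(root):
    """Return the expected names/patterns that match no file directly under root.

    Hypothetical helper: assumes a flat layout with every artifact in `root`.
    """
    root = Path(root)
    return [pattern for pattern in EXPECTED if not any(root.glob(pattern))]


if __name__ == "__main__":
    # "." is an assumption: run this from the directory holding the artifacts.
    missing = missing_artifacts(".")
    print("All artifacts present" if not missing else f"Missing: {missing}")
```

If the artifacts live elsewhere, pass that directory instead, for example `missing_artifacts("path/to/download")`.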