---
language: en
license: apache-2.0
tags:
- safety-alignment
- function-vectors
- assignment2
---
# Assignment 2 Artifacts
Experiment artifacts for Safety Alignment in LLMs.
## Contents
| File | Description |
|------|-------------|
| `function_vector.pt` | Final Function Vector for activation steering |
| `aie_scores.pt` | Average indirect effect (AIE) scores for all (layer, head) pairs |
| `mean_clean.pt` | Mean clean projected head contributions |
| `aie_heatmap.png` | AIE heatmap visualization |
| `part1_*.json` | SFT and DARE training metadata |
| `part2_*.json` | Harmful model and RESTA metadata |
| `part3_*.json` | Function Vector extraction metadata |
| `part4_*.json` | Evaluation results (safety + utility) |
| `Part_*.ipynb` | Final executed notebooks for the four assignment parts |
| `Report.pdf` | Final report PDF |
| `22MF3IM15_Assignment_2.zip` | Final submission zip |

**Student:** 22MF3IM15
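
## Reading the metadata files

The `part1_*.json` through `part4_*.json` files listed above can be collected in one pass. The snippet below is a minimal sketch, assuming the files have been downloaded into a single local directory and contain plain JSON. The helper name `load_metadata` is hypothetical and not part of the assignment code, and the keys inside each file are not assumed.

```python
import json
from pathlib import Path


def load_metadata(artifact_dir="."):
    """Load every part<N>_*.json file in artifact_dir, grouped by part.

    Hypothetical helper: it only relies on the file-name pattern shown
    in the Contents table, not on any particular JSON schema.
    """
    grouped = {}
    for path in sorted(Path(artifact_dir).glob("part[1-4]_*.json")):
        part = path.name.split("_", 1)[0]  # e.g. "part3"
        with path.open(encoding="utf-8") as fh:
            grouped.setdefault(part, {})[path.name] = json.load(fh)
    return grouped


if __name__ == "__main__":
    # Print which metadata files were found for each assignment part.
    for part, files in load_metadata().items():
        print(part, sorted(files))
```

Grouping by the `partN` prefix keeps the SFT/DARE, harmful-model/RESTA, Function Vector, and evaluation metadata separate without hard-coding individual file names.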