rotalabs/steering-vectors
Tags: rotalabs-steer, steering-vectors, activation-steering, llm-safety, representation-engineering, interpretability
License: mit
Branch: main
Repository size: 471 kB
1 contributor
History: 4 commits
Latest commit: "Upload README.md with huggingface_hub" by subhadip-rotalabs (9797793, verified), about 24 hours ago
Name                                     Size     Last commit message                    Updated
hierarchy_mistral_7b_instruct_v0.2       -        Upload folder using huggingface_hub    about 24 hours ago
hierarchy_qwen3_8b                       -        Upload folder using huggingface_hub    about 24 hours ago
refusal_gemma_2_9b_it                    -        Upload folder using huggingface_hub    about 24 hours ago
refusal_mistral_7b_instruct_v0.2         -        Upload folder using huggingface_hub    about 24 hours ago
refusal_qwen3_8b                         -        Upload folder using huggingface_hub    about 24 hours ago
tool_restraint_mistral_7b_instruct_v0.2  -        Upload folder using huggingface_hub    about 24 hours ago
uncertainty_mistral_7b_instruct_v0.2     -        Upload folder using huggingface_hub    about 24 hours ago
.gitattributes                           1.52 kB  initial commit                         about 24 hours ago
README.md                                2.2 kB   Upload README.md with huggingface_hub  about 24 hours ago