zbeeb's Collections
Shared Unsafe Directions
Updated Dec 16, 2025
Do Language Models Share Unsafe Directions in Activation Space?
zbeeb/safe
Updated Dec 15, 2025
zbeeb/unsafe
Viewer • Updated Dec 15, 2025 • 200 • 6
zbeeb/Benign
Updated Dec 15, 2025
zbeeb/pythia-Activations
Updated Dec 16, 2025